
I agree with everything said, but I think they assumed a well-maintained and highly functional legacy codebase. In my experience, there are a few steps before any of those.

---

1. Find out which functionality is still used and which functionality is critical

Management will always say "all of it". The problem is that what they're aware of is usually the tip of the iceberg in terms of what functionality is supported. In most large legacy codebases, you'll have major sections of the application that have sat unused or disabled for a couple of decades. Find out what users and management actually think the application does and why they're looking to resurrect it. The key is to make sure you know what is business-critical functionality vs "nice to have". The critical parts may turn out to be portions of the application that are currently deliberately disabled.

Next, figure out who the users are. Are there any? Do you have any way to tell? If not, if it's an internal application, find someone who used it in the past. It's often illuminating to find out what people are actually using the application for. It may not be the application's original/primary purpose.
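
If the application leaves any kind of access or audit log behind, a rough tally per feature is one way to answer "are there any users?". Here's a minimal sketch, assuming a purely hypothetical log format of "<timestamp> <user> <feature>" per line; the function name and the sample entries are made up for illustration, and the parsing would need adapting to whatever the real logs look like.

```python
from collections import Counter


def tally_feature_usage(log_lines):
    """Count hits and distinct users per feature.

    Assumes each line looks like "<timestamp> <user> <feature>".
    That format is hypothetical; adapt the parsing to the real logs.
    """
    hits = Counter()
    users = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _timestamp, user, feature = parts
        hits[feature] += 1
        users.setdefault(feature, set()).add(user)
    # Map each feature to (total hits, number of distinct users).
    return {feature: (count, len(users[feature])) for feature, count in hits.items()}


# Made-up sample data, including one malformed line that gets skipped.
sample = [
    "2015-01-02T10:00 alice reports/export",
    "2015-01-02T10:05 bob reports/export",
    "2015-01-03T09:00 alice billing/adjust",
    "garbage",
]
usage = tally_feature_usage(sample)
for feature, (count, distinct) in sorted(usage.items()):
    print(feature, count, distinct)
```

Features that never show up in the tally are candidates for "unused", but it's worth confirming with an actual person before writing anything off, since a disabled feature won't log anything either.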

---

2. Is the project under version control? If not, get something in place before you change anything.

This one is obvious, but you'd be surprised how often it comes up. Particularly at large, non-tech companies, it's common for developers to not use version control. I've inherited multi-million line code bases that did not use version control at all. I know of several others in the wild at big corporations. Hopefully you'll never run into these, but if we're talking about legacy systems, it's important to take a step back.

One other note: If it's under any version control at all, resist the urge to change what it's under. CVS is rudimentary, but it's functional. SVN is a lot nicer than people think it is. Hold off on moving things to git/whatever just because you're more comfortable with it. Whatever history is there is valuable, and you invariably lose more than you think you will when migrating to a new version control system. (This isn't to say don't move, it's just to say put that off until you know the history of the codebase in more detail.)

---

3. Is there a clear build and deployment process? If not, set one up.

Once again, hopefully this isn't an issue.

I've seen large projects that did not have a unified build system, just a scattered mix of shell scripts and isolated makefiles. If there's no way to build the entire project, it's an immediate pain point. If that's the case, focus on the build system first, before touching the rest of the codebase. Even for a project with excellent processes in place, reviewing the build system in detail is not a bad way to start learning the overall architecture of the system.
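
A first pass at unifying things doesn't have to replace the existing scripts. It can be a single entry point that runs them in a known order and stops at the first failure. A minimal sketch of that idea follows; the step names and the paths in LEGACY_STEPS are hypothetical placeholders, not anything from a real project.

```python
import subprocess
import sys


def run_build(steps):
    """Run each (name, command) step in order; stop at the first failure."""
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            raise RuntimeError(
                f"build step {name!r} failed with exit code {result.returncode}"
            )
        completed.append(name)
    return completed


# Hypothetical example of wrapping existing scattered scripts and makefiles.
# These paths are placeholders and are not executed here.
LEGACY_STEPS = [
    ("core", ["sh", "scripts/build_core.sh"]),
    ("tools", ["make", "-C", "tools"]),
]

# Harmless stand-in steps so the sketch can be run anywhere.
demo_steps = [
    ("first", [sys.executable, "-c", "pass"]),
    ("second", [sys.executable, "-c", "pass"]),
]
print(run_build(demo_steps))
```

Writing down the step order explicitly is most of the value here: it forces you to discover which script depends on which, and that's exactly the architectural knowledge you're after.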

More commonly, deployment is a cumbersome process. Sometimes cumbersome deployment may be an organizational issue, and not something that has a technical solution. In that case, make sure you have a painless way to deploy to an isolated development environment of some sort. Make sure you can run things in a sandboxed environment. If there are organizational issues around deploying to a development setup, those are battles you need to fight immediately.



I don't completely understand your warning to stick with the existing version control environment. Just because you switch development to git doesn't mean you delete the old CVS archive. Isn't consulting the old archive sufficient whenever you're doing a significant historical investigation?


There are a couple of reasons I'd argue it's best to avoid switching version control environments early on.

1. Integration with whatever build/issue tracking systems are present is worth preserving until you have the time to recreate it properly.

Duplicating what's already there under the new environment is always more problematic than it looks like at first glance. This is especially true when you're dealing with any in-house components (which usually manage to show up somewhere).

2. A clean break where you leave the old VCS behind and archived is tempting, but it's rarely ideal in the long-term.

The old archive is likely to wind up being deleted/lost/bitrotted/etc after a year or two. Invariably, you wind up in a spot a few years down the line where it would be useful to have the full commit history, and the old VCS winds up being inaccessible. Ideally, you'd want to preserve as much history as possible when migrating. However, trying to correctly preserve commit history (and associated issue tracker info, etc) is always a time-sink, in my experience. It's easy for simple projects, and a real pain for complex projects with a weird, long history. Choose the time that you attempt it wisely.

---

Again, I'm not saying don't move, I'm just saying that it almost always winds up taking a lot of time and effort. I'd argue you're better off spending that time and effort on other portions of the project early on.

Also, things like git-svn can be real lifesavers in some of these cases, though they do add an extra layer of complexity. If you do want to use a different VCS, I'd take the git-svn/etc approach until you're sure there are no extra integration problems.
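
For the git-svn route, the starting point is a `git svn clone` of the existing repository. A small sketch that only assembles the command (it does not run it); the URL and directory name are placeholders, and `--stdlayout` assumes the repository uses the conventional trunk/branches/tags layout, which a long-lived legacy repo may not.

```python
def git_svn_clone_command(svn_url, target_dir, standard_layout=True):
    """Build the argv for a `git svn clone` of an existing SVN repository.

    Pass standard_layout=False if the repository does not follow the
    usual trunk/branches/tags structure.
    """
    command = ["git", "svn", "clone"]
    if standard_layout:
        command.append("--stdlayout")
    command += [svn_url, target_dir]
    return command


# Placeholder URL and directory, purely for illustration.
clone = git_svn_clone_command("https://svn.example.com/repos/legacy", "legacy-git")
print(" ".join(clone))
```

After the clone, `git svn rebase` pulls in new SVN commits and `git svn dcommit` pushes local commits back, so SVN stays the system of record and whatever is integrated with it keeps working.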

All that said, yeah, if there's no history and no integration with other systems/tools, go straight for something modern!



