Check In Early, Check In Often

BTW, I agree that you should check in often, and most people (including me, probably) don’t check in often enough. I simply think your idea of more than once a day, regardless of whether you have a useful, complete, tested feature-unit of code, is too much.

When I need to communicate with somebody else about the interface a feature should have, we’ve got whiteboards and email and lunchtime for that. Checking in empty stubs just clutters up the repository history.

I’ve seen classes in libraries that were versioned by name, i.e. class Foo is a typedef or other facade for Foo_1, Foo_2, or Foo_OtherImplementation. This means that you can completely redo the class without breaking old code. This same strategy can replace VCS branching if you don’t like your VCS’s branching.
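A minimal sketch of that name-versioning facade, here in Python rather than C++ (the `Foo_1`/`Foo_2` names are illustrative, as in the comment above):

```python
# Name-versioned implementations: callers only ever import Foo,
# while each rewrite lives alongside the previous one.

class Foo_1:
    """Original implementation, kept around so old history still reads."""
    def greet(self) -> str:
        return "hello from v1"

class Foo_2:
    """Complete rewrite that keeps the same public interface."""
    def greet(self) -> str:
        return "hello from v2"

# The facade: switching the whole codebase to the new implementation
# is a one-line change, much like a C++ typedef (using Foo = Foo_2;).
Foo = Foo_2
```

The trade-off versus a real branch is that both implementations must coexist in one tree, so the old one keeps compiling even while the new one is under construction.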

(But really, I’ve never found dealing with branches to be as hard as some people seem to find them… as long as people are conservative and sane about editing the source code files and don’t do dumb things like redo all the indentation or whatever…)

I like to follow these rules and have always tried to do so, and I encourage the same from others. It doesn’t always go without conflict with other practices, of course.

With a new team I’ve been working with for a little under two weeks now, I branched for my first task (combining three search paths into one) and got to work. I found my way around a new code base and got my work done in good time. I thought I was in a good place when I handed it over.

Turns out one of the other developers was meanwhile working on a set of incompatible changes to the original search path. When he finishes, now I have to try to integrate what he did into the new path I created.

We both checked in constantly, and followed all the good rules, and neither of us were the wiser.

The same is true of centralized version control systems. Developers are forced to take small steps and work closely with their fellow developers. You can’t go off in a corner and make a masterpiece of programming art. You have to work with everyone. Everyone sees everything you’re doing. You can’t hide.

I agree with David, centralized source control works great if you have good communication with your mates. We almost follow the practice exactly as he has mentioned.

As I like to say: It’s better to have a broken build in your working repository than a working build on your broken hard drive.

People who don’t commit often (or don’t understand why they would need to) usually have a less than ideal approach to writing code, in my opinion. If you naturally split your work into smaller parts, committing your changes after each part is the logical thing to do. However, if you do a million things at once and code away for three weeks before you even have anything that compiles, there is something fundamentally wrong with the way you are working. Simply telling those people to commit often won’t solve the underlying problem. They need to change the way they work in a more fundamental way. In my experience, this is usually caused by a lack of conceptual understanding of version control systems and the reasons they are being used in the first place. If someone doesn’t see the point in keeping the version control system up to date, they simply won’t do it.

Personally, I prefer to follow these basic guidelines when working with version control systems:

  1. Put everything under version control.
  2. Create sandbox home folders.
  3. Use a common project structure and naming convention.
  4. Commit often and in logical chunks.
  5. Write meaningful commit messages.
  6. Do all file operations in the version control system.
  7. Set up change notifications.

I wrote down my thoughts on these guidelines in my blog a few weeks ago:

http://blog.looplabel.net/2008/07/28/best-practices-for-version-control/

When it comes to the issue of breaking other people’s build by committing too often or too early, this can easily be solved by using branches. There is no reason why every developer can’t have their own branch if they want to or need to.

I really like the concept, and this short blog post made it very easy to picture the difference between trying to merge often with branch-based tools and a tool like Accurev:

http://daveonscm.blogspot.com/2007/09/agile-branches-vs-streams.html

Next little gem to check out is multi-stage continuous integration.

Mica

Files are guaranteed to get renamed or merged, and sometimes a file with the same name might even reappear later with different contents and function. With Git that might not be too big of a deal, but that would give our current revision control system fits.

But that’s for new development. The other common issue I have with things being broken for a long time is when I’m refactoring really messy code. A typical activity is to split a humongous file into two or more smaller files, and then spend the next several days trying to move the various other little bits around so that the result will compile.

Then you should take a really hard look at changing your version control system!

Nowadays it’s easier than ever…

I don’t think that DVCS is the entire answer. Part of the point of the post is that the changes are made available to other developers. With DVCS you still need to push to the main repository often. Obviously, if you are working on major build-breaking changes, then the disconnect from the main repository is welcome.

BTW, that is where DVCS shines in my opinion. I haven’t quite gotten a handle on why we aren’t all using DVCS: it offers all the benefits of traditional VCS plus some of the coolest features to ever come to version control.

In either case, new development or build breaking changes, a TEAM will cope because they are a team.

Both myself and my team are fans of the daily checkin. Saves us headache, and we all know what everyone is working on. It’s not a big hassle, and I definitely think it’s worth it.

As with anything, check-in early and often is just a best practice and isn’t applicable to all situations. The example of the device driver is a good one. Likely you’ll be the only developer working on this particular piece of code, so early and frequent merges of your code with the main branch aren’t that useful. The real benefit is in situations where multiple developers may be working on the same piece of code. In that situation it’s much less painful to merge your LAG code with the main branch in small doses than to do it all in one big lump. It’s a lot easier to manually merge a few small conflicts than an entire file of conflicts.

The other key thing here is having a continuous build that you can execute post-get but pre-check-in. In our current project our check-in dance goes like this:

  1. Do some work.
  2. Get latest.
  3. Execute a local one-click build which compiles, configures, runs tests, etc.
  4. On a successful build, check in.
  5. Any check-in initiates another automated build in the background and tests against an integrated environment.
  6. Handle broken builds IMMEDIATELY!!
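The dance above can be sketched as a small gating script: run each step in order and stop at the first failure, so a broken local build never reaches the shared repository. The commands below are placeholders (stand-ins for whatever your VCS and build system actually use), not a real project’s configuration:

```python
import subprocess
import sys

def run_step(name: str, cmd: list) -> bool:
    """Run one step of the check-in dance; report success or failure."""
    result = subprocess.run(cmd)
    ok = result.returncode == 0
    print(f"{name}: {'ok' if ok else 'FAILED'}")
    return ok

def checkin_dance(steps: list) -> bool:
    """Execute (name, command) steps in order; abort on first failure
    so 'check in' only ever runs after a clean local build."""
    for name, cmd in steps:
        if not run_step(name, cmd):
            return False
    return True

# Placeholder commands -- substitute your real VCS/build invocations,
# e.g. svn update, make check, svn commit.
steps = [
    ("get latest", [sys.executable, "-c", "pass"]),
    ("local one-click build + tests", [sys.executable, "-c", "pass"]),
    ("check in", [sys.executable, "-c", "pass"]),
]
```

Step 5 (the background integration build) then runs server-side on every commit, which this local script deliberately doesn’t model.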

The funniest thing is how much work we lose, simply by not checking things in often.

With an SCM that makes branching very easy (e.g. git, though this is orthogonal to distributed vs centralized), it’s even possible to commit broken code, just in another branch.

In several projects I’ve worked on, experimental side-branches frequently didn’t build for a long time, but it was possible to check them out and see what people were working on and thinking about.

I agree, I agree. I usually set up a personal Subversion repository to cover my butt, and commit my source to the enterprise version control system when I’m ready to integrate.

Regards!

You really love Code Complete, don’t you? I love it too.

Gee. We used to call that stepwise refinement.

These aren’t backups. They are commits. They may be commits of works in progress, skeletons, etc., but they are you -committing- and annotating artifacts of your process.

I agree. That’s what I’m saying I see no value in.

These commits offer visibility to other members of your team, a record of your thinking,

Perhaps you’d care to elaborate on why this is helpful? Perhaps a situation where you’ve seen it come in handy? Personally, I find my fellow developer’s working code to be hard enough to read. I can’t imagine ever wanting to troll through their non-working code.

and, as I mentioned above, a safety net allowing you to roll back to a state you previously explicitly declared useful.

Yes, but the way I do it is a much better safety net. I get every old version of my file saved automatically, whether I thought it was important at the time or not. This is better because we aren’t always the best judge of when we need to save off a fallback position. When you need to go back, you have to have guessed right (and thought to check in at the right time). I have it no matter what.

When you have a well-appointed development infrastructure, these interim commits offer a basis for automated testing tied to checkins.

That might be nice. However, my environment cross-compiles, and I’m typically working on device drivers. There’s no way to create useful automated tests.

I’m still not convinced it would be that useful either, though. If I thought my code was good enough to pass all the automated tests, I’d probably be checking it in anyway. Alternatively, if I know it’s not going to pass, wouldn’t it just cause problems to check it in and break the tests?

The issue I don’t see anybody addressing is what to do about those times I work days, or even (occasionally) weeks on a load that doesn’t compile. Perhaps I’m just slower than y’all, so this only comes up for me?

I do recommend use of developer branches if you’re using a centralised version control system. (We use Perforce.) That way it’s always safe to check code into your branch, because you know it won’t cause problems for anyone else. It also makes it easier to control how others’ changes are introduced to your code. If you have a bunch of files open for edit in the trunk and somebody drops a huge set of changes on trunk, it can be very difficult to sync and resolve those changes with your changelist, and your changelist becomes quite complicated. However, if you’re working on your own branch, you can check in your changes when you’re finished, then handle the integration from trunk into your branch cleanly and separately, before finally (and trivially) propagating back to the trunk.

@T.E.D.
The issue I don’t see anybody addressing is what to do about those times I work days, or even (occasionally) weeks on a load that doesn’t compile. Perhaps I’m just slower than y’all, so this only comes up for me?

The idea is that you are not supposed to work for weeks without being able to compile.

My solution would be to make the code compile more often, i.e. by splitting the work into smaller parts. Note that a successful compile does not mean that it has to do anything meaningful, you could implement stubs and empty interfaces for the pieces that don’t have any useful code yet.
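As an illustration of that stub-and-empty-interface approach (the names here are made up for the example, not from any real codebase):

```python
from typing import Optional

class SearchPath:
    """Interface agreed with the team; the real logic comes later.
    This imports cleanly and is safe to commit right away."""

    def lookup(self, name: str) -> Optional[str]:
        # Stub: keeps the build green even though the real lookup
        # logic isn't written yet.
        raise NotImplementedError("lookup not implemented yet")

class CombinedSearchPath(SearchPath):
    """Work in progress: starts life as a trivial placeholder."""

    def lookup(self, name: str) -> Optional[str]:
        return None  # placeholder result until the real merge logic lands
```

Each small step that replaces a stub with real behavior is then a natural, compilable commit.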

When working with auto-compiling environments, such as Eclipse, this is not really an issue most of the time anyway.

I just use a nightly backup script that copies any files marked modified compared to the repository to a shared network location that is in turn backed up nightly to permanent media.
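A rough sketch of that kind of nightly sweep, as a hypothetical script rather than the commenter’s actual one; it uses a modification-time cutoff as a stand-in for the real test, which would ask the VCS (e.g. `svn status`) which files differ from the repository:

```python
import shutil
from pathlib import Path

def backup_modified(workdir: Path, backup_dir: Path, since: float) -> list:
    """Copy files changed after timestamp `since` into backup_dir,
    preserving their relative paths. Returns the copied destinations."""
    copied = []
    for src in workdir.rglob("*"):
        if src.is_file() and src.stat().st_mtime > since:
            dest = backup_dir / src.relative_to(workdir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 preserves timestamps
            copied.append(dest)
    return copied
```

Run nightly (cron, Task Scheduler, etc.) against a shared network location, this gives the no-matter-what fallback described above without touching the repository itself.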

Personally, I find fighting against the constant stream of implementation bugs from half-finished code to be a perpetual headache.

I have no problem with people checking into a repository branch, but checking tiny little changes into the trunk causes everyone else to have to constantly check out, resolve conflicts, and rebuild their code; and if it is within the section of code that everything else depends on, this can cause a huge amount of time to be spent in almost perpetual rebuilding of the entire source tree. Not very pleasant.

Just saw The Shining again last night. There wouldn’t have been any problems if Jack Nicholson had checked in his writing every day; then Shelley Duvall would have noticed that it was all “All work and no play makes Jack a dull boy” before they were snowed in.