Software Branching and Parallel Universes

We use Perforce at my shop, and have a somewhat unique setup due to the nature of our product (embedded software in lots of hardware). We’ve got a few configurations, all similar, all for different customers, and the customers are the type that want to know exactly what changed if they get a new version, and don’t want anything too different from the last one, even if it’s “better”.

We rely on branching to let us do our development for a specific project on a “development” branch, maintain a “release” branch for each project, and have a “mainline” where we try to collect all the latest things from every single project, so that when we start a new project, we have the sum total of all the newest stuff that we’ve done.

This relies on good developers communicating, and on making sure that they’re always adding features: Feature B gets added to module 1, so module 1 has Features A and B, rather than module 1 being changed to do B instead of A.

Our revision graphs in Perforce are truly amazing things to behold, but it works well, and we’ve supported numerous products at varying states of release on different hardware, with a small number of developers and a minimum of chaos.
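For readers who don’t use Perforce, a minimal sketch of the dev-to-mainline integration described above might look like this (depot paths are hypothetical; this is an illustration, not their actual setup):

    # Hypothetical sketch: copying a project's finished work up to the
    # shared mainline with Perforce. Depot paths are made up.
    import subprocess

    def p4(*args):
        subprocess.run(["p4", *args], check=True)

    # Schedule the integration from the project's dev branch into the mainline,
    p4("integrate", "//depot/projectX/dev/...", "//depot/mainline/...")
    # auto-merge the files with no conflicts (the rest are resolved by hand),
    p4("resolve", "-am")
    # and submit the result as a single changelist.
    p4("submit", "-d", "Integrate projectX work into mainline")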

“Eclipse has had local history for years, and Netbeans 6 does as well. :)”

It helps, but it’s not as helpful as a local repository (centralized or distributed). It doesn’t help smooth painful merges or give anyone else the ability to view your change history.

Yes, local history is not a replacement for Git (or other distributed version control solutions). But my point was that local code can and should be versioned in some way.

It’s amazing how convenient being able to look over the diffs in your code from today can be. :)

Using an automated build tool can help ease the pain of using branches. It is also important to merge often, not based on time, but on the amount of coding being done.

We’ve previously explored the topic of branching but unfortunately have not come across a concrete strategy, due to the nature of our development; this post has certainly got us thinking, probably along the task-oriented lines.

I would ask though, if anyone has any advice or experience on branching strategy for an example situation:
10+ web sites, each using common core functionality across many class libraries. When a change request comes in for WebsiteA and it is branched (let’s say WebsiteA-Change1056) a task-branch works great.
Our strategy has faltered when, as part of the change, five of the common libraries also have to be updated and improved… they need a branch too in this scenario… but this could occur for ALL changes and could very easily get ugly.

Defining a good, consistent branching strategy seems harder when you’re right at the start and currently using an unbroken line of development on shared common code.

Jeff, you forgot to include the “Mainline Model”, Perforce’s recommended branching practice:

http://www.perforce.com/perforce/bestpractices.html

It’s like your “Branch per Release” (which Perforce calls the “Promotion Model”), except the branch you create is an archive of the released Version N. The Mainline marches ever forward to Version N+1.

And as Mike Dimmick mentioned above, you can label releases on your Mainline. You can branch retroactively only if you actually need to create a hotfix or service pack (e.g. Version 1.0.1).

The Mainline Model is also nice because you never have to freeze the Mainline. When you have a Release Candidate, you can label/branch it and continue fixing bugs in the Mainline. If you find critical bugs in the Release Candidate, you can fix them in that branch.
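A minimal sketch of those steps as Perforce commands (depot paths and label names are hypothetical):

    # Hypothetical sketch of the Mainline Model steps described above.
    import subprocess

    def p4(*args):
        subprocess.run(["p4", *args], check=True)

    # Label the release candidate on the mainline so those exact revisions
    # can be found again later.
    p4("tag", "-l", "rel-1.0", "//depot/mainline/...")

    # When 1.0 ships, branch an archive of it from that label; the mainline
    # keeps marching toward 1.1.
    p4("integrate", "//depot/mainline/...@rel-1.0", "//depot/release/1.0/...")
    p4("submit", "-d", "Archive branch for released version 1.0")

    # A hotfix (e.g. 1.0.1) is then made in //depot/release/1.0/... and, if
    # it still applies, integrated back into the mainline the same way.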

You hate Branching And Merging

But You Can’t Live Without Branching And Merging.

It’s a black art, and every software project needs it. It’s very difficult to get right, because it’s difficult to forecast what a project will look like in a year. The complexity increases with the number of products × the number of separate branches needed × the number of releases that need to be maintained × the number of components shared between branches/products.

=== Branch/Merge Hell.

Kashif

Re: biscuit’s question about branching multiple common libraries for a single change.

In our organization, we run into a similar issue. We follow a mainline model, and branch-per-task.

The way we have addressed it is to branch all of the libraries which might need changing into a working folder together with the main application.

Thus, we might take code from multiple trunks:
/cpp/lib/lib1/trunk
/dotnet/lib/lib2/trunk
/dotnet/app/MyApp/trunk

and branch each one to a working folder:
/dotnet/app/MyApp/branches/mybranch/lib1
/dotnet/app/MyApp/branches/mybranch/lib2
/dotnet/app/MyApp/branches/mybranch/MyApp

Thus, any work that requires changing multiple libraries can be handled in a single branch of the app, and merged back to its multiple trunk locations when the branch work is done.

We use an ant script per app to take care of coordinating branching/merging the required libraries along with the app.
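(The ant script itself isn’t shown here; purely as an illustration, the branch-creation half might look roughly like this in Python, assuming the trunk/branches layout above is Subversion and using a hypothetical repository URL.)

    # Rough sketch: branch the app and its libraries into one working folder.
    # Repository URL and branch name are hypothetical.
    import subprocess

    REPO = "https://svn.example.com/repo"
    BRANCH = f"{REPO}/dotnet/app/MyApp/branches/mybranch"

    def svn(*args):
        subprocess.run(["svn", *args], check=True)

    # Copy the app plus every library the change might touch into a single
    # branch folder; merging back later goes the other way, trunk by trunk.
    for src, name in [
        ("/cpp/lib/lib1/trunk", "lib1"),
        ("/dotnet/lib/lib2/trunk", "lib2"),
        ("/dotnet/app/MyApp/trunk", "MyApp"),
    ]:
        svn("copy", "--parents", REPO + src, f"{BRANCH}/{name}",
            "-m", f"Branch {name} into mybranch")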

Since most changes don’t require modifying the common libraries, it means we branch more than we probably need to, but it allows us to change multiple projects as part of a single change set when we need to.

I’d say a better term for Big Bang is Big Crunch (http://en.wikipedia.org/wiki/Big_Crunch). Here you can draw an analogy between the size of the universe and the number of branches in existence.

How exactly do these distributed version control systems make merging easier, though? I think merging two people’s independent work is hard no matter how you slice it. This sounds awfully silver bullet-ish to me. Can someone explain this better?

@Jeff, did anyone ever explain this to you? I’ve watched Linus’ git video, and the Mercurial google talk too. I’ve also used Mercurial locally, but not with other developers. I’m starting to see the benefit and paradigm shift of DVCS, but I’d like to hear an experienced person’s explanation. Make me a believer!

In the company I previously worked at, we were using VSS. It was OK for small projects, and we typically had one database per project. And we had to develop so many peripheral tools (bug tracking, daily building, version building, etc.).

Where I work now, we have been using TFS for 2 years. We tested SVN and even began the deployment before shifting into reverse and going for TFS. SVN with TortoiseSVN was not bad at all, just not as completely integrated as TFS. On the other hand, TFS is costly (much $$$), but I guess not more than Perforce or other commercial offerings.

Also, I am now using branches with TFS. Every major version gets its own branch from dev (1.0, 1.1, 1.2, 2.0). Minor revisions go in the same branches (1.1R0+1.1R1+1.1R2, 1.2R0+1.2R1). We found this a good balance between branches and merges. It saved us a few times when the dev root had many new features not ready for any kind of release, but we still needed a version NOW with a bug fix or a small new thing.

The coolest thing with TFS is that it keeps track of what was changed and not yet merged. I had complex merge sequences and it was still able to keep track of them. To be honest, I should say that once TFS lost its ‘pointers’ and lost track of my merge state… but I was really doing scary and dubious things, so I guess I deserved it.

For the first 30-40 merges I was paranoid and checked what TFS did to the files. Now I am more confident. I typically review only merges that had conflict resolution.

The other nice thing with TFS is that it integrates bug tracking and development tasks. I can find which merges are required for which bug, which is great. And NO, TFS is not based on VSS. This is all new stuff.

We use ClearCase. We have a main branch used for releases. When we start a project phase, we create a new branch for it. The phase branches are merged to the main branch for each release. Every software change (enhancement, bug fix, etc) is done in a separate development branch and merged to the phase branch.

Although we do have some merging problems (mainly when a dev branch has been in use for too long), this setup definitely works fine for us and provides enough isolation and flexibility for concurrent programming.

We manage to keep everything under control (and to keep antipatterns off) using these techniques:

-Automated branch generation. This enforces a naming convention and keeps the branches linked to the change request that originated them. We don’t have mysterious branches, and we always know what a branch is for and who is working on it.

-Automated builds and merges. The build includes automated unit and integration tests, and a merge is done at the end automatically if everything is OK. Nothing gets merged if the build breaks, and that prevents the problems caused by programmer mistakes.

-Automated releases. Everything is locked and tagged automatically.

-Each project has an administrator. This is not a full-time job; I would say it takes about 20% of his/her time, and we rotate that responsibility around. The job is basically to serialize the tasks that require it, mainly the releases, and to move along whatever cannot be automated in the life cycle of change requests.

-This is a very important point: our change requests are usually very limited in scope. We have many change requests but each is small. One of the positive consequences this has is that most merges are automatic.

I am not saying this is a perfect setup and of course I don’t think it is the best for all situations. Nevertheless, it has a bunch of ideas that could be of general use because they keep the complexity in check. These ideas could be summarized as:

-Reduce the universe of options while taking advantage of source control branching.
-Have a consistent branch-merge cycle.

I was not involved in creating this setup and I understand it took a lot of effort to get it to this point.

Don’t let changes accumulate before you merge.

Brilliant stuff!

Good article.

I’m surprised to see only one reference to AccuRev. Streams are a bit different to think about at first, but well worth the effort, as they make branching/merging dead simple. Subversion is IMHO not suited to parallel development and should be avoided if you anticipate ever needing more than one line of development.

Mike

Even for long-lived branches (for example, a feature branch for an experimental feature which takes a number of commits/revisions and a long time to develop), one can avoid the Big Bang Merge in at least one of these ways (a short Git sketch follows the list):

  1. By merging the trunk (mainline) branch into the feature branch, resolving conflicts if needed; if the version control system makes it possible, it is sometimes better to do a trial merge, i.e. either discard the merge entirely, or record it as a ‘mainline compatibility fixup’ commit rather than as a proper merge.

  2. By doing a trial merge of the feature branch into the trunk and discarding it later; the SCM must make it possible to correct or remove the latest, not-yet-published commit.

  3. By rebasing the feature branch on top of the mainline, as if it were a series of patches applied on top of the new mainline (see the description of git-rebase in the Git documentation, for example).
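A hypothetical sketch of these three options as Git commands (branch names are made up):

    # Hypothetical sketch of the three options above; branch names are made up.
    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    # 1. Merge the mainline into the long-lived feature branch early and often,
    #    so conflicts get resolved while they are still small.
    git("checkout", "feature")
    git("merge", "master")

    # 2. Do a trial merge of the feature branch into the trunk, then throw it
    #    away (conflicts are expected here, so don't treat them as an error).
    git("checkout", "master")
    subprocess.run(["git", "merge", "--no-commit", "--no-ff", "feature"])
    git("merge", "--abort")

    # 3. Rebase: replay the feature branch as a series of patches on top of
    #    the current mainline.
    git("checkout", "feature")
    git("rebase", "master")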

If someone says that branching is bad, then perhaps the SCM they used does not make branching (and merging) easy.

I’m in a similar boat to Mike Dimmick. Four developers, all sharing a very large C code base for a financial application: 85% legacy code (15 years old), 15% “newer” code.

Anything experimental isn’t checked in until it’s at least proven harmless to the build and the basic functionality of the code. We are frequently in each other’s code working on overlapping modules and features.

Long-term development projects are either done alongside of existing features and not “activated” until needed or are replacements for existing functionality and don’t interfere until configured. Testing is very thorough and automated and done once a month whether needed or not – though all done at the application level. Hardly any unit tests exist.

Once a month, on Staging Day, we branch into Release X from Main. Then the automated tests are run against Release X. If there are bugs found in the Release, we patch it, confirm the fix, and then merge just that change back into the Main branch. When the Release is delivered, we basically abandon that branch forever unless an emergency fix is needed during the month (rare).
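(The version control tool isn’t named above; purely as an illustration, the monthly cycle with Subversion-style commands, hypothetical URLs, and a made-up revision number might look like this.)

    # Hypothetical sketch of Staging Day and a release-branch bug fix.
    import subprocess

    REPO = "https://svn.example.com/repo"

    def svn(*args):
        subprocess.run(["svn", *args], check=True)

    # Staging Day: branch Release X from Main, then point the automated
    # tests at a checkout of the new branch.
    svn("copy", f"{REPO}/Main", f"{REPO}/Releases/ReleaseX",
        "-m", "Staging Day: branch Release X from Main")

    # A bug fixed on the release branch (say in revision 1234) is merged, as
    # just that one change, back into a working copy of Main and committed.
    svn("merge", "-c", "1234", f"{REPO}/Releases/ReleaseX", "main-working-copy")
    svn("commit", "main-working-copy", "-m", "Merge release fix r1234 back to Main")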

We’ve done 60 releases over 6 years so far without problems. It’s an effective system for a shop our size.

Those alternate timeline branches are child’s play compared to Primer (http://www.imdb.com/title/tt0390384/). Check out this timeline chart: http://www.freeweb.hu/neuwanstein/primer_timeline.jpg

I’m going to go out on a limb and say that’s not too huge of a spoiler, because it’s really hard to follow even if you’ve seen the movie. The cool thing about it is that the writer thought through the pesky time-travel questions most people gloss over. Example: if you went back in time 24 hours, you’d be floating somewhere in outer space, since the earth would have moved on.

As to software branching, it got to be a big deal at a previous job at a financial company. We used StarTeam, which was pretty hard to use but did handle branching well. I agree with everyone else who’s suggested Subversion - it’s by far the best source control I’ve ever used, and just happens to be free.

Branching scenarios can get complex enough that they require a substantial amount of time and planning. If you count all the time required to handle different branches (including meeting hours), I’d say that our development team of ~30 devoted the equivalent of 2 or 3 full-time resources to managing it.

There are many times when it’s preferable to work off the same branch and use configuration, resource dependencies, and other software tools to solve the problem rather than trying to branch and merge via source control.

Oh, more lightweight source control solutions like tagging (http://svnbook.red-bean.com/en/1.1/ch04s06.html) and shelving (http://blogs.vertigosoftware.com/teamsystem/archive/2006/01/18/Shelving_and_Branching.aspx - on your Vertigo blog) are often preferable to making an entire branch to separate a feature.
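For example, a tag in Subversion is just a cheap copy (URLs hypothetical), so it marks a point in history without creating another line of development to maintain; shelving is TFS-specific and is described in the linked post.

    # Hypothetical sketch of tagging as a lightweight alternative to branching.
    import subprocess

    REPO = "https://svn.example.com/repo"

    subprocess.run(
        ["svn", "copy", f"{REPO}/trunk", f"{REPO}/tags/release-1.0",
         "-m", "Tag release 1.0"],
        check=True,
    )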

I think the real trick here is knowing when to branch and merge. Branching works great; it’s always the merge that gets you. Try to keep developers from working on the same code files across branches, if possible. If not, the automerge sometimes fails and a manual merge is required. This can be painful, especially if you are doing the manual merge and did not implement the changes. Even for manual merges, decent tools exist to help you identify and merge changes (GUIFFY, FileDiff, etc.).

Jeff: merging gets hard when the merges are big. Small merges are almost always easy, because there’s little there to conflict. Without history in a local checkout, you have a big pile of changes that get applied at once (pulling from the mainline into your local checkout is essentially merging the mainline’s changes into your changes). Most distributed version control systems take the tack of making the local checkout a full-blown repository, giving you all the revision control goodness for local changes before they’re committed to the main line. This encourages you to break the work you’re doing into individual changes, so only the actual conflicts conflict, rather than accidental ones. But even without this history, the big merge is just a degenerate case of a single change getting merged.
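As a concrete (hypothetical) illustration of that workflow with Git: record each logical change as its own small commit in the local repository, and pull the mainline in often, so every merge stays small.

    # Hypothetical sketch: small local commits plus frequent small merges.
    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    git("checkout", "-b", "my-work")          # local branch nobody else sees yet
    git("commit", "-am", "Small change 1")    # each logical change is one commit
    git("commit", "-am", "Small change 2")

    git("fetch", "origin")                    # bring in the mainline's new history
    git("merge", "origin/master")             # merge while the divergence is small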