Problems with Branching by Feature
Back in January I wrote a blog post on Branching by Feature. In this article I want to talk about some problems with branching by feature and about some other approaches. We’ve been branching by feature since a bit before that post was written, and although some things are working out, there are definitely some disadvantages.
Becoming Overly Cautious
The idea of branching by feature is that features aren’t committed into the trunk until they are complete and fully tested. This sounds great, but it leads to some bad behaviors. People get overly cautious about merging their feature back into the trunk. They want a perfect merge, so they keep delaying it. Further, the people using the trunk tend to resist merges and keep delaying them, asking for more testing or more reviews.
This then leads to all the features being off on separate branches. But what if you are doing something that requires two of these features? You have no way to get them together. Meanwhile you have build servers continuously building all these branches, automated testing servers testing them, and various other infrastructure tied up.
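To give a feel for how quickly this adds up: with n outstanding feature branches there are n(n-1)/2 pairs of features that have never been built or tested together. Here is a rough sketch in Python; the branch names are made up purely for illustration.

```python
from itertools import combinations


def unintegrated_pairs(branches):
    """Return every pair of feature branches that has never been combined."""
    return list(combinations(branches, 2))


# Hypothetical branch names, for illustration only.
branches = ["feature-a", "feature-b", "feature-c", "feature-d"]

for left, right in unintegrated_pairs(branches):
    print(f"{left} and {right} have never been built together")

print(len(unintegrated_pairs(branches)), "untested pairs")
```

Four branches already give six untested pairs, and ten branches give forty-five, which is why the problem seems to grow faster than the number of branches.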
Lack of Continuous Integration
To me the main drawback of branch by feature is a lack of continuous integration. Any bad interactions between the outstanding features are not found until much later. A lot of times when these problems are found, people claim it’s due to a lack of testing on the branch and that it was merged too early. But generally these problems couldn’t be discovered until the merge happens.
I tend to find that branch by feature just delays integration testing so long that a lot of serious problems are found much later than they should be. Generally, the longer the gap between when a bug is introduced and when it’s discovered, the harder and more disruptive it is to fix. If things are left too long then people have moved on to other projects and don’t like the distraction of going back.
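As a made-up illustration of a problem that no amount of testing on the branch can find, imagine two features that each add tax handling in a different place. This is only a sketch: the functions, the amounts and the tax rate are all hypothetical and are not taken from any real product code.

```python
TAX_PERCENT = 10  # hypothetical tax rate


# Trunk: no tax handling anywhere yet. Amounts are in cents.
def price_on_trunk(base_cents):
    return base_cents


def total_on_trunk(price_cents):
    return price_cents


# Feature A (hypothetical): stored prices now include tax.
def price_with_feature_a(base_cents):
    return base_cents * (100 + TAX_PERCENT) // 100


# Feature B (hypothetical): invoice totals add tax, assuming prices exclude it.
def total_with_feature_b(price_cents):
    return price_cents * (100 + TAX_PERCENT) // 100


base = 10000  # $100.00

# Each branch only sees its own feature on top of trunk, so its tests pass.
branch_a_total = total_on_trunk(price_with_feature_a(base))
branch_b_total = total_with_feature_b(price_on_trunk(base))

# Only after both features are merged does the double taxation show up.
merged_total = total_with_feature_b(price_with_feature_a(base))

print(branch_a_total, branch_b_total, merged_total)
```

Both branches produce the expected 11000 cents on their own, but the merged code produces 12100. Neither branch was under-tested; the bug simply did not exist until the two features met.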
Reducing Merge Hell
Another problem is that the longer things remain on branches, the more work it is to merge them back into the trunk. You can minimize this by continuously merging the trunk back into your branch. But then you have the overhead of continuously managing any conflicts. Plus there is a lot of room for error in this process. Every time you resolve conflicts in a merge, you risk making a mistake and erasing someone else’s changes or introducing a bug.
In the extreme case you end up having to resolve hundreds of merge conflicts, which always leads to errors and, worse, to some pretty heated conflicts between the people or teams that have messed up each other’s code.
An ERP package like Sage 300 is composed of many modules such as G/L, A/P, A/R, I/C, O/E, P/O, etc. We can source control the whole thing in one repository. This has the advantage that you get everything you need by extracting the one repository, which can be quite convenient. However, when you create branches and then merge them back in, you run the risk of conflicting with things that you didn’t expect. Often people just resolve the conflicts they don’t understand in favor of their own changes. This usually overwrites someone else’s work.
Currently we have everything together in one big repository, but we are going to break it up into separate repositories, one for each Accounting Module, one for System Manager, one for language translated strings, one for documentation, etc. This way we reduce the number of branches we need and we also reduce the danger of affecting things on too global a scale.
Reducing the number of branches greatly reduces the amount of complexity in the whole process. It also simplifies the process of merging features into the trunk.
This does mean there may be features that can’t be committed atomically as one transaction; they will require two commits in two separate repositories. However, we feel that keeping down the scope of the commits is more important than strictly maintaining this atomicity.
You could do your branches at a lower level in the source code tree, but then if you find you need something elsewhere, it’s a fair bit of work to re-branch. This generally leads to people just including everything in their branch which then leads to all the merge problems.
Merge to Trunk Quicker
We are also working to get the different groups and teams to merge features back into trunk much quicker. We make it clear that we do expect to find problems, but that we want to find these earlier and have no expectation that all problems be found and fixed on feature branches. This way we can run the main full set of automated tests off of trunk and don’t need farms of automated test machines testing every branch.
Also, the consumers of these features will get them all together sooner, ready to use off the main builds. This way you won’t need to install multiple branch builds to test multiple features.
As we move more and more into a cloud mode of software delivery, major releases become a thing of the past. In fact for cloud services we don’t really talk about releases anymore. We really just have a live service that is being continuously updated. To do this we need a good continuous delivery infrastructure and a reliable mechanism to develop individual features and merge them to trunk for immediate integration, full automated testing and then deployment to the live cloud service.
As we’ve been on this journey we have kept tweaking our branching and build procedures to achieve this goal. Our first branching strategy was progress, but it still led to a lot of problems that we are addressing with this new strategy.