Using long-lived stable branches

For the last couple of years I’ve been using Subversion on all the commercial software projects I’ve worked on. At Joost, and after that at the BBC, we’ve usually used long-lived stable branches for most of the codebases. Since I can’t find a good explanation of the pattern online, I thought I’d write up the basics.

Working on trunk

Imagine a brand new software project. There are two developers: Bob and Fred. They create a new project with a new trunk and happily code away for a while:

Example flow diagram of two developers committing to trunk

Stable branch for release

Flow diagram of two developers creating a stable branch to cut releases

At some point (after r7, in fact) the project is ready to start getting some QA, and it’s Bob’s job to cut a first release and get it to the QA team. Bob creates the new stable branch (svn cp -r7 ../trunk ../branches/stable, resulting in r8). Then he fixes one last thing (r9), which he merges to stable (using svnmerge, r10). (Not paying much attention to the release work, Fred has continued working and fixed a bug in r11.) Bob then makes a tag of the stable branch (svn cp branches/stable tags/1.0.0, r12) to create the first release.

QA reproduce the bug Fred has already fixed, so Bob merges that change to stable (r14) and tags 1.0.1 (r15). 1.0.1 passes all tests and is eventually deployed to live.

Release branch for maintenance

Flow diagram of creating a release branch for hosting a bug fix

A few weeks later, a problem is found on the live environment. Since it looks like a serious problem, Bob and Fred both drop what they were doing (working on the 1.1 release) and hook up on IRC to troubleshoot. Fred finds the bug and commits the fix to trunk (r52), tells Bob on IRC, and then continues hacking away at 1.1 (r55). Bob merges the fix to stable (r53) and makes the first 1.1 release (1.1.0, r54) so that QA can verify the bug is fixed. It turns out Fred did fix the bug, so Bob creates a new release branch for the 1.0 series (r56), merges the fix to the 1.0 release branch (r57) and tags a new release 1.0.2 (r58). QA run regression tests on 1.0.2 and tests for the production bug. All seems ok so 1.0.2 is rolled to live.

Interaction with continuous integration

Flow diagram showing what continuous integration projects use what branch

Every commit on trunk may trigger a trunk build. The trunk build has a stable period of just a few minutes. Every successful trunk build may trigger an integration deploy. The integration deploy has a longer stable period, about an hour or two. It is also frequently triggered manually when an integration deploy failed or deployed broken software.

Ideally the integration deploy takes the artifacts from the latest successful trunk build and deploys those, but due to the way maven projects are frequently set up it may have to rebuild trunk before deploying it.

Every merge to stable may trigger a stable build. The stable build also has a stable period of just a few minutes, but it doesn’t run as frequently as the trunk build, simply because merges are not done as frequently as trunk commits. The test deploy is not automatic – an explicit decision is made to deploy to the test environment, and typically a specific version or svn revision is deployed.

Reflections

Main benefits of this approach

  • Reasonably easy to understand (even for the average java weenie that’s a little scared of merging, or the tester that doesn’t touch version control at all).
  • Controlled release process.
  • Development (on trunk) never stops, so that there is usually no need for feature branches (though you can still use them if you need to) and communication overhead between developers is limited.
  • Subversion commit history tells the story of what actually happened reasonably well.

Why just one stable?

A lot of people seeing this might expect a 1.0-STABLE, a 1.1-STABLE, and so forth. The BSDs and Mozilla do things that way, for example. The reason not to have those comes down to tool support – with a typical svn / maven / hudson / jira toolchain, branching is not quite as cheap as you’d like it to be, especially on large crufty java projects. It’s simpler to work with just one stable branch, and you can often get away with it.

From a communication perspective it’s also just slightly easier this way – rather than talk about “the current stable branch” or “the 1.0 stable branch”, you can just say “the stable branch” (or “merge to stable”) and it is never ambiguous.

Why a long-lived stable?

In the example above, Bob and Fred have continued to evolve stable as they worked on the 1.1 release series – for example, we can see that Bob merged r46, r47 and r49 to stable. When continuously integrating on trunk, it’s quite common to see a lot of commits to trunk that in retrospect are best grouped together and considered a single logical change set. By identifying and merging those change sets early on, the history on stable tells a neat story of what features were code complete when, and it allows you to provide QA with reasonably stable code drops early on.

This is usually not quite cherry-picking — it’s more like melon-picking, where related chunks of code are kept out of stable for a while and then merged as they become stable. The more coarse-grained chunking tends to be rather necessary on “agile” java projects where there can be a lot of refactoring, which tends to make merging hard.

Why not just release from trunk?

The simplest model does not have a stable branch, and it simply cuts 1.0.0 / 1.0.1 / 1.1.0 from trunk. When a maintenance problem presents itself, you then branch from the tag for 1.0.2.

The challenge with this approach is sort-of shown in these examples — Fred’s commit r13 should not make it into 1.0.1. By using a long-lived stable branch Bob can essentially avoid creating the 1.0 maintenance branch. It doesn’t look like there’s a benefit here, but when you consider 1.1, 1.2, 1.3, and so forth, it starts to matter.

The alternative trunk-only approach (telling Fred to hold off committing r13 until 1.0 is in production) is absolutely horrible for what are hopefully obvious reasons, and I will shout at you if you suggest it to me.

For small and/or mature projects I do often revert back to having just a trunk. When you have high quality code that’s evolving in a controlled fashion, with small incremental changes that are released frequently, the need to do maintenance fixes becomes very rare and you can pick up some speed by not having a stable branch.

What about developing on stable?

It’s important to limit commits (rather than merges) that go directly to stable to an absolute minimum. By always committing to trunk first, you ensure that the latest version of the codebase really has all the latest features and bugfixes. Secondly, merging in just one direction greatly simplifies merge management and helps avoid conflicts. That’s relatively important with subversion because its ability to untangle complex merge trees without help is still a bit limited.

But, but, this is all massively inferior to distributed version control!

From an expert coder’s perspective, definitely.

For a team that incorporates people that are not all that used to version control and working with multiple parallel versions of a code base, this is very close to the limit of what can be understood and communicated. Since 80% of the cost of a typical (commercial) software project has nothing to do with coding, that’s a very significant argument. The expert coders just have to suck it up and sacrifice some productivity for the benefit of everyone else.

So the typical stance I end up taking is that those expert coders can use git-svn to get most of what they need, and they assume responsibility for transforming their own many-branches view back to a trunk+stable model for consumption by everyone else. This is quite annoying when you have three expert coders that really want to use git together. I’ve not found a good solution for that scenario; the cost of setting up decent server-side git hosting is quite difficult to justify, even when you’re not constrained by auditability rules.

But, but this is a lot of work!

Usually when explaining this model to a new group of developers, they realize at some point that someone (like Bob) or some people will have to do the work of merging changes from trunk to stable, and that the tool support for stuff like that is a bit limited. They’ll also need extra hudson builds, and they’ll worry a great deal about how on earth to deal with maven’s need to have the version number inside the pom.xml file.

To many teams it just seems easier to avoid all this branching mess altogether, and instead they will just be extra good at their TDD and their agile skills. Surely it isn’t that much of a problem to avoid committing for a few hours and working on your local copy while people are sorting out how to bake a release with the right code in it. Right?

The resolution usually comes from the project managers, release managers, product managers, and testers. In service-oriented architecture setups it can also come from other developers. All those stakeholders quickly realize that all this extra work that the developers don’t really want to do is exactly the work that they do want the developers to do. They can see that if the developers spend some extra effort as they go along to think about what is “stable” and what isn’t, the chance of getting a decent code drop goes up.

PUT ♥ MySQL

MySQL has a non-standard extension called INSERT ... ON DUPLICATE KEY UPDATE. When you’re building a RESTful application, it really rocks, because you can use it quite easily to have just the right PUT semantics. For example, you can have queries such as

INSERT INTO Book
           (id, deleted, a, b, c)
    VALUES (#id#, 0, #a#, #b#, #c#)
    ON DUPLICATE KEY UPDATE
        a = #a#, b = #b#, c=#c#,
        deleted = 0,
        revision = revision + 1

which you might call from java (example uses iBatis) using something like

    /* PUT a book, return an HTTP status code or throw an exception */
    public int putBook(final Book book) throws SQLException {
        final int affected = sqlMapClient.update("putBook", book);
        assert affected == 1 || affected == 2;
        return 202 - affected;
    }

Someone at MySQL was clever enough to hijack the affected-rows value – it will be 1 if this was an insert (resulting in 201 Created) or 2 if it was an update (resulting in 200 OK). I don’t know about you, but to me that looks pretty sweet!

In this simple case, there is only one SQL query that happens to do this PUT, and so I don’t even have to bother with transactions.

Oh, just realized there’s one other trick in this example: the revision column is int(11) UNSIGNED NOT NULL DEFAULT 1 which is the basis of my ETag.
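
To make that concrete, here’s a minimal sketch of how a revision column like that can drive conditional requests. The class and method names here are mine, invented purely for illustration; only the quoted-revision ETag idea comes from the setup above.

```java
// Sketch only: EtagSupport and its method names are invented for this post.
public class EtagSupport {

    // The revision column is an unsigned int in MySQL; the ETag is simply
    // that number, quoted.
    public static String etagFor(long revision) {
        return "\"" + revision + "\"";
    }

    // True when the client's If-Match precondition still holds against the
    // current revision, so a conditional PUT may proceed; false would map
    // to 412 Precondition Failed.
    public static boolean ifMatchOk(String ifMatchHeader, long currentRevision) {
        if (ifMatchHeader == null || "*".equals(ifMatchHeader.trim())) {
            return true; // no precondition, or wildcard
        }
        return etagFor(currentRevision).equals(ifMatchHeader.trim());
    }

    public static void main(String[] args) {
        System.out.println(etagFor(7));            // prints "7" (with quotes)
        System.out.println(ifMatchOk("\"7\"", 7)); // true
        System.out.println(ifMatchOk("\"6\"", 7)); // false
    }
}
```

Since the revision is bumped by the ON DUPLICATE KEY UPDATE itself, the database hands you a fresh ETag on every successful PUT for free.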

Forward references for RESTful resource collections

I could use some help with a design problem for a RESTful API.

The APIs we’re trying to design are for a media production process with MXF and AAF as interchange formats. Data comes out of a database to go into complex long-running processes that slice and dice the data, eventually coming back to merge into the database. That database itself is replicated across half a dozen sites in an eventually consistent pattern, and connected up in various ways to other (enterprise) databases. Because the full complexity of these media formats gets in the way of designing the API basics, I’ve come up with a simpler example. The weirdness of the example comes from it being distilled out of the complex use cases, where it does make (some) sense.

Setting the scene

Imagine a library of digital books. For storage efficiency, among other reasons, the library has ripped all the books apart and stored the individual chapters. When you are searching through the library or fetching bits of content, you interact with a representation of the books and the chapters (like a virtual index card) that does not include their content.

So books consist of 0 or more chapters, and chapters are part of one or more books. Yes, a chapter really can be part of multiple books: The collected works of William Shakespeare is represented as all the chapters from all of his books stitched together.

Both books and chapters have 0 or more titles (usually one title per language, but there are various “also known as” edge cases).

Browsing through books

Imagine we represent a book as

<book xmlns="http://schemas.example.com/library/v1/" id="urn:uuid:084E014B-784D-41AE-9EF6-01CE202B5EDA" href="/library/book/084E014B-784D-41AE-9EF6-01CE202B5EDA">
  <title xml:lang="en-GB">The Merchant of Venice</title>
  <title xml:lang="nl">De Koopman van Venetië</title>
  <chapters>
    <chapter id="urn:uuid:B24B6A07-7E48-4C61-B10F-FE13CCE7B20E" href="/library/chapter/B24B6A07-7E48-4C61-B10F-FE13CCE7B20E">
      <title xml:lang="en-GB">FIRST ACT</title>
      <title xml:lang="nl">EERSTE BEDRIJF</title>
    </chapter>
  </chapters>
</book>


and a chapter as

<chapter xmlns="http://schemas.example.com/library/v1/" id="urn:uuid:B24B6A07-7E48-4C61-B10F-FE13CCE7B20E" href="/library/chapter/B24B6A07-7E48-4C61-B10F-FE13CCE7B20E">
  <title xml:lang="en-GB">FIRST ACT</title>
  <title xml:lang="nl">EERSTE BEDRIJF</title>
  <book id="urn:uuid:084E014B-784D-41AE-9EF6-01CE202B5EDA" href="/library/book/084E014B-784D-41AE-9EF6-01CE202B5EDA">
    <title xml:lang="en-GB">The Merchant of Venice</title>
    <title xml:lang="nl">De Koopman van Venetië</title>
  </book>
</chapter>

It’s hopefully obvious that you can do a GET /library/{book|chapter}/{uuid} to retrieve these representations.

Changing book metadata

It’s also not difficult to imagine that you can do a PUT to the same URL to update the resource. You just PUT the same kind of document back.

What is a bit difficult is what happens when you do that PUT. The logic that I want is that a PUT of a book can be used to change the titles for that book and change which chapters are part of that book. For a PUT of a chapter, that should be used to change the titles for the chapter, but not to add or remove the chapter from a book (the list of chapters is ordered and the chapter doesn’t know where it is in the ordering).

(Again, these rules seem pretty artificial in the example, but in MXF there’s a variety of complex constraints that dictate, in many cases, that a new UMID should be created if an object in the model changes in a way that matters.)
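
A minimal sketch of those update rules, using invented domain classes (none of this is the real service code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not the actual service: a book PUT may replace
// titles and the ordered chapter list; a chapter PUT may replace titles
// only -- its containing books are read-only forward references.
public class PutSemantics {

    public static class Book {
        public List<String> titles = new ArrayList<>();
        public List<String> chapterIds = new ArrayList<>(); // ordered
    }

    public static class Chapter {
        public List<String> titles = new ArrayList<>();
        public List<String> bookIds = new ArrayList<>(); // read-only refs
    }

    public static void putBook(Book stored, Book incoming) {
        stored.titles = new ArrayList<>(incoming.titles);
        stored.chapterIds = new ArrayList<>(incoming.chapterIds);
    }

    public static void putChapter(Chapter stored, Chapter incoming) {
        stored.titles = new ArrayList<>(incoming.titles);
        // incoming.bookIds is deliberately ignored: membership and
        // ordering are edited through the book resource, never here.
    }

    public static void main(String[] args) {
        Chapter stored = new Chapter();
        stored.titles.add("FIRST ACT");
        stored.bookIds.add("urn:uuid:084E014B-784D-41AE-9EF6-01CE202B5EDA");
        Chapter incoming = new Chapter();
        incoming.titles.add("EERSTE BEDRIJF");
        putChapter(stored, incoming);
        System.out.println(stored.titles);  // [EERSTE BEDRIJF]
        System.out.println(stored.bookIds); // unchanged
    }
}
```

The point of the sketch is the asymmetry: the bookIds on the incoming chapter are silently dropped, which is exactly the surprise discussed below.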

This sort-of breaks the PUT contract, because no matter how often you GET a book document, change the title of a chapter inside the book, and PUT that changed representation, your change will not be picked up. You have to follow the href, get the representation for the chapter, change the title there, and PUT it back.

This also breaks the common expectation people have with XML documents — if the data is there and you edit it and then you save it, normal things happen.

The problem with minimal representations

It’s easy to minimize the representations in use so this problem goes away. For example,

<chapter xmlns="http://schemas.example.com/library/v1/" id="urn:uuid:B24B6A07-7E48-4C61-B10F-FE13CCE7B20E">
<title xml:lang="en-GB">FIRST ACT</title>
<title xml:lang="nl">EERSTE BEDRIJF</title>
<book href="/library/book/084E014B-784D-41AE-9EF6-01CE202B5EDA" />
</chapter>


It’s clear what you’re dealing with. The PUT does what it is supposed to do, and to learn the book title you just do another GET.

The problem with this approach is that the number of HTTP requests grows much larger when you want to display something in the UI, because the visual representation of a chapter shows the book title. To build snappy UIs that use ajax to communicate with my service, the rich representation that includes the title information is much better.

Some options

So what should I do?

Use multiple representations

I could have /library/{book|chapter}/{uuid}/annotated as well as /library/{book|chapter}/{uuid}, with the latter serving the minimal representation and supporting PUT, or if I had smart ajax clients (I don’t) I could use some kind of content negotiation to get to the rich annotated version.

This is rather a lot of work, and when documents leave the web for some kind of offline processing (the AAF files go into a craft edit suite and come back very different many weeks later, but they will still reference some of my original data) I run the risk that the “wrong” document makes it into that edit suite.

Document the situation

I could stick with my original richly annotated XML and simply document which fields are and aren’t processed when you do a PUT. I’d probably change the PUT to a POST to make it a bit clearer.

Document and enforce the situation

I could strongly validate all documents that are PUT to me, making sure they do not contain any elements (in my namespace) that I do not intend to save, and reject documents that do.
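
As a sketch of what that enforcement could look like, assuming the namespace from the examples above (the PutValidator class and its method are invented names): a chapter PUT is rejected whenever it carries a book element, since that is read-only forward-reference material.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Sketch: reject chapter PUTs that carry elements we will not save.
public class PutValidator {

    static final String NS = "http://schemas.example.com/library/v1/";

    // Returns true when the chapter document contains no <book> elements
    // in our namespace; false would map to 400 Bad Request. Unparseable
    // XML is rejected as well.
    public static boolean acceptableChapterPut(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(NS, "book").getLength() == 0;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String ok = "<chapter xmlns=\"" + NS + "\">"
                + "<title xml:lang=\"en-GB\">FIRST ACT</title></chapter>";
        String notOk = "<chapter xmlns=\"" + NS + "\">"
                + "<book href=\"/library/book/1\"/></chapter>";
        System.out.println(acceptableChapterPut(ok));    // true
        System.out.println(acceptableChapterPut(notOk)); // false
    }
}
```

A real implementation would validate against the schema rather than look for one element name, but the shape of the check is the same.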

Document the situation inside the XML

I could do something like

<chapter xmlns="http://schemas.example.com/library/v1/" id="urn:uuid:B24B6A07-7E48-4C61-B10F-FE13CCE7B20E" href="/library/chapter/B24B6A07-7E48-4C61-B10F-FE13CCE7B20E">
  <title xml:lang="en-GB">FIRST ACT</title>
  <title xml:lang="nl">EERSTE BEDRIJF</title>
  <referencedBy>
    <!-- please note that referencedBy contents
         cannot be changed through PUT -->
    <book id="urn:uuid:084E014B-784D-41AE-9EF6-01CE202B5EDA" href="/library/book/084E014B-784D-41AE-9EF6-01CE202B5EDA">
      <title xml:lang="en-GB">The Merchant of Venice</title>
      <title xml:lang="nl">De Koopman van Venetië</title>
    </book>
  </referencedBy>
</chapter>


This way it’s hopefully quite obvious to the API consumer what is going to happen when they PUT a document back. It is still rather unclean REST (so should I use POST?), but it avoids me having to design separate representations for browse vs edit.

One disadvantage is that I have to keep more resource state around when parsing or generating the content. That’s not an issue when things are built up in memory, but for large documents and/or for pipeline processing I’ve made life a lot harder. There are other ways to alleviate this (like adding an isReference attribute, or inlining referencedBy sections throughout the document rather than putting them all at the bottom), but they’re even less pleasing esthetically.

Something else?

Which approach do you think is best? Is there a better one? What would you do?

Right now, since I’m just doing some quick prototyping, I’ve gone for the “document the situation” approach, but I think that eventually I’d either like to somehow highlight the “this is a forward reference for your convenience but don’t edit it” bits of the XML, or go for the multiple representations approach.