I originally posted this to AltDevBlogADay on Friday 15th July 2011.
Having a Continuous Integration server running is one of the most useful and powerful tools a development team can use. Constantly checking the state of the code, building assets which might otherwise take hours and generating stats on build quality are all really useful things to have running in the background hour after hour and day after day.
But if it’s not done with care, a CI process will stop being an important part of a development team’s tool set, even though it may still provide some useful information.
The main problems usually stem from a single CI step taking too long. For example, it might take hours to build all the game assets, or it might take 40 minutes to build a single configuration. You might have additional build steps (like copying files to a network drive) which can take quite a while if you’re dealing with gigabytes of data.
As soon as a CI step takes too long, you lose the main benefit – the fast turnaround of information.
For example, when we first start a project, our simple CI process will consist of the following steps:
- Detect modification
- Build code
- Build game assets
- Copy to network drive
- E-mail developers (who’s mailed depends on success or failure)
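The steps above can be sketched as a single linear pipeline. This is a minimal illustration, not the actual implementation – every step function here is a hypothetical stand-in for real build tooling:

```python
# A minimal sketch of the early single-pipeline CI loop. Each step
# callable is a hypothetical stand-in for the real build tooling.

def run_pipeline(steps):
    """Run each CI step in order, stopping at the first failure.

    `steps` is a list of (name, callable) pairs; each callable returns
    True on success. Returns (succeeded, names_of_completed_steps).
    """
    completed = []
    for name, step in steps:
        if not step():
            return False, completed  # e-mail the developers who checked in
        completed.append(name)
    return True, completed  # e-mail everyone that a new build is ready

steps = [
    ("build code", lambda: True),
    ("build assets", lambda: True),
    ("copy to network drive", lambda: True),
]
ok, done = run_pipeline(steps)
```

The fail-fast behaviour matters: if the code doesn’t compile, there is no point spending hours building assets or copying gigabytes to a network drive.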
This is fine while a new project is tiny and the whole process doesn’t even take 5 minutes. And we need to build the whole thing constantly, as we’re adding so much content that the artists and designers need to be on the bleeding edge of what the programmers are creating.
But after a month (or probably less) this stops being suitable. A whole build might start taking 20 minutes, then 30, then an hour, then two, and if we leave it as is, the programmers don’t get the benefit of continuous turnaround and the designers spend ages waiting for a new build.
So what can we do about it?
The first thing is to look at what the CI process is doing, and exactly what we want to get out of it.
- Continuous Build – We need the process to constantly compile the source, all configs, all platforms. This is so we can detect any compile errors quickly without having to build everything manually.
- ‘Designer’ Builds – Creating an executable the designers, artists, animators etc. can get with the latest code changes. Ideally one they can request as required and one that is built as quickly as possible.
- Full builds – A complete build of the game including executable and all in-game assets along with anything else needed to run the game.
- QA Builds – QA could use the full build if needed, but this is an additional step which packages the build as it would be submitted allowing a better QA pass (DVD emulation, submission content etc.).
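To summarise the four build types – the trigger and scope entries below are my reading of how they fit together, not a prescribed setup:

```python
# Illustrative summary of the four build types and how they relate.
BUILD_TYPES = {
    "continuous": {"trigger": "every source check-in",
                   "scope": "code only, all configs and platforms"},
    "designer":   {"trigger": "check-in or on request",
                   "scope": "single 'release' executable"},
    "full":       {"trigger": "daily or on request",
                   "scope": "executable plus all game assets"},
    "qa":         {"trigger": "after a successful full build",
                   "scope": "existing full build, packaged as for submission"},
}
```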
From my point of view, those are the four main things I want to get out of a CI process that a single build step won’t give us. You might have other requirements and I’d certainly be interested in hearing what those are.
So what can we do to try and improve the initial process and still get what we want out of our CI machine?
The first step is easy. We want a Continuous Build process with nothing to integrate, nothing to deploy and nothing to copy. This can be much quicker if we alter our repository modification checks to only monitor source code folders and not the entire repository.
For example, if our repository contains scripts, configuration files or (shudder) game assets and executables, we shouldn’t kick off a build when these change, as the CB results won’t be any different from last time.
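A minimal sketch of that filter, assuming the repository poll gives us the list of changed file paths (the monitored folder names are illustrative – use your own repository layout):

```python
# Only trigger the Continuous Build when a commit touches source code.
# These folder names are illustrative, not a prescribed layout.
SOURCE_ROOTS = ("src/", "code/", "include/")

def should_trigger_build(changed_paths):
    """True if any changed file lives under a monitored source folder."""
    return any(path.startswith(SOURCE_ROOTS) for path in changed_paths)
```

A check-in touching only `assets/` or build scripts would then leave the Continuous Build alone, since its result couldn’t differ from the last run.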
We might also want to reduce the configurations we’re interested in building (usually a debug and a master build; the profiling builds might be skipped for speed reasons and because they are rarely used). If we have a decent machine we might get the individual platforms (X360, PS3) to build in parallel, as there will be no conflict between the temp files they generate – or even stick them on separate machines if we have the capacity.
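That parallel step might look like the sketch below, assuming each platform writes to its own temp and output folders so the builds can’t conflict. The `build` function is a stand-in for invoking the real toolchain:

```python
from concurrent.futures import ThreadPoolExecutor

def build(platform, config):
    # Stand-in for invoking the real compiler for one platform/config pair.
    return (platform, config, True)  # True = compiled cleanly

def continuous_build(platforms=("X360", "PS3"), configs=("debug", "master")):
    """Build every platform/config pair in parallel; return the failures."""
    jobs = [(p, c) for p in platforms for c in configs]
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        results = list(pool.map(lambda job: build(*job), jobs))
    # Notify only when this list is non-empty -- the Continuous Build
    # is a sanity check, so silence means 'all clear'.
    return [(p, c) for p, c, ok in results if not ok]

failures = continuous_build()
```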
The process only ever needs to notify on failure as no one is going to be using this build, it’s a sanity check pure and simple.
So already we have a much faster turnaround time between check-in and ‘all clear’. In the past I’ve managed to reduce this from 60 minutes to 5.
Initially it might be tempting to reuse the results of the ‘Continuous Build’, as the aim of this step is to provide the designers and artists with a new executable taking advantage of any (very) recently added features.
This might be the right idea at the start, when the CB process is taking less than 5 minutes, but that doesn’t last, so we need a faster, more iterative, process to make sure our non-programmers are not hanging around waiting for the latest builds.
Most of the time, designers will use a ‘release’ build (‘release’ being a bit of a misnomer – it’s not releasable in any way, but it has just enough debug information to be useful while still running at a ‘releasable’ frame rate). So we only need to concentrate on a single configuration, which means we can drastically cut down the time between a modification being detected and a new usable build being generated.
As this is the fastest CI step we have and can often have the most people dependent on it, it’s the first one to run when a modification is detected.
In our case we don’t e-mail people when a new designer build is available. It’s built many times an hour, and people would just end up sending the mails directly to the trash (I know I’ve done that on particularly spammy CI setups). Simply allowing them to check the build status and update when it’s green works well enough.
Developers generate a lot of content for games. Even small games can balloon in size depending on the scope and quality of the final product. As a result, we need a full integration build for a number of reasons:
- It would take every member of the team far too long to rebuild all the assets themselves
- When a build gets out of sync with the assets, developers need a quick way to get everything back on track
- When testing the build it needs to have been built on an independent build machine
A full build could be brute force (just build everything every time) or smart (build executables and assets concurrently on multiple machines). Which to choose really depends on how long a full build takes. Less than an hour and I personally stick with a brute force approach, but any longer and a more intelligent build step is needed.
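The ‘smart’ variant might be sketched like this, with the executable and asset builds running concurrently before a final packaging step. The three callables are hypothetical stand-ins – in practice the first two would be jobs dispatched to separate build machines:

```python
from concurrent.futures import ThreadPoolExecutor

def smart_full_build(build_executables, build_assets, package):
    """Run executable and asset builds concurrently, then package.

    Each argument is a callable returning True on success; in a real
    setup the first two would run on separate build machines.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        exe = pool.submit(build_executables)
        assets = pool.submit(build_assets)
        if not (exe.result() and assets.result()):
            return False  # don't package a broken build
    return package()
```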
Full builds always e-mail the entire team, since they happen rarely. This allows people to get latest as builds become available (usually at the start of the day) without having to check the status of the build.
The QA build is a special build. It doesn’t rebuild any assets or executables and is automatically kicked off when the daily build has finished successfully. This step packages the build up as it would be presented as part of a final submission along with any submission assets that would be required.
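That chaining can be sketched as follows; `package_for_submission` is a hypothetical stand-in for the real packaging tooling (DVD layout, submission assets and so on):

```python
def qa_build(full_build_succeeded, build_dir, package_for_submission):
    """Repackage an existing full build for QA; rebuild nothing.

    Runs only when the daily full build succeeded, and returns the
    packaged build (or None if there is nothing worth packaging).
    """
    if not full_build_succeeded:
        return None  # a failed full build gives QA nothing to test
    return package_for_submission(build_dir)
```

The key property is that the QA step never compiles or rebuilds anything itself – it only repackages the artifacts the full build already produced, so QA tests exactly what was built.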
But why not just use the full build as the QA build to save time?
Simply put when testing a build it’s vital that we test under the same situation that our final submission will be tested under. Making sure we run under DVD emulation and use the same assets the manufacturer will use is an important part of the process. Having our CI machine generate these builds for us makes sure we’re doing this from the very start.
In every case we give all members of the development team the ability to request any build. If the ‘Designer Build’ is too far out of sync, they might need a full build to get them back on their feet. An asset change might alter the executable but not trigger a ‘Designer Build’, so a designer might need to trigger one manually.
In our case using CCTray allows us to do this very easily, so a new build can be requested by anyone (including QA) at any point of the day without any input from a programmer, allowing them to concentrate on making the game rather than just enabling others.
Technically anyone in the company could request and get a new build (very useful for getting demo builds together without getting the team involved) but I’ve not seen that happen yet.
What I Haven’t Covered
One major thing I’ve not covered here: self-testing builds. These can range from simple unit tests running after every build to a scripted run-through of the game after every full integration. The scope of this will very much depend on the size of the game and the time you have available to add them. Since this is a big topic in its own right, I’ve left it for another time.
So by simply reviewing what we actually want to get out of the Continuous Integration process, we’re able to streamline it to be much faster and much more useful. Complete integrations still happen (and happen when needed), but the common information needed by the team is generated quickly, giving teams short turnaround from the CI machine throughout the day.
I’d be very interested to know what other uses people are getting from their CI processes and how they are still making sure the speed and quick turnaround is happening all the time.
Title image by highgroove. Used with permission.