The Evils of Unity Builds

Unity builds. I don’t like them.

Of all the tools at your disposal to make a build faster, this is the worst. And it’s not just the “hey let’s #include .cpp files” weirdness, but the way that it can make a well structured, modular code base become worse than spaghetti junction at rush hour. The worst thing is that it’s not the fault of the programmer, especially when something as exceptional as Visual Assist can start helping you create non-modular code because of the way Unity Builds collect the code you write.

What Is a Unity Build?
Unity Builds have nothing to do with the excellent Unity Tool Set. I’ll just clear that up right off the bat.

These Unity Builds are a way of reducing build overhead (specifically opening and closing files, and reducing link times by reducing the number of object files generated) and as such are used to drastically speed up build times. I say drastically because they are usually introduced after a code base has started generating build times that seriously cut into a programmer’s productivity. This might be 10 minutes, 30 minutes or even hours.

Because Unity Builds give you a quick gain, it’s seen as a pretty magical solution, and when a full build is reduced from 30 to 3 minutes you can see why. But the caveat in that statement? Full builds.

I won’t go through how to generate Unity Builds here as the blog post The Magic of Unity Builds does a great job, so have a look there. It’s interesting that I’m linking to a blog post that argues the total opposite of what I’m arguing here, but unless you know both sides of a tale, you don’t know the tale.

So without any more delay, why don’t I like them?

Say Goodbye to Local Properties
Well written code is modular, and modular code relies on a few assumptions. One of those being that a single translation unit (a cpp file and any associated include files) is isolated from other translation units, so (unless you extern the contents) anything defined within a cpp file is not available outside, and as a consequence you can have local objects being called the same things without conflict.

Take for example the following simple code (easily expanded to a real world example I’m sure)

// In VectorTest.cpp
namespace
{
   const unsigned int StandardTestCount = 10;
}

// In ListTest.cpp
namespace
{
   const unsigned int StandardTestCount = 3;
}

In a normal compilation environment these variables are local to the file they are working in, and this is a good thing. Why should the vector tests file care what is defined within another test file?

But if you have a Unity Build with the following then you’re going to have issues…

#include "VectorTest.cpp"
#include "ListTest.cpp"
#include "ArrayTest.cpp"
#include "QueueTest.cpp"

Specifically, errors relating to the variable ::StandardTestCount already being defined. So we now have to be aware of what is defined throughout the entire project and resolve any issues, and inevitably this will end up with all variables being prefixed with (in this example) VectorTest_ and ListTest_ etc.
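
To make the clash concrete, here is a minimal sketch of the single translation unit the unity file above produces. Once the files are pasted together, both anonymous namespaces are the same unnamed namespace, so the second definition is a redefinition. A slightly tidier variation on the prefixing workaround is one named namespace per file; the namespace names and the checkTestCounts function below are my own illustrative choices, not something the compiler or any tool requires.

```cpp
#include <cassert>

// Simulates what the compiler sees after the unity file has pulled in
// both .cpp files. With two anonymous namespaces this would not compile,
// because StandardTestCount would be defined twice in the same
// unnamed namespace.

// ---- contents of VectorTest.cpp (namespace name is hypothetical) ----
namespace VectorTest
{
    const unsigned int StandardTestCount = 10;
}

// ---- contents of ListTest.cpp (namespace name is hypothetical) ----
namespace ListTest
{
    const unsigned int StandardTestCount = 3;
}

// Usage check: each file's constant keeps its own value, but every use
// now has to spell out which file's constant it means.
void checkTestCounts()
{
    assert(VectorTest::StandardTestCount == 10);
    assert(ListTest::StandardTestCount == 3);
}
```

The code compiles again, but the cost is the one described above: the names are no longer private to their files, and every file has to know not to reuse a namespace name.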

I don’t want to refer to the ‘Magic’ post too much, as it’s a good article, but there is a statement in there referring directly to this. Specifically the following:

“Technically you shouldn’t have this problem if you’re writing “proper” code”

This is wrong. “Proper” code is well structured and well maintained, meaning it’s modular and only refers to what it needs to refer to. In a lot of cases you will have variables with the same name, functions with the same name and other elements that you naturally write as you go along. That’s why the anonymous namespace (or the C style ‘static’) is so useful, and so useless if you switch to Unity Builds.

Using Is (even more) Lethal
Never use the ‘using’ keyword in a header file. It’s a standard statement and one that stands up easily. The reasons behind this are pretty clear (declaring ‘using’ in a header file forces all code including the header file to use that statement – in effect wiping out a namespace).

But using it in a .cpp file isn’t a crime. In fact it’s remarkably useful, even within the whole file’s scope. As a developer, I should know what I’m using in a local file, what elements I’m bringing into the file and what would/could conflict. Some people might not agree that ‘using’ at the top of a cpp file is a good idea at all, and I see their point, but at whole file scope it can be massively useful, and as it’s localised it’s not going to pollute anything other than the file it’s in.

But in Unity Builds, using (that is not scoped within a function or type etc.) is lethal. Declaring ‘using’ in a cpp file makes it available throughout the code base (or more confusingly only in those files that are included after the one declaring it). Suddenly, using is everywhere. And your namespaces are now useless.
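
A small sketch of that leak, again simulating the merged translation unit in a single file. The audio namespace and the file names Mixer.cpp and Report.cpp are made up for illustration.

```cpp
#include <cassert>
#include <string>

namespace audio
{
    std::string describe() { return "audio"; }
}

// ---- contents of a hypothetical Mixer.cpp ----
// Perfectly reasonable when Mixer.cpp is its own translation unit.
using namespace audio;

std::string mixerName() { return describe(); }

// ---- contents of a hypothetical Report.cpp, included after Mixer.cpp ----
// Report.cpp never wrote a using directive, yet this unqualified call
// compiles and resolves to audio::describe, because the directive from
// Mixer.cpp is still in effect for the rest of the unity file.
std::string reportName() { return describe(); }

// Usage check: Report.cpp silently picked up the audio namespace.
void checkUsingLeak()
{
    assert(reportName() == "audio");
}
```

Swap the order of the two includes in the unity file and Report.cpp stops compiling, which is exactly the "only in files included afterwards" confusion.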

Every Change is a Rebuild
This is one of my biggest problems with Unity Builds. In well structured projects (more on that below) changing the content of a cpp file (or even a local header file) shouldn’t require everything to be rebuilt. Every. Single. Time. Yes, the unity build rebuilds are faster than doing a full build without a unity file, but doing even a fast rebuild every time will build up. Quickly.

Whenever I change a .cpp file, all I need to build is that .cpp file. Granted, link times are a tad long because it still has to re-link the other object files, but it takes a second to compile that one file. When I changed a header file (I took count today) it recompiled on average 5 .cpp files and took, again on average, about 5 seconds to build.

Very rarely should I be required to re-build the entire project, and most of the time I don’t. And that saves me a lot of time. Every single day.

Multiple Configurations
This is mainly a bugbear of mine rather than a direct issue with Unity Builds, but I see it in nearly every Unity Build project I use. The main project is the Unity Build project, but there is another project that is built and maintained that doesn’t use Unity Builds. Now there is a point here – by having an additional ‘normal’ project you are forcing the modularity that can collapse with Unity Builds to be checked (usually a Continuous Build server will be building this as well as the Unity Build every time).

But we have problems with this.

Firstly, the non-unity build is only ever being built on the CB server. So any problems are going to break the build, and it will break if people are not using it day-to-day. Secondly you now have multiple projects to maintain. Not too much of a problem if you have an automated project generation step (see below) but it is still another project to maintain.

People may occasionally have to use the non-unity configurations, especially if they are getting an unknown error on the CB server. So now they are left working on a configuration that is uncared for and probably builds so slowly and erratically that they are probably losing all the time they saved from those quick unity builds they have been doing all day.

But What About My Build Times Then?
Well structured software builds quickly (or as quickly as it can anyway). But what is a well structured project when it comes to build times?

  • Sensible file inclusion – Only include the files you need and try as best you can to limit them to .cpp files. This means the compiler only needs to bring in what’s necessary and when you change one of these header files only those things that directly rely on it will change. If you find yourself having to include files that constantly change into a large number of cpp files, then I’d wager that a refactoring of the large header file would be in order. You should be building only a small number of cpp files when you change the content of a header file.
  • Forward Declarations – Where possible use forward declarations rather than including the header file in your class or function declaration. Annoyingly you cannot forward declare enums (on standard compliant compilers anyway, at least before C++11 added opaque declarations for enums with a fixed underlying type) which sometimes throws this out of the window. But by not including the files until use, you’re cutting down on the amount of code being included and the number of files being opened.
  • Pre-compiled Headers – Use Pre-compiled Headers (PCH). Using PCH’s is the one (built into the IDE) feature that will cause your build times to plummet, especially if you are being sensible with them (such as including files that don’t change – SDK’s and common, fixed headers for example). Using pre-compiled headers across multiple platforms can be a pain, but it is possible to unify them and get a massive boost. I’ll cover these a little bit more below.
  • Library Files – Modular code usually leads towards easily extracting common code into a collection of libs. Reducing the amount of code in a project (and as a result how much you need to build) can speed up your productivity quickly.
  • Parallel Building – If your IDE supports it (and it might do and you just don’t know) and you have a multi-core machine, turn on parallel building of files. Building individual files at the same time is obviously going to cut down on your compile times no matter how quick they are at the moment.
  • Get a Better PC – It goes without saying (but I will anyway) doing this will speed everything up.
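
As a rough sketch of the forward declaration point in the list above, the "header" part below never needs the full definition of Texture, only its name. Renderer, Texture and the file names are hypothetical, and both files are shown in one block so it can be compiled as is.

```cpp
#include <cassert>

// ---- contents of a hypothetical Renderer.h ----
class Texture;   // forward declaration: Texture.h is NOT included here

class Renderer
{
public:
    explicit Renderer(const Texture* texture) : m_texture(texture) {}
    int textureWidth() const;   // defined where Texture is a complete type
private:
    const Texture* m_texture;   // a pointer only needs the declaration
};

// ---- contents of a hypothetical Renderer.cpp ----
// #include "Texture.h" would go here; its contents are inlined instead.
class Texture
{
public:
    explicit Texture(int width) : m_width(width) {}
    int width() const { return m_width; }
private:
    int m_width;
};

int Renderer::textureWidth() const { return m_texture->width(); }

// Usage check.
void checkRenderer()
{
    Texture texture(64);
    Renderer renderer(&texture);
    assert(renderer.textureWidth() == 64);
}
```

A change to Texture.h now rebuilds Renderer.cpp, but not every file that merely includes Renderer.h.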

Pre-Compiled Headers
Pre-compiled headers are one of the best ways to get a massive boost to your compile times. So how fast are my build times compared to that of a Unity Build version when using PCH’s and taking into account the other build improvements suggestions?

As stated above the ‘average’ build time of a minimal build throughout the day was around 5 seconds. On a Unity Build it was a full build every time and was around 2 minutes on the slowest platform.

Changes to ‘core’ files, which are included by a higher than average number of files, resulted in builds of around 30 seconds on around 20 files. Again on a Unity Build this would have been around 2 minutes regardless.

Full rebuild (which I did twice today) was around 3 minutes. Granted a whole minute slower than a Unity Build but I did it twice rather than every single time.
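
To put those numbers side by side, here is some back-of-the-envelope arithmetic as code. The build times come from the figures above; the number of small builds in a day (40) is purely my assumption for the sake of the sum, and the 30 second ‘core’ builds are left out to keep it simple.

```cpp
#include <cassert>

// Figures from the post, in seconds.
const int IncrementalBuildSeconds = 5;    // average minimal build
const int FullRebuildSeconds      = 180;  // full non-unity rebuild
const int UnityBuildSeconds       = 120;  // every unity build is a full build

// Time spent waiting in a day without unity builds.
int incrementalDaySeconds(int smallBuilds, int fullRebuilds)
{
    return smallBuilds * IncrementalBuildSeconds
         + fullRebuilds * FullRebuildSeconds;
}

// Time spent waiting with a unity build: every build costs the same.
int unityDaySeconds(int totalBuilds)
{
    return totalBuilds * UnityBuildSeconds;
}

// Usage check with 40 small builds (ASSUMED) and 2 full rebuilds (from the post).
void checkDayTotals()
{
    assert(incrementalDaySeconds(40, 2) == 560);   // a little over 9 minutes
    assert(unityDaySeconds(40 + 2) == 5040);       // 84 minutes
}
```

Change the assumed build count to suit your own day; the gap only closes when nearly every build is a full rebuild.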

Pre-compiled headers are not a silver bullet solution. Nothing is. And because of this there are issues that you do need to be aware of:

  • Compiler Support – Some compilers out there simply do not support PCH’s. On a daily basis I use 4 or 5 different compilers and while I’ve never encountered one that doesn’t, they are out there. This can shoot my argument in the foot, but it is a rare problem and one most people won’t encounter.
  • PCH Quirks – While I use a number of compilers that do support PCH’s, every single one of them has a slightly different way of building them and slightly different requirements before they can be used. This doesn’t affect the code that is written but does affect the content of your project, especially if you want to make them seem as consistent as possible.
  • Over-Inclusion – Because your PCH usually includes common files and files that rarely change, it does mean that some files are being brought into a compilation unit that wouldn’t otherwise require them.
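
For illustration only, this is roughly the shape a PCH header and one of its users might take. Pch.h and Scores.cpp are hypothetical names, and the block shows only the layout: actually precompiling the header is a compiler-specific project setting (for example the /Yc and /Yu options in Visual C++, or compiling the header itself with GCC) that isn’t shown here.

```cpp
#include <cassert>

// ---- contents of a hypothetical Pch.h ----
// Stable headers only: standard library, SDKs, fixed common headers.
#include <map>
#include <string>
#include <vector>

// ---- contents of a hypothetical Scores.cpp ----
// With PCH enabled, #include "Pch.h" would be the first line of this file.
// Scores.cpp only needs <vector>, yet <map> and <string> are visible too:
// that is the over-inclusion cost from the list above.
int totalScore(const std::vector<int>& scores)
{
    int total = 0;
    for (int score : scores)
    {
        total += score;
    }
    return total;
}

// Usage check.
void checkTotalScore()
{
    assert(totalScore({1, 2, 3}) == 6);
}
```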

Unity builds are a solution for the wrong problem. Long compile times are not caused by not using Unity Builds, they are the result of having badly structured and badly referenced code. Fix that (and have better code to boot) and you’ll be able to use ‘proper’ build tools without resorting to a quick fix.

Making Unity Builds Better
I don’t want to just leave it at that, because no matter how much I argue, Unity Builds will always exist and people will always use them (after all it’s quicker to rescue a broken down build by making it a Unity Build than doing it the hard way). So what can people do to actually make Unity Builds a bit more desirable and less of a problematic fix?

  • Automate Adding Files – A lot of teams have auto project generation code already (XML files, template projects etc.) so it’s important that you automate adding files to the project, otherwise people will forget to remove the file from the build step and they will forget to add it to the unity build file.
  • Multiple Unity Files – Running a Unity Build doesn’t require you to have one unity file with every cpp file inside it. You can have multiple build files (usually per module or per component) which means at least some of the translation unit leaking is limited to each module rather than the whole program.
  • Additional Project – No, this isn’t a contradiction of the “Multiple Configurations” comment above. In this situation you will have a project that contains the cpp and header files so you can reference them but this project isn’t built. Instead, you have the ‘proper’ project simply contain the unity file(s). This isn’t something I’ve personally tried, but it does get around the issues of adding files if you don’t have an automated step.
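
The first two points above can be combined in a small generation step. This is only a sketch of the idea, assuming you already have the list of source files for each module from somewhere (a project file, a directory scan); makeUnityFile and the file names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the text of one unity file from a module's source file names.
// Call it once per module to get one unity file per module rather than
// a single file for the whole program.
std::string makeUnityFile(const std::vector<std::string>& sources)
{
    std::string text;
    for (const std::string& source : sources)
    {
        text += "#include \"" + source + "\"\n";
    }
    return text;
}

// Usage check: a two-file graphics module.
void checkUnityFile()
{
    const std::vector<std::string> graphics = { "Renderer.cpp", "Texture.cpp" };
    assert(makeUnityFile(graphics) ==
           "#include \"Renderer.cpp\"\n#include \"Texture.cpp\"\n");
}
```

Writing the returned text to disk as a pre-build step means nobody has to remember to touch the unity file by hand.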

8 comments

  1. It’s a real shame that compilers suffer from performance degradation with lots of small files (no matter how tidy you try to keep your includes) – it almost feels like they punish you for writing good code.

  2. @Tom: Compilers suffer from performance degradation due to the file I/O overhead. It’s not the fault of the compiler. It’s certainly not the thing punishing you. Don’t assume that lots of files is the same as “good code” either. I’ve seen more than enough examples of projects that have a lot of files but are extremely poorly written.

    @Lee: This is a fantastic article. Thanks for showing the other side of the argument. It’s good to see someone putting an opposing view up on the web without being an arse about it 🙂 Great stuff mate. There should be more posts like this. I’d like to make a few points while I’m here.

    What you refer to as “local properties” are actually just file-static variables in an anonymous namespace. You are right, you do essentially lose the “protection” you get when using this mechanism. However, is this a bad thing? I’m not sure it is. Your example shows two uses of the same variable name. If they are specific to context, then they should live in that context. In a C++ application, where your CPP files contain class definitions, any constant that is specific to that implementation should be part of the class. I would argue that “StandardTestCount”, if it is indeed a constant that changes depending on the class it’s related to, belongs as a private const in that class. Again, to use your example:

    // VectorTest.h
    class VectorTest
    {
    private:
        static const int StandardTestCount = 10;
    };

    // ListTest.h
    class ListTest
    {
    private:
        static const int StandardTestCount = 3;
    };

    This completely removes the need for the anonymous namespace. If the value isn’t specific to a single class, then again there’s nothing wrong with them having their own instances, or sharing them via inheritance. This to me is not necessarily a great argument against Unity Builds. If anything, I would argue that your example is untidy and is exactly the kind of thing that Unity Builds force you to rethink and clean up. Which is a good thing in my view 🙂 So I stand by my “proper code” comment 😉

    The use of C-style statics in C++ is taboo, and I wouldn’t ever consider it as part of the argument. Anonymous namespaces were designed to replace that mechanism, and they in turn are (just like the C-style statics) just a shorthand/lazy way of defining file-level globals that really should be class-scoped.

    You say “never use the using keyword in a header file”. I agree, but I’d extend that further to say: Never use the using keyword in the global namespace. If you’re keen to use the using statement to reduce typing, specify your usings inside your functions and classes. Polluting the global namespace with usings is a bad idea. If you keep your usings as part of class or function scope then, again, you don’t have any issues when you come to use Unity Builds. Again, this sits in the “proper coding” bucket in my view 🙂

    “Every change is a rebuild” – Agreed. So long as you have only one unity file for your whole application. What tends to be the best option, especially for large code bases, is to break up your Unity file into different sub-files. If you’re working in games, have a GraphicsUnity.cpp which has all the CPP files for the graphics subsystem. PhysicsUnity.cpp, GameplayUnity.cpp, etc. While this doesn’t remove the issue completely, it prevents a rebuild of subsystems which aren’t affected by your changes. You’ll still get a speed up in build time and have the benefits of some incremental builds too. Of course, if you change something in the subsystem, the whole subsystem will be rebuilt. That’s definitely a given. The good thing is that those rebuilds tend to be much faster. Your point is a good one though.

    “Multiple Configurations” – a bugbear for me too 🙂 Even with different projects/solutions it’s painful. This is one of the management issues that I do hate with Unity Builds, but it’s not enough to deter me from using them as I feel the benefits by far outweigh this issue.

    Your list of other options for improving build times is a good one. The only one I don’t agree with is the last one, because a new machine will speed up Unity Builds too 🙂 Having said that, even with those improvements, it’s hard to beat a Unity Build in a full rebuild scenario.

    Anyway, thanks for writing your post. It was a great read. I always enjoy well-rounded responses. All the best mate, and thanks for the discussion!

    OJ

  3. Thanks for the excellent comment OJ, it’s always good to get a detailed reply to a blog post. I’d like to comment on some of the points you have raised.

    “Don’t assume that lots of files is the same as “good code” either. I’ve seen more than enough examples of projects that have a lot of files but are extremely poorly written.”

    True, badly written code can contain (too) many files, but well written code will always be well distributed across multiple files.

    “If they are specific to context, then they should live in that context. In a C++ application, where your CPP files contain class definitions, any constant that is specific to that implementation should be part of the class. I would argue that “StandardTestCount”, if it is indeed a constant that changes depending on the class it’s related to, belongs as a private const in that class.”

    Your example is spot on, and I use this a lot in any code base I’m working on, but it depends on two things.

    One is that everything in the project is related to, or belongs to, a class. I’m a strong believer that just because I’m using C++ it doesn’t mean that everything needs to be, or should be, related to a class. The actual code the example was taken from was a large number of stand alone test functions (similar in nature to unit test functions) where placing the variable in a class wasn’t possible. This could also apply quite easily to non-friend, non-member functions.

    The second is that the scope of the variable isn’t suitable for the (private or not) class interface. By placing it in the class scope you are (ever so slightly) increasing the scope of the variable, where-as keeping it in a file scope is a tad reduced. The line between when I would place something in a class scope or file scope is sometimes grey (usually based on what feels right at the time or how it’s used) but to me file scoping is just as valid as class scoping (where available) so I wouldn’t necessarily see it as untidy, just a bit more flexible.

    Placing it inside the class definition also increases the visibility of the change which will increase compile times for a small change in an incremental build system (but that is along the same arguments as placing it in the class having no effect on Unity Builds so isn’t the best argument either).

    “If you’re working in games, have a GraphicsUnity.cpp which has all the CPP files for the graphics subsystem. PhysicsUnity.cpp, GameplayUnity.cpp, etc. While this doesn’t remove the issue completely, it prevents a rebuild of subsystems which aren’t affected by your changes.”

    This is one thing I would always insist on if a project I was working on was using Unity Builds and it wasn’t an option to change it. But I think it’s an option that not everyone sees (as with a lot of things people usually go all out – and have one file for everything – or don’t do it at all) but it’s one which can make the process much clearer and much easier to manage.

    “… it’s hard to beat a Unity Build in a full rebuild scenario.”

    I don’t think you will ever get anything faster than a unity build when doing a full rebuild, which is why they are so popular, especially in cases where a rebuild happens way more than it should. But I’d argue that you shouldn’t be doing rebuilds that often (I’m doing one about once a day, though I always do a full rebuild when creating submission builds). Shorter incremental builds are my preferred method for reducing the time spent waiting for a build.

    Again, thanks for the well thought out response 🙂

    Lee

  4. Hello,

    Just a comment on the multiple configurations point.

    At work, someone wrote a tool which adds the files to the unity build automatically based on the non-unity project during a pre-build step. So everyone works on the main project and they’d switch to the unity build when needed, cutting out a lot of the maintenance woes you mention in the post.

    Si

  5. I share your dislike of unity builds, but I really think it speaks to flaws in C++ compiler and linker architectures that they are prevalent. When presented with a large, legacy code base that was not written in a very modular fashion, and can not be significantly altered due to not controlling the bulk of the code, it can be a necessary evil. An automated tool to generate the unity files is a must.

    Still, if simple rearrangement of the files can cause such drastic increases (order of magnitude) in compiler and even linker performance, then that makes me think that such gains could be made within the compiler tool chain itself when dealing with large projects. So it’s a shame that this can be a win in certain code bases at all.

  6. The comments regarding automatic adding of files to a unity build project, or the automatic generation of the unity project itself, are spot on. Anything that takes a few monotonous steps every single time (adding files to multiple projects/configurations and/or configuring multiple projects) really needs to be automated, otherwise problems will always start to creep in.

    @vince There is probably something to be said for that, though I don’t know enough about C++ compilers to know what those improvements could be. But since the list of ‘good practices’ is so small you would think it possible.

  7. I have to disagree with OJ as well on the notion that these statically linked symbols need to go in as private members of a class. I would even venture to say that this could be poor class design in some cases to make things a private member of a class as it increases compile-time code interdependencies, and as previously pointed out, partial builds should be a far more common case to optimize than full rebuilds. We should strive to omit irrelevant implementation details from our header files as much as possible.

  8. Pingback: Taking Initiative
