C++

ACCU 2014 Conference Notes

I had the chance to go to ACCU 2014 the other week (full conference schedule is here) and I have to say it was one of the best conferences I’ve had the pleasure to attend. And while it did confirm my suspicion that C++ is getting to the point of saturation and ridiculous excess (C++11 was needed, and as a result so was C++14, but C++17… just use a more suitable language if these are the features you need), the talks I went to were, on the whole, great.

So I thought I may as well post up the notes I made from each talk – and while they might be a bit of a brain dump from the time, if there’s something that sparks your interest, I’m sure the full presentations will be posted up at some point soon.

 

Git Archaeology
Charles Bailey, Bloomberg LP

Interesting talk looking at some of the more esoteric ways of presenting and searching for information within an existing Git repository. Unfortunately, time was short so the speaker had to rush through the last part, which was, for me, the most relevant and interesting part.

Talk notes

 

Designing C++ Headers Best Practices
Alan Griffiths

For most experienced C++ developers the content of this talk is probably quite basic, as you’d have a large portion of this covered already through sheer trial and error over the years. But clarifying some good practices and interesting side effects was enjoyable.

Talk notes

 

Version Control – Patterns and Practices
Chris Oldwood, chrisoldwood.blogspot.com

High-level overview of the various patterns we use when using version control (focused mostly on DVCS), some useful examples and some interesting discussion about trust issues…

Talk notes

 

Performance Choices
Dietmar Kuhl, Bloomberg LP

Proof that if you want to know about optimisation and performance, stick to asking people in the games industry.

I’m not posting these notes.

 

Crafting More Effective Technical Presentation
Dirk Haun, http://www.themobilepresenter.com

Really good presentation on how to craft good presentations – some interesting discussion of the make-up of the human brain, why certain techniques work and why the vast majority of technical talks (or just talks in general, to be honest) do what they do.

Talk notes

 

The Evolution of Good Code
Arjan van Leeuwen, Opera Software

Great talk, not telling us what good code is, but examining a few in-vogue books from the last decade to see where they sit on various contentious topics. Note that when the notes say “no-one argued for/against” it’s just referencing the books being discussed!

Talk notes

 

Software Quality Dashboard for Agile Teams
Alexander Bogush

Brilliant talk about the various metrics Alexander uses to measure the quality of their code base. If you’re sick of agile, don’t let the title put you off, this is valuable reading regardless of your development methodology.

Talk notes

 

Automated Test Hell (or There and Back Again)
Wojciech Seiga, JIRA product manager

Another great talk, this time discussing how the JIRA team took a (very) legacy project with a (very) large and fragile test structure into something much more suitable to quick iteration and testing. When someone says some of their tests took 8 days to complete, you have to wonder how they didn’t just throw it all away!

Talk notes

 

Why Agile Doesn’t Scale – and what you can do about it
Dan North, http://dannorth.net

Interesting talk, arguing that Agile is simply not designed to, nor was it ever imagined to, be a scalable development methodology (where scale is defined as bigger teams and wider problems). Excellently covered why and how agile adoption fails, and how this can be avoided to the point where agile principles can be used on much larger and less flexible teams.

Talk notes

 

Biggest Mistakes in C++11
Nicolai M Josuttis, IT-communications.com

Entertaining talk where Nicolai, a member of the C++ Standard Committee library working group, covers various features of C++11 that simply didn’t work or shouldn’t have been included in the standard. When it gets to the point that Standard Committee members in the audience are arguing about how something should or does work, you know they’ve taken it too far.

Talk notes

 

Everything You Ever Wanted to Know About Move Semantics (and then some…)
Howard Hinnant, Ripple Labs

Detailed talk on move semantics, which is never as complicated as it’s made out to be (even by the speaker). There are some dumb issues, which seem to be increasing as the C++ standards committee doesn’t seem to understand the scale of the changes being made, but nevertheless it was a good overview of what is a key improvement to C++.

Talk notes

A Different Allocator Model

I originally posted this to #AltDevBlogADay on Saturday 30th July 2011.

Quite a few years back I started developing a custom STL implementation which has eventually been adopted and used throughout our company. One of the most important things to get right when developing any kind of generic container library is the way memory is allocated. It needs to be lightweight, highly flexible and above all easy to understand, so people are willing to experiment with it.

But STL allocators have a bad reputation, and for good reason. They are complex, hard to understand and have some interesting behaviour that seems designed to confuse. As a result, I needed to look at how we were going to provide a custom allocation system that was both easy to use and simple to understand, without restricting what people could do with it.

A Bit of History
A while back Bitsquid published a post entitled Custom Memory Allocation in C++. This excellent post covered how the Bitsquid engine used an allocator model to perform memory allocation throughout the entire engine.

But the FTL requires a slightly different approach so I won’t be treading over already covered ground here.

FTL allocators are well hidden inside containers; the only control you have is specifying the allocator type, which means its use, and how and when objects are allocated and created, is completely fixed, with only the allocator itself able to affect the outcome. Because of this, the allocator’s behaviour needs to be as customisable as possible without requiring any changes to the container itself.

When the FTL was originally started, it was quite small scale and only used by a couple of developers, so allocators were not a priority. The flexibility wasn’t needed, so in-container malloc and free allowed us to concentrate on container creation, but obviously this wasn’t going to last.

The following describes the various approaches we took, why we dismissed them and what we eventually ended up with.

The Initial Approach
Allocators should have minimal overhead. What we didn’t want to do was increase the size of every container dramatically when we rolled out customisable allocators. As a result, we initially used an approach taken by a couple of vendors and defined the allocator specification using only static members – removing the need for an internal allocator object.

// Completely made up code showing the use of a static based allocator
template <typename T, typename Alloc>
class vector
{
  void  push_back(const T& value)
  {
    // All allocation goes through the allocator's static members
    void* new_memory = Alloc::allocate( sizeof(T) );
    T* new_object = new(new_memory) T(value);
  }
};

I knew this would limit the flexibility of the allocators, but it had minimal overhead (especially using default template parameters) and wouldn’t affect those containers already being used. And besides, programmers are a creative bunch and I wanted to see what people did with this before consigning it to the scrap heap.

But while programmers were able to work around the main limitation of not having allocator state per container, they were having to jump through hoops to get the behaviour they wanted. This made it less likely that other programmers would feel confident writing their own allocators, and their ability to really customise the behaviour was too limited.
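To give a flavour of the hoop jumping involved, the usual trick was to fake per-container state by keying static state off a unique tag type. The following is purely illustrative – tagged_pool_alloc and the tag types are made up for this post, not code from the FTL.

// Purely illustrative: faking per-container state in a static-only
// allocator model by giving each 'instance' its own tag type
struct UiPoolTag {};
struct AudioPoolTag {};

template <typename TTag, size_t TPoolSize>
class tagged_pool_alloc
{
public:
  static void* allocate(const size_t alloc_size)
  {
    // Crude bump allocation from this tag's own static pool
    void* result = &s_pool[s_used];
    s_used += alloc_size;
    return result;
  }

  static void deallocate(void* /*ptr*/)
  {
    // A bump allocator never reclaims - part of the compromise
  }

private:
  static unsigned char s_pool[TPoolSize];
  static size_t        s_used;
};

template <typename TTag, size_t TPoolSize>
unsigned char tagged_pool_alloc<TTag, TPoolSize>::s_pool[TPoolSize];

template <typename TTag, size_t TPoolSize>
size_t tagged_pool_alloc<TTag, TPoolSize>::s_used = 0;

// Two containers that need different pools now need different
// allocator 'types', even though the allocator logic is identical
typedef vector<int, tagged_pool_alloc<UiPoolTag, 1024> >    UiIntVector;
typedef vector<int, tagged_pool_alloc<AudioPoolTag, 1024> > AudioIntVector;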

So we ended up adding allocator state on a per container basis, making it much easier for developers to do what they wanted though it did add at least 4 bytes per container. But I felt that flexibility and simplicity were much more important.

// Completely made up code showing the use of an instanced allocator
template <typename T, typename Alloc>
class vector
{
  Alloc m_allocator;

  void  push_back(const T& value)
  {
    // Allocation now goes through the per-container allocator instance
    void* new_memory = m_allocator.allocate( sizeof(T) );
    T* new_object = new(new_memory) T(value);
  }
};

Allocator Specification
The allocator specification is complicated. While I’m sure there are good reasons for some of the decisions, I wanted ours to be much simpler. So, removing the type information (since our allocators work on raw memory and nothing else), removing the rebind(!) and dropping exception handling (which we don’t use), we ended up with the following.

typedef ftl::pair<void*, size_t>  allocated_memory;

class Alloc
{
public:
  allocated_memory   allocate(const size_t alloc_size);
  void               deallocate(void* ptr);
};

And for the basic allocator, that’s it, it doesn’t require anything else to work, but it does have one interesting aspect.

typedef ftl::pair<void*, size_t>  allocated_memory;

As we can have all sorts of allocator types, what an allocator returns might not be exactly what you asked for. If it can’t allocate all the memory you requested, that’s fine – it simply returns allocated_memory(nullptr, 0) – but it can also return more than was requested (fixed sized block allocators will do this, for example). This return type simply allows the allocator to return not only the allocated memory, but also how much was allocated, which allows calling objects to take advantage of this additional memory if possible.

Most of the time this isn’t queried and only what was asked for is given, but it offers an additional level of information which might allow better memory usage in some containers.
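Where a container does care, it could use the returned size along these lines – a rough sketch, with the capacity bookkeeping (m_data, m_capacity, requested_count) invented for illustration.

// Illustrative only: a growable container taking advantage of the
// extra memory a block allocator might hand back
allocated_memory new_memory = m_allocator.allocate( sizeof(T) * requested_count );
if (new_memory.first)
{
  m_data = static_cast<T*>(new_memory.first);

  // The allocator may have returned a bigger block than requested,
  // so record the real capacity rather than the requested count
  m_capacity = new_memory.second / sizeof(T);
}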

So a container will most likely end up with something like the following when adding and removing elements.

// Creating a new object in a container
allocated_memory new_memory = m_allocator.allocate( sizeof(T) );
if (new_memory.first)
  T* new_object = new(new_memory.first) T;

// Losing the object now we’re done with it
old_object->~T();
m_allocator.deallocate(old_object);

This is fine and gives us a very simple entry point for allocators. But by forcing the use of placement new and the destructor call in the container itself, we are limiting what an allocator can do. While allocators are required to return raw memory, that doesn’t mean they have to work with raw memory internally. Some allocators might pre-create objects before returning them so creation is front loaded, and the forced placement new could mean we’re overwriting an object that has already been created.

Construction Functions
As a result, we want developers to be able to override not only the memory allocation, but also the object creation. 99% of allocators won’t need to do this, so rather than adding extra requirements to the allocator, we can create non-member, non-friend functions, specialised on the allocator, which will do the creation for us.

template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator, const TConstruct& obj);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem, const TConstruct& obj);

template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr);

template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr, TAlloc& allocator);

So our point of allocation/creation now becomes something much simpler and much more powerful.

// Creating a new object in a container
T* new_object = alloc::construct<T>(m_alloc);

// Lose our existing object and return it back to the allocator
alloc::destroy(old_object, m_alloc);

The default version of construct performs the allocation and placement new within the construct function, but should the allocator need something more (or less), simply overloading the function on the allocator type allows complete control over both memory allocation and object creation. The same goes for the destroy function and the automatic call of the object’s destructor.
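As a rough sketch of what those defaults might look like (reconstructed from the description above rather than lifted from the FTL):

// Default construct: allocate raw memory then placement new the object
template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator)
{
  allocated_memory memory = allocator.allocate( sizeof(TConstruct) );
  return memory.first ? new(memory.first) TConstruct : nullptr;
}

// Default destroy: destruct the object then hand its memory back
template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr, TAlloc& allocator)
{
  if (ptr)
  {
    ptr->~TConstruct();
    allocator.deallocate(ptr);
  }
}

An allocator that pre-creates its objects would simply overload construct for its own type and skip the placement new entirely.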

No Base Allocator
One thing I didn’t add to the allocators was a base allocator interface. The allocators are created within the container, with their type defined at the point of the container’s declaration. The addition of a base interface (and the associated virtual functions) would have increased the size overhead of the allocator, which is something I wanted to avoid and something I thought didn’t add enough to warrant the cost. I was less worried about the overhead of calling a virtual function, as that would be insignificant compared to the overhead of everything else that was going on.

Conclusion
So by introducing an allocator model that is much simpler (only 2 required functions), with more extensible customisation should an allocator need it, developers have complete control over all the memory allocation and, importantly, the object creation itself. Due to its initial simplicity, developers have no problem creating new allocators that improve both memory and container use in a project, and can start to experiment with better and more efficient allocators quickly.

Title image by Squeezy.  Used with permission.

Simplifying the Flag Set

In the last few posts, I’ve covered the development of a type safe and easy to use flag set, and as a result it now contains 3 template parameters, two of which have default values.

template <
          typename TEnum,
          int TMaxFlags = 8,
          typename TNames = union_names<bit_set_size<TMaxFlags>::Value>
         >
class flag_set

The last two parameters have default values, but to be honest, people don’t want to have to add a Noof option to the declaration just to define the flag names.  And they don’t want to keep adding long type declarations, even when using typedefs to cut down on the clutter.

We have the following coding standard at work, which provides an interesting structure for defining both our flags and flag names:

Enums should be defined within a structure to provide proper name scoping

struct MyProperty
{
  enum Enum
  {
    FlagEntry1,
    FlagEntry2,
  };
};
MyProperty::Enum property = MyProperty::FlagEntry1;

In the past enums were defined with the name of the enum appended to the names of its entries. While this does reduce name conflicts it is a poor way to scope a property.

Since we can guarantee that our enums are defined like the above, we can make the assumption that people will (or will be happy to) add their flag names as part of this structure.

// Structure defining the flag enums and also
// defining the flag names which we need
struct StateFlags
{
  // Flags
  enum Enum
  {
    HasTarget,
    CarryingWeapon,
    Drugged,

    Noof,
  };

  // Names
  struct Names
  {
    uint HasTarget:1;
    uint CarryingWeapon:1;
    uint Drugged:1;
  };
};

Making that assumption allows us to define the following structure

template <typename TFlags>
struct flag_set_ex
{
  typedef flag_set <
                     typename TFlags::Enum,
                     TFlags::Noof,
                     typename TFlags::Names
                   > flags;
};

This then allows users to define their flags sets quickly using all the available properties

flag_set_ex<StateFlags>::flags myStateFlags;

If programmers don’t want to use the flag names, they can obviously use the base flag set type. And to keep things consistent, we can add a ‘::flags’ typedef to the base flag set, allowing the definitions to be the same regardless of the flag set type being used.

template < typename TEnum, int TMaxFlags, typename TNames >
class flag_set
{
public:
  typedef flag_set<TEnum, TMaxFlags, TNames>     flags;
};

// Define a normal flag set with the same syntax as the flag_set_ex making
// it much easier to interchange the types depending on their needs
flag_set<StateFlags::Enum>::flags myStateFlags;

Debugging Type Safe Flags

In the previous post I described how I implemented a type safe, template driven system for declaring and using flags in C++. But one of the main issues with the solution was not being able to see what flags were set when debugging.

Which Flags Are Set?
When setting bits to store flags there are varying levels of output you can get from them in the debugger.

One of them absolutely sucks – you don’t want to be doing bit arithmetic to know what’s been set.

The next one is better, it gives you some idea of what’s been set, but not much.

Finally, there is the following, which is much better.

A normal way of getting this behaviour out of the debugger when using base type flags is declaring them alongside a structure containing the individual flag entries inside a union.

struct  StateFlags
{
  union
  {
    uint m_allFlags;
    struct
    {
      uint HasTarget:1;
      uint CarryingWeapon:1;
      uint Drugged:1;
    } m_names;
  };
};

This provides you with the best possible output in the debugger so it’s something I wanted to add to the flag set to make debugging much easier.

Flag Set ‘Names’
Since this is an interface driven implementation the ideal syntax starts out as the following

// The enums used to define the flag set
enum StateFlags
{
  HasTarget,
  CarryingWeapon,
  Drugged,
};

// The ‘names’ we’d like to be displayed in the debugger
struct StateNames
{
  uint HasTarget;
  uint CarryingWeapon;
  uint Drugged;
};

We need to be able to pass these names through to the container, and since we want the type defined on a per flag basis we can add these names to the flag set definition.

template <typename TEnum, int TMaxFlags, typename TNames>
class flag_set

Since we already have our own defined type to store the flags being set, and we know that we can use a union to combine the storage type and the flag names, we simply place our name type alongside the storage type so that when a flag is set, the name is automatically set alongside it.

template< int TMaxFlags, typename TNames >
struct bit_union
{
  // Container type with enough bits to store TMaxFlags flags
  typedef bit_set<TMaxFlags> container_type;
  typedef TNames name_type;

  union
  {
    container_type m_type;
    name_type m_name;
  };
};

I’ve extracted the bit_set into its own structure rather than including it directly inside the union for two reasons:

  • It allows the bit set to be specialised based on the number of flags without worrying about the union. If we had a large number of specialised storage structures, we’d have to duplicate the union inside each one.
  • It allows us to strip out the union and the names in builds that don’t need them, without affecting the specialised bit sets.

So the flag set now contains user defined names without having to worry about how they are implemented.  The templated utility functions used to set, remove and check the flags also don’t need to know anything about it, so they aren’t affected.

template < typename TEnum, int TMaxFlags, typename TNames >
class flag_set
{
public:
  void set(const TEnum flag)
  {
    // The utility functions still take the same
    // type regardless of the name union
    set_bit_entry(m_unionFlags.m_type, flag, true);
  }

private:
  static const uint BitSetSize = ((( TMaxFlags + 7 ) & ~7 ) >> 3)*8;
  bit_union< BitSetSize, TNames > m_unionFlags;
};

Now we can simply define our flag set with a unique set of names, and in the debugger we get exactly what we were looking for (annoyingly, we can’t get around the need to define the integers in the structure as one bit entries if we’re going to add them to the union).

struct MyFlags
{
  enum Enum
  {
    UnderAttack,
    InCover,
    // etc.

    Noof,
  };

  struct Names
  {
    uint UnderAttack:1;
    uint InCover:1;
    // etc.
  };
};

flag_set<MyFlags::Enum, MyFlags::Noof, MyFlags::Names> flagSet;

flagSet.set(MyFlags::UnderAttack);
flagSet.set(MyFlags::InCover);

Default Names
But we have one problem.  We don’t want to force people to have flag names if they don’t want them. Depending on the type and volatility of the flags they might not need the names in the debugger and it can be a lot of code to add for every single flag set in use.  And since we have a default template parameter for the number of flags, we need to provide a similar default parameter for the names.

Defining the default parameters isn’t much of a problem.  We can quickly define union name companions for 8+ flags (a few small macros make this much easier – see the sketch below), but how do we drop them into the flag set template as a multiple of 8?
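As a rough idea of what those companions could look like – the union_names layout and the helper macro here are assumptions based on the description, not the real code.

// Illustrative default name companion for up to 8 flags - every
// multiple of 8 would get a similar specialisation
#define UNION_NAME_ENTRY(n) uint Flag##n : 1;

template <uint TSize>
struct union_names;

template <>
struct union_names<8>
{
  UNION_NAME_ENTRY(1) UNION_NAME_ENTRY(2)
  UNION_NAME_ENTRY(3) UNION_NAME_ENTRY(4)
  UNION_NAME_ENTRY(5) UNION_NAME_ENTRY(6)
  UNION_NAME_ENTRY(7) UNION_NAME_ENTRY(8)
};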

Since we already calculate our multiple of 8 flag count when defining the bit_union type, we can do the same when declaring the default name structure.  But we certainly don’t want such an over-complicated calculation twice in the class.

Since we need the multiple of 8 value outside the class in the template declaration, we can’t just declare it inside the class, so by storing it in its own type we can use it wherever we need it.

template<uint TSize>
struct bit_set_size
{
  enum
  {
    Value = ((( TSize + 7 ) & ~7 ) >> 3)*8,
  };
};

We can now use this to declare our default flag set names, where union_names is correctly specialised depending on how many flags the user has declared.

template <
           typename TEnum,
           int TMaxFlags = 8,
           typename TNames = union_names<bit_set_size<TMaxFlags>::Value>
         >
class flag_set
{
private:
  bit_union< bit_set_size<TMaxFlags>::Value, TNames > m_unionFlags;
};

So now a user can simply declare their flag set with the enum type and by default they will get something out of the debugger that’s more useful than just showing the single base type value.

flag_set<MyFlags::Enum> flagSet;

flagSet.set(MyFlags::UnderAttack);
flagSet.set(MyFlags::InCover);

Flag Set Drawbacks
In adding custom names to the flag set, we’ve added a number of new templated structures – the default name types, the bit_set_size and the bit_union – and all this comes with a cost.  Obviously we’re generating a number of structures for all the flags that use this system, and it certainly does affect compilation speed.

Based on compilation speeds of a couple of projects that use the flag set heavily over a few months, I’d estimate that adding the flag set (before the debug names were added) put an extra 20-30 seconds on a full rebuild.  After adding the debug names, it probably added another 30 seconds or more, bringing a rebuild of the entire project up from 3 minutes to around 4.

But I can live with that for a number of reasons

  • The flag set has introduced a more type safe environment for developers.  On a project that has a number of developers dipping into it, this is vital and does reduce the possibility of human errors causing simple but difficult to track down bugs
  • The code is clearer and easier to read, maintain and debug.  This is a bonus for all the same reasons above.
  • I rarely rebuild the entire project.  A well structured project reduces the number of rebuilds greatly and as such the longer rebuilds actually don’t affect me that much (an average build usually takes <10 seconds)

There’s one more thing I want to look at in regards to the flag set, and that’s how we can take the basic template declaration and simplify it so users can quickly and easily declare a flag set with a large number of flags and their own flag names within a single structure.

But I’ll cover that in another post.

A C++ Flag Set Example

As was pointed out in the last post, I… ahem… forgot to actually put down some working examples of the flag set, so I ended up describing something but not actually showing off the final product.

So I thought a good start would be to include a few of the flag set unit tests (since unit tested code is known to be good ‘working’ documentation in itself) followed by a simple working example based on the examples used at the start of the last post.

So, here’s a load of code that you can feast your eyes on (the unit test library being used here is UnitTest++)

struct TestFlagSet_Small
{
  enum Enum
  {
    Flag1,
    Flag2,
    Flag3,
    Flag4,

    Noof,
  };
};

struct TestFlagSet_Large
{
  enum Enum
  {
    Flag1,
    Flag2,
    Flag3,
    Flag4,
    Flag5,
    Flag6,
    Flag7,
    Flag8,
    Flag9,
    Flag10,
    Flag11,
    Flag12,

    Noof,
  };
};

class ExampleFixture
{
public:
  // Define a couple of flags, one defaulting to 8 flags
  flag_set<TestFlagSet_Small::Enum> smallFlagSet;
  flag_set<TestFlagSet_Large::Enum, TestFlagSet_Large::Noof> largeFlagSet;

  ExampleFixture(void)
  {
  }

  ~ExampleFixture(void)
  {
  }
};

// Checks that the flag set starts in an empty state
TEST_FIXTURE(ExampleFixture, InitialFlagSet_Empty)
{
  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
  CHECK_EQUAL(largeFlagSet.is_clear(), TRUE);
}

// Tests that compilation works when defining flags with sizes < 8
TEST(DefaultTemplateSet)
{
  flag_set<TestFlagSet_Small::Enum, TestFlagSet_Small::Noof>    smallFlagSet;
  flag_set<TestFlagSet_Large::Enum, TestFlagSet_Large::Noof>    largeFlagSet;
}

// Checks if the flag set is clear when flag set removed
TEST_FIXTURE(ExampleFixture, SetThenRemoveIsClear)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.remove(TestFlagSet_Small::Flag1);

  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
}

// Checks if the flag set is clear when cleared or no flags are set
TEST_FIXTURE(ExampleFixture, ClearSetIsClear)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.clear();

  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
}

// Checks if the set function correctly sets the right flags
TEST_FIXTURE(ExampleFixture, SetFlagSetIsSet_IsSet)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag1), TRUE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag4), TRUE);
}

// Check that returned C type is correctly set
TEST_FIXTURE(ExampleFixture, AsCTypeFunction)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag2);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  // as_c_type allows the flag set to be explicitly passed
  // through to legacy functions and for the underlying
  // datatype to be stored as a base type if needed
  const uint8* flagSetBits = smallFlagSet.as_c_type();

  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag1) );
  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag2) );
  CHECK( ((*flagSetBits) & (1<<TestFlagSet_Small::Flag3)) == FALSE );
  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag4) );
}

// Test that setting from a C type correctly sets the flags
TEST_FIXTURE(ExampleFixture, FromCTypeFunction)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag2);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  const uint8 tempStorage = *(smallFlagSet.as_c_type());

  smallFlagSet.clear();
  smallFlagSet.from_c_type(&tempStorage);

  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag1), TRUE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag2), TRUE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag4), TRUE);
}

// And a (very) simple real world example

// Define a couple of flag types
struct BattleProperties
{
  enum Enum
  {
    HasTarget,
    CarryingWeapon,
  };
};

struct CharacterState
{
  enum Enum
  {
    IsStunned,
    IsPoisoned,
  };
};

// Create a flag set for the character states
typedef flag_set<CharacterState::Enum> StateFlags;
StateFlags m_currentStateFlags;

// Forward declaration so UpdateCharacter can call it below
void UpdateState(const StateFlags& state);

void UpdateCharacter(void)
{
  // We’ve been stunned
  m_currentStateFlags.set(CharacterState::IsStunned);

  // ... do some more stuff

  // Whoops, I would have just made a mistake, but I can’t do this
  // m_currentStateFlags.set(BattleProperties::CarryingWeapon);

  // Update what ever has happened
  UpdateState(m_currentStateFlags);

  // We’ve dealt with our states now
  m_currentStateFlags.clear();
}

void UpdateState(const StateFlags& state)
{
  // Are we stunned?
  if (state.is_set(CharacterState::IsStunned))
  {
  }

  // Another mistake I can no longer do
  // if (state.is_set(BattleProperties::CarryingWeapon))
  // {
  // }
}

Type Safe Flags in C++

Using bits and shifting enums is a pretty standard way of storing and reading flags. It’s quick, it’s compact and the syntax isn’t too difficult to pick up for people who haven’t seen it before. But since a bit is only a bit and an enum is only an enum, it’s far too easy to make mistakes with these. There is no type checking when you accidentally use the wrong enum, and passing an unsigned int around loses all relevance to what these ‘flags’ actually are.

So I wanted to look for a solution that not only stopped us accidentally using the wrong flag with the wrong variable (and allowed the compiler to let us know if it happens) but also removed the loss of relevance when the flags were passed between different systems.

The Problem
Anyone who’s taken Programming 101 will know what I’m talking about.

enum CharacterBattleProperties
{
  HasTarget = (1<<0),
  CarryingWeapon = (1<<1),
};

enum CharacterState
{
  IsStunned = (1<<0),
  IsPoisoned = (1<<1),
};

unsigned int battleFlags;
unsigned int stateFlags;

// Set or remove some flags based on whats happening
// Character has just lost sight of the player
battleFlags &= ~HasTarget;

// We've just picked up a weapon
battleFlags |= CarryingWeapon;

// ...

// Some more code, some more function calls and
// now we have a player who is stunned
battleFlags |= IsStunned; 

// Whoops, I just set the wrong flag on the wrong variable...

The First Solution
While investigating a couple of solutions I came across a post on Game Architect titled Evolution of a Flag Set Class, which examined a lot of the questions I was asking myself. And it took them pretty far, coming up with a set of solutions that more-or-less did what I was looking for, but I wanted something that would be easier to scale and specialise based on the number of flags we had, not one based on an integer as the smallest possible storage unit.

So, in trying to keep it as simple as possible, the flag set initially ended up with a very simple interface.

// Stripping it down to its most basic behaviour
template < typename TEnum, typename TBaseType = unsigned char >
class flag_set
{
public:
    void set(const TEnum flag)
    {
      m_flags |= (1 << flag);
    }

    void remove(const TEnum flag)
    {
      m_flags &= ~(1 << flag);
    }

private:
    TBaseType m_flags;
};

You’ll notice a couple of things with this implementation (on top of ‘remove’ being highlighted in pink for some probably awesome reason)

  • The first is the default value of the second template parameter is set to an unsigned char, meaning that in the majority of cases the variable can be declared simply with the enum type but it’s easy to increase as needed (though it obviously has limits).
  • The second is that the object itself deals with the shifting of the enum value. This was a conscious decision to reinforce that when setting a flag you are not ‘setting a bit’, rather you’re setting a property – something I wanted the interface to promote.

And this works pretty well, but contradicts something quite important.

I wanted the flag_set to move away from ‘setting bits’, so why do we have to specify a base type, which makes us think about how many bits we need? And since we won’t know how many enum entries we have until we start to set them at runtime, we could end up with a flag_set that doesn’t have enough space to store all our flags, and the compiler won’t be able to tell us about it (in real life it asserts when you try to use a flag outside the range of the storage type, but that’s not good enough).

A More Appropriate Solution
So I wanted to look at working with the number of flags, and work on making it have as small a footprint as possible. So taking a step back I started out with the interface I wanted.

template < typename TEnum, int TMaxFlags = 8 >
class flag_set

Where people could define them as follows

// This will assume there are 8 or fewer flags
flag_set<MyFlags::Enum> myDefaultSizeFlagSet;

// This will create enough space for MyFlags::Noof flags
flag_set<MyFlags::Enum, MyFlags::Noof> myDeclaredFlagSet;

So we need a way to create enough storage space for the number of flags we have at compile time, and since the base type of a bitset in the FTL is a 32-bit unsigned int, that’s not a good enough solution. There is no point storing 3 flags in an integer when they could drop into a char. If multiple flag sets were packed together, the smaller the storage space the better.

Creating enough storage space is pretty simple, we store an array of unsigned chars based on how many flags we have.

// Generate the array length based on an 8-bit type
unsigned char m_flags[ ((( TMaxFlags + 7 ) & ~7 ) >> 3) ];

// Figure out which bit to set/remove when needed 
// I'm not worrying about endian code in this example...
const int index = ((( (entry+1) + 7 ) & ~7 ) >> 3)-1;
const int offset = entry - (8*index);

// Set the bit
m_flags[index] |= 1 << offset;

Optimising For Certain Types
So we now have a flag set which allows us to have as many flags as we could possibly want (and yes, I have seen code with more than 128 ‘flags’ in the past – don’t ask me to justify that, I have no idea…) but with a slight problem. In the original version there were no arrays, no calculations when figuring out which bit to set and no loops to check its state.

All of this over complicates what could be just ‘one bit being set on one variable’.

We can easily get around this by specialising the flag set for any types of 8, 16 or 32 bit flags. But I didn’t want to specialise the whole flag_set, especially when we might continue this specialisation up to 64 and 128 bit flags – and specialising a whole class just becomes a nightmare to maintain.

So we extract the storage type into its own structure and specialise that based on the number of flags. This allows the general flag_set behaviour to be defined within the one class, and the length based behaviour to be extracted into its own implementation.

// Just a note that the following code is simplified to make it 
// easier to read on a blog post, but the general gist is the same.

template <int TBitCount>
struct bit_set
{
  enum
  {
    ArrayLength = ((( TBitCount + 7 ) & ~7 ) >> 3),
  };
  unsigned char m_bitArray[ArrayLength];
};

template <>
struct bit_set<8>
{
  enum
  {
    ArrayLength = 1,
  };
  unsigned char m_bitArray;
};

Then we simply replace our array member in the flag set with an instance of the bit_set (where we calculate the bit_set size as a multiple of 8 based on the number of flags we specify in the flag_set).

Since the flag_set doesn’t want to worry about what kind of data it’s using, we can replace the in-class implementation with a number of utility functions which not only reduces the amount of code duplication but, when specialised for the same reasons as the bit_set, allows us to have the same interface and to use the same class with only a small number of functions specialised as needed.

// Basic, non-specialised function for setting or clearing
// a flag as called from flag_set::set and flag_set::remove
template <typename ContainerT>
void set_bit_entry(ContainerT& flags, const uint entry, const bool value)
{
  // Get the entry point to the bit list
  const int index = ((( (entry+1) + 7 ) & ~7 ) >> 3)-1;
  const int offset = entry - (8*index);

  // Set or clear the bit
  if (value)
    flags.m_bitArray[index] |= 1 << offset;
  else
    flags.m_bitArray[index] &= ~(1 << offset);
}

// Specialised for a bit count of 8 - no arrays, no additional calculations
template <>
void set_bit_entry< bit_set<8> >(bit_set<8>& flags, const uint entry, const bool value)
{
  if (value)
    flags.m_bitArray |= 1 << entry;
  else
    flags.m_bitArray &= ~(1 << entry);
}

// So our basic non-specialised flag set looks something like this
// (code simplified to make it a bit more presentable)
template <typename TEnum, int TMaxFlags = 8>
class flag_set
{
public:
    void set(const TEnum flag)
    {
      set_bit_entry(m_flags, flag, true);
    }

    void remove(const TEnum flag)
    {
      set_bit_entry(m_flags, flag, false);
    }

private:
    static const uint BitSetSize = ((( TMaxFlags + 7 ) & ~7 ) >> 3)*8;
    bit_set<BitSetSize> m_flags;
};

This method makes it very easy to specialise the flag set for flags that can be stored in a built-in type, falling back to an array if the size doesn’t fit into one type. If we want to specialise for 16 or 32 bit flags, then we simply create specialised versions of our utility functions, keeping the flag_set implementation the same no matter how much additional work is being done behind the scenes.
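For example, a 32 bit specialisation could look something like this – an illustrative sketch following the same pattern as the 8 bit version, not the actual code.

// Illustrative 32 bit specialisation - one built in type, no arrays
template <>
struct bit_set<32>
{
  enum
  {
    ArrayLength = 4,
  };
  uint m_bitArray;
};

template <>
void set_bit_entry< bit_set<32> >(bit_set<32>& flags, const uint entry, const bool value)
{
  if (value)
    flags.m_bitArray |= 1u << entry;
  else
    flags.m_bitArray &= ~(1u << entry);
}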

Final Outcome
So we now have a flag set that meets the issues I wanted to resolve head on. Passing a structure with strong type behaviour stops programmers accidentally using the wrong flag with the wrong set (and the compiler’s the one that stops them) and passing a named structure to a function keeps the meaning and reason behind the flags intact.

The code obviously increases in complexity over a simple bitwise operation but from a client point of view, the readability of the client code becomes easier. Calls to .set, .remove and .clear make it very clear what the intention of the code is and makes it easier to avoid simple keyboard mistakes.

It also makes it immeasurably easier to add the debugging features to the flag set, making it easier to see what is set and when. But this post has been long enough and I’ll look at that in my next post.

The Evils of Unity Builds

Unity builds. I don’t like them.

Of all the tools at your disposal to make a build faster, this is the worst. And it’s not just the “hey, let’s #include .cpp files” weirdness, but the way that it can make a well structured, modular code base become worse than spaghetti junction at rush hour. And the worst thing is that it’s not the fault of the programmer, especially when something as exceptional as Visual Assist can start helping you create non-modular code because of the way Unity Builds collect the code you write.

What Is a Unity Build?
Unity Builds have nothing to do with the excellent Unity Tool Set. I’ll just clear that up right off the bat.

These Unity Builds are a way of reducing build overhead (specifically opening and closing files, and reducing link times by reducing the number of object files generated) and as such are used to drastically speed up build times. I say drastically because they are usually used after a code base has started generating build times that seriously cut into a programmer’s productivity. This might be 10 minutes, 30 minutes or even hours.

Because Unity Builds give you a quick gain, they’re seen as a pretty magical solution, and when a full build is reduced from 30 to 3 minutes you can see why. But the caveat in that statement? Full builds.

I won’t go through how to generate Unity Builds here as the blog post The Magic of Unity Builds does a great job, so have a look there. It’s interesting that I’m linking to a blog post arguing in the total opposite direction to mine, but unless you know both sides of a tale, you don’t know the tale.

So without any more delay, why don’t I like them?

Say Goodbye to Local Properties
Well written code is modular, and modular code relies on a few assumptions. One of those is that a single translation unit (a cpp file and any associated include files) is isolated from other translation units, so (unless you extern the contents) anything defined within a cpp file is not available outside it, and as a consequence you can have local objects with the same names without conflict.

Take for example the following simple code (easily expanded to a real world example I’m sure)

// In VectorTest.cpp
namespace
{
   const uint StandardTestCount = 10;
}
// In ListTest.cpp
namespace
{
   const uint StandardTestCount = 3;
}

In a normal compilation environment these variables are local to the file they are defined in, and this is a good thing. Why should the vector tests file care what is defined within another test file?

But if you have a Unity Build with the following then you’re going to have issues…

#include "VectorTest.cpp"
#include "ListTest.cpp"
#include "ArrayTest.cpp"
#include "QueueTest.cpp"

Specifically, errors relating to the variable ::StandardTestCount already being defined. So we now have to be aware of what is defined throughout the entire project and resolve any issues, and inevitably this will end up with all variables being prefixed with (in this example) VectorTest_ and ListTest_ etc.

I don’t want to refer to the ‘Magic’ post too much, as it’s a good article, but there is a statement in there referring directly to this. Specifically the following

“Technically you shouldn’t have this problem if you’re writing “proper” code”

This is wrong. “Proper” code is well structured and well maintained, meaning it’s modular and only refers to what it needs to refer to. In a lot of cases you will have variables with the same name, functions with the same name and other elements that you naturally write as you go along. That’s why the anonymous namespace (or the C style ‘static’) is so useful, and so useless if you switch to Unity Builds.

Using Is (even more) Lethal
Never use the ‘using’  keyword in a header file. It’s a standard statement and one that stands up easily. The reasons behind this are pretty clear (declaring ‘using’ in a header file forces all code including the header file to use that statement – in effect wiping out a namespace).

But using it in a .cpp file isn’t a crime. In fact it’s remarkably useful, even at whole file scope. As a developer, I should know what I’m using in a local file, what elements I’m bringing into the file and what would/could conflict. Some people might not agree that ‘using’ at the top of a cpp file is a good idea at all, and I see their point, but at whole file scope it can be massively useful, and as it’s localised it’s not going to pollute anything other than the file it’s in.

But in Unity Builds, using (that is not scoped within a function or type etc.) is lethal. Declaring ‘using namespace X’ in a cpp file makes it available throughout the code base (or, more confusingly, only in those files that are included after the one declaring it). Suddenly, using is everywhere. And your namespaces are now useless.
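A quick illustration of how this plays out (file and namespace names made up):

// In PlayerController.cpp
using namespace physics;  // harmless in a normal build - local to this file

// In AIController.cpp - in a unity build this file is #included after
// PlayerController.cpp, so 'using namespace physics' is still active.
// A local class called Ray now clashes with physics::Ray, even though
// this file never asked for the physics namespace.
class Ray { /* ... */ };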

Every Change is a Rebuild
This is one of my biggest problems with Unity Builds. In well structured projects (more on that below) changing the content of a cpp file (or even a local header file) shouldn’t require everything to be rebuilt. Every. Single. Time. Yes, unity build rebuilds are faster than doing a full build without a unity file, but doing even a fast rebuild every time adds up. Quickly.

Whenever I change a .cpp file, all I need to build is that .cpp file. Granted, link times are a tad long because it still has to re-link the other object files, but it takes a second to compile that one file. When I changed a header file (I took count today) it compiled on average 5 .cpp files and took about 5 seconds to build.

Very rarely should I be required to re-build the entire project, and most of the time I don’t. And that saves me a lot of time. Every single day.

Multiple Configurations
This is mainly a bugbear of mine rather than a direct issue with Unity Builds, but I see it in nearly every Unity Build project I use. The main project is the Unity Build project, but there is another project that is built and maintained that doesn’t use Unity Builds. Now there is a point here – by having an additional ‘normal’ project you are forcing the modularity that can collapse with Unity Builds to be checked (usually a Continuous Build server will be building this as well as the Unity Build every time).

But we have problems with this.

Firstly, the non-unity build is only ever built on the CB server, so any problems are going to break the build, and it will break if people are not using it day-to-day. Secondly, you now have multiple projects to maintain. Not too much of a problem if you have an automated project generation step (see below), but it is still another project to maintain.

People may occasionally have to use the non-unity configurations, especially if they are getting an unknown error on the CB server. So now they are left working on a configuration that is uncared for and probably builds so slowly and erratically that they lose all the time they saved from those quick unity builds they’d been doing all day.

But What About My Build Times Then?
Well structured software builds quickly (or as quickly as they can anyway). But what is a well structured project when it comes to build times?

  • Sensible file inclusion – Only include the files you need and try as best you can to limit them to .cpp files. This means the compiler only needs to bring in what’s necessary, and when you change one of these header files only those things that directly rely on it will change. If you find yourself having to include files that constantly change into a large number of cpp files, then I’d wager a refactoring of the large header file is in order. You should be building only a small number of cpp files when you change the content of a header file.
  • Forward Declarations – Where possible, use forward declarations rather than including the header file in your class or function declaration (see the sketch after this list). Annoyingly you cannot forward declare enums (on standard compliant compilers anyway), which sometimes throws this out of the window. But by not including the files until use, you’re cutting down on the amount of code being included and the number of files being opened.
  • Pre-compiled Headers – Use Pre-compiled Headers (PCH). Using PCH’s is the one (built into the IDE) feature that will cause your build times to plummet, especially if you are being sensible with them (such as including files that don’t change – SDK’s and common, fixed headers for example). Using pre-compiled headers across multiple platforms can be a pain, but it is possible to unify them and get a massive boost. I’ll cover these a little bit more below.
  • Library Files – Modular code usually leads towards easily extracting common code into a collection of libs. Reducing the amount of code in a project (and as a result how much you need to build) can speed up your productivity quickly.
  • Parallel Building – If your IDE supports it (and it might do without you knowing) and you have a multi-core machine, turn on parallel building of files. Building individual files at the same time is obviously going to cut down on your compile times no matter how quick they are at the moment.
  • Get a Better PC – It goes without saying (but I will anyway) that doing this will speed everything up.
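Here’s the forward declaration sketch mentioned above (the types are invented for illustration):

// Character.h - no need to #include "Weapon.h" here
class Weapon;  // a forward declaration is enough for pointers and references

class Character
{
public:
  void equip(Weapon& weapon);

private:
  Weapon* m_currentWeapon;
};

// Character.cpp - the full definition is only pulled in where it's used
#include "Character.h"
#include "Weapon.h"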

Pre-Compiled Headers
Pre-compiled headers are one of the best ways to get a massive boost to your compile times. So how fast are my build times compared to those of a Unity Build version when using PCH’s and taking into account the other build improvement suggestions?

As stated above the ‘average’ build time of a minimal build throughout the day was around 5 seconds. On a Unity Build it was a full build every time and was around 2 minutes on the slowest platform.

Changes to ‘core’ files, which are included by a higher than average number of files, resulted in builds of around 30 seconds on around 20 files. Again on a Unity Build this would have been around 2 minutes regardless.

Full rebuild (which I did twice today) was around 3 minutes. Granted a whole minute slower than a Unity Build but I did it twice rather than every single time.

Pre-compiled headers are not a silver bullet solution. Nothing is. And because of this, there are issues that you do need to be aware of

  • Compiler Support – Some compilers out there simply do not support PCH’s. On a daily basis I use 4 or 5 different compilers, and while I’ve never encountered one that doesn’t, they are out there. This can shoot my argument in the foot, but it is a rare problem and one most people won’t encounter.
  • PCH Quirks – While I use a number of compilers that do support PCH’s, every single one of them has a slightly different way of building them and slightly different requirements before they can be used. This doesn’t affect the code that is written, but does affect the content of your project, especially if you want to make them seem as consistent as possible.
  • Over-Inclusion – Because your PCH usually includes common files and files that rarely change, it does mean that some files are brought into a compilation unit where they wouldn’t otherwise be required

Unity builds are a solution for the wrong problem. Long compile times are not caused by not using Unity Builds, they are the result of having badly structured and badly referenced code. Fix that (and have better code to boot) and you’ll be able to use ‘proper’ build tools without resorting to a quick fix.

Making Unity Builds Better
I don’t want to just leave it at that, because no matter how much I argue, Unity Builds will always exist and people will always use them (after all it’s quicker to rescue a broken down build by making it a Unity Build than doing it the hard way). So what can people do to actually make Unity Builds a bit more desirable and less of a problematic fix?

  • Automate Adding Files – A lot of teams have auto project generation code already (XML files, template projects etc.) so it’s important that you automate adding files to the project, otherwise people will forget to remove the file from the build step and they will forget to add it to the unity build file.
  • Multiple Unity Files – Running a Unity Build doesn’t require you to have one unity file with every cpp file inside it. You can have multiple build files (usually per module or per component, as sketched after this list) which means at least some of the translation unit leaking is limited to each module rather than the whole program.
  • Additional Project – No, this isn’t a contradiction of the “Multiple Configurations” comment above. In this situation you have a project that contains the cpp and header files so you can reference them, but this project isn’t built. Instead, you have the ‘proper’ project simply contain the unity file(s). This isn’t something I’ve personally tried, but it does get around the issues of adding files if you don’t have an automated step.
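As a sketch of the multiple build files idea (module and file names invented for illustration):

// UnityBuild_Core.cpp - core files only leak into each other
#include "Core/Memory.cpp"
#include "Core/Containers.cpp"

// UnityBuild_Render.cpp - render files are isolated from core files
#include "Render/Device.cpp"
#include "Render/Shaders.cpp"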