A templated static allocator

A common problem in embedded programming is that we need to ensure, that there is always memory available for certain actions. Since we’re often not allowed to use dynamic allocations for fear of memory fragmentation (MISRA C 2004/12, MISRA C++ 2008) we usually use a construct like a memory pool, where we preallocate memory in certain blocksizes. Common C implementations rely on raw byte arrays and are hard to configure without using dynamic memory.

In part 2 of the “memorysafe C++” series we will develop a memory pool that is entirely configurable at compiletime (thus it will fail to compile if it does not fit into memory) and memorysafe at the same time. We’ll make some use of templates and also use a nice little property of templates to make our life a lot easier.

Requirements

  • We want to allocate blocks of memory of a known size from a previously reserved chunk of memory.
  • It should be possible to configure the maximum number of blocks as well as the blocksize at compiletime.
  • We want to be able to use multiple pools, each with a different configuration.

Believe it or not, we can satisfy these requirement with less than 100 lines of C++ and actually be completely memorysafe at compiletime.

A simple C++ solution

template<class T, size_t n>
class MemoryPool
{
	public: 
		MemoryPool()
		{
			memset(data, 0x00, sizeof(data));
		};

		std::optional<T*> allocate()
		{
			for (size_t i = 0; i < n; i++)
			{
				size_t offset = ElementSize * i;
				if (data[offset] == 0)
				{
					data[offset] = 0xFF;		// Mark block as used
					return reinterpret_cast<T*>(&data[offset + 1]);
				}
			}
			return  {};
		};

		void free(T* block)
		{
			//calculate offset into dataarray
			int32_t blockoffset = reinterpret_cast<uint8_t*>(block) - data;	
			if (blockoffset < 0 || blockoffset > sizeof(data))
			{
				return;
			}

			// clear markerbyte for this block:
			data[blockoffset - 1] = 0x00;

		};

	private:
		static const size_t ElementSize = sizeof(T) + 1;
		uint8_t data[n * ElementSize];
};

This is pretty much straight-forward stuff for anyone who has dabbled with some C++ in the past. However it already has some very nice properties that fulfill most of our requirements:

  • The object has a size that is known at compiletime, as the “data” array is part of the object’s data.
  • We can avoid the use of void* pointers and instead have a typesafe Pool, where each element is guaranteed to have the correct size for the type of data we want to store.
  • We can easily have a rich diversity of different pools with different layouts, all the while retaining the fact that we never need to use dynamic memory at all.

While this is all nice, the implementation has some problems that need fixing:

  • The markerbyte is placed along with the data. Now, while the compiler will make sure, that any datastruct will usually be padded so have a size that is a multiple of the processor’s dataword’s size, and thus ensuring correct alignment, our addad databyte kills any good alignment, possibly resulting in performance issues and other nastynesses.
  • The markerbyte is rather large, especially given, that we only want to have a flag.
  • While the pool returns a T* it does not actually return a fully constructed object, but only a pointer to a bit of memory where we could *theoretically* construct an object.

Fixing the alignment issues

Since we know that the compiler will usually pad structs for us, so that sequential arrays of the same struct don’t create alignmentproblems, fixing the alignmentissues boils down to storing our “used” marker for each block seperate from the block. This can be trivially archieved by using a std::bitset.

As the name implies, the bitset allows us to store an arbitrary number of bits. Changing our pool to use a bitset we end up with the following code

template<class T, size_t n>
class MemoryPoolNoAlign
{
public:
	MemoryPoolNoAlign()
	{
		memset(data, 0x00, sizeof(data));
	};

	std::optional<T*> allocate()
	{
		for (size_t i = 0; i < n; i++)
		{
			size_t offset = ElementSize * i;
			if (!usedElements[i])
			{
				usedElements[i] = true;		// Mark block as used
				return reinterpret_cast<T*>(&data[offset]);
			}
		}
		return  {};
	};

	void free(T* block)
	{
		//calculate offset into dataarray
		int32_t blockoffset = reinterpret_cast<uint8_t*>(block) - data;
		if (blockoffset < 0 || blockoffset > sizeof(data))
		{
			return;
		}

		uint32_t elementIndex = blockoffset / sizeof(T);

		// clear markerbyte for this block:
		usedElements[elementIndex] =false;

	};

private:
	static const size_t ElementSize = sizeof(T);
	uint8_t data[n * ElementSize];
	std::bitset<n> usedElements;
};

In Place construction & variardic templates

As mentioned before it would be nice, if the pool was able to construct the object it contains, instead of just holding memory. This is simple, if the classes in question have trivial (read parameterless!) constructors, but what to do, if we want to construct any object using an abritrary constructor? Variardic Templates to the rescue! This new meachnism allows us to define templates, that take a variable number of template parameters and can hand them on, e.g. to make a constructor call. Coupled with the in-place new/”placement new”, we can easily change our implementation to actually allocate our object for us.

In Place Construction

The placement new operator takes an address, a typename additional constructorparameters. Let’s have a first version, of the pool, that can construct objects with trivial C’tors:

	std::optional<T*> allocateAndConstruct()
	{
		if (auto mem = allocate())
		{
			auto ptr = *mem;
			return new (ptr) T();
		}
		return {};
	}

Note, that we will need a matching deallocation function to call the D’tor in place:

void free(T* block)
{
    //calculate offset into dataarray, note that this check makes the
    // function nullptr safe as well!
    int32_t blockoffset = reinterpret_cast<uint8_t*>(block) - data;
    if (blockoffset < 0 || blockoffset > sizeof(data))
    {
        return;
    }

    uint32_t elementIndex = blockoffset / sizeof(T);

    // clear markerbyte for this block:
    usedElements[elementIndex] =false;
    // Explicitly call the d'tor to allow freeing
    // of other resources.
    block->~T();
};

Calling the D’tor explicitly allows us to free any resources, that were acquired by the object that is to be deallocated (e.g. release locks or similar).

Using a variardic template to use any constructor

Making our allocateAndConstruct function take a variable number of parameters is actually trivial:

	template<typename... Args>
	std::optional<T*> allocateAndConstruct(Args... args)
	{
		if (auto mem = allocate())
		{
			auto ptr = *mem;
			return new (ptr) T(args...);
		}
		return {};
	}

Using it

class Foo
{
public: 
	Foo(int i, float f): _i(i), _f(f)
	{}

	int _i;
	float _f;
};

int main()
{
	MemoryPoolNoAlign<Foo, 2> fooPool;
	auto f0 = fooPool.allocateAndConstruct(10, 25.0f);
	auto f1 = fooPool.allocateAndConstruct(1048, 3.141f);

	// Allocation error -> yields nullopt:
	if (auto f3 = fooPool.allocateAndConstruct(1048, 3.141f))
	{
		// should never be reached, since pool is full.
	}
	else
	{
		fooPool.free(*f0);
		assert(f3 = fooPool.allocateAndConstruct(1048, 3.141f));
	}
}

Final Thoughts

The final bit of code sports about 70 LoC, and does satisfy all the requirements we started out with. At the sametime the code needs no dynamic allocation whatsoever. The use of some common C++ features helped us to tackle a very common problem in embedded programming in a safe and concise manner.

The code is available on GitHub.

[Update, 28.9.2019]: I missed a mistake in the final codebits that have now been corrected. The GitHub version has been updated as well.

Image Credit: Marvin Meyer

Leave a Reply

Your email address will not be published. Required fields are marked *