Disclaimer: This post contains a lot of code. For reasons of brevity I’ve left out several bit’s that should be used in production code (e.g. cctors, dtors etc.).
Everybody who works in embedded software will – at some point – get in contact with MISRA C. MISRA C defines a subset of the C programming language that is deemed safe enough for application in the motoring industry (as if a safe subset of C could ever exist). The MISRA standard imposes some restrictions on the use of several language features in the name of safety. Some of those are a good idea, e.g.: "don’t compare floats using ==". However there are a couple of rules I have some gripes with, the worst offender would be 15.5: "MISRA C:2012, 15.5 – A function should have a single point of exit at the end". This is probably one of the most outrageous coding rules I have ever encountered as it practically makes it impossible to have functions that are understandable and have a medium length (~ 30 – 60 LoC) at the same time. Let’s have a look at an example:
int DoCalculation()
{
int result = 0;
if (foo())
{
if (1 == fark())
{
if (42 == baz())
{
// businesslogic here
result = 1048;
}
else
{
result = 48;
}
}
else
{
result = 20;
}
}
else
{
result = 10;
}
return result;
}
Believe it or not, this is actually a rather trivial function, and it could be reduced to less than half of it’s size. If we were allowed to use early exits it would look something like this:
int v2()
{
if (!foo()) return 10;
if (1 != fark()) return 20;
if (42 != baz()) return 48;
//businesslogic here
return 1048;
}
Thats 9 lines without nesting vs. 28 lines with three levels of nesting. Now I’m not here to argue that everyone should use version 2 (but you should, seriously!), in fact there are situations where you will have to use version 1. In my case this is usually, when a client demands it, because they feel, that they might not get their safety certification otherwise. This will usually come with a trigger happy static code analyzer that will reject the easy to maintain and easy to understand version 2 (talk about cargo cult…). Combine this with a 500 k LoC codebase and you end up with an intolerable mess. So this got me to think: Can we somehow have both: Easy maintainability and MISRA compliance?
Builderpattern to the rescue!
Let’s imagine an ideal solution, our only constraint is, that only may have one return in the function. This sounds like a nice application for the builder pattern, here’s some pseudocode:
int v1()
{
ResultBuilder B;
return B.ifTrue(foo())
.ifEqual(1, bar())
.ifEqual(42, baz())
.yield(1048);
}
The desired behavior here is, that a call to a member function of B evaluates the passed condition and will either stop further evaluation there (if e.g. the condition yielded false when calling ifTrue
), or return another builder so another condition can be checked. We terminate by calling yield
which should return the given value.
Mind you, this is far from correct code, but it does show where we want to arrive. The first step would be to sort out the "else"’s, so we add another parameter to ifTrue and ifEqual:
int v2()
{
ResultBuilder B;
return B.ifTrue(foo(), 10)
.ifEqual(1, bar(), 20)
.ifEqual(42, baz(), 48)
.yield(1048);
}
Our changed semantics now are:
- ifTrue: It shall evaluate the passed condition. If it evaluates to false it shall return the second parameter and terminate. If it evaluates to true it shall continue with the next condition.
- ifEqual: Same as with ifTrue, except that we take two conditions/expressions as paramter.
- yield shall evaluate the passed expression and return the result value.
If we’d manage to get this bit of code to work, we’d crammed the whole lot of our 28 lines from the first version into something that actually rather closely resembles the easy to maintain version while still sporting only one return statement, thus making our static analysis tool happy. In order to get this to work we’ll have to sort out a couple of things first:
- ResultBuilder should be able to return ("yield") any value, not just ints as in the example
- In the example a line like
B.ifTrue(foo(), 10).ifTrue(baz(), 20)
should only executebaz()
, iffoo()
yielded true. If we were to use the code above as it is, all calls would always be executed, which is probably not what we want. This mean’s we’ll have to figure out how to lazily evaluate these statements.
Getting lazy with C++1x
C++ 11 gave us the godsent lambdas, which will allow us to easily implement lazy evaluation. Let’s have a simple start with a lazy boolean:
class LazyBool
{
public:
template<class T>
LazyBool(T f) : func(f) {};
LazyBool(bool b): func([b]() {return b; }){}
bool eval()
{
return func();
}
std::function<bool()> func;
};
Okay, what’s happening for the faint of heart? This function takes a lambda that must yield a boolean value. It stores it into a std::function object and will evaluate the lambda, when eval() is called. The other c’tor is a convenience function, so that we can also just use a regular boolean value instead of a lambda (this will come in handy in some situations). This c’tor wraps the boolean parameter into a lambda and can then treat it just like it would treat a lambda in the first c’tor.
Using it
void lazyBool()
{
LazyBool b = [&]() {return true; };
auto theValue = b.eval();
}
since C++1x also sports powerful template parameter deduction we can just assign a lambda to our LazyBool instance, which makes life very easy for us.
More lazyness
The first version above is already nice, but it is limited to booleans, we can however easily extend our LazyBool to support any old type:
template<class T>
class LazyEval
{
public:
template<class fun>
LazyEval(fun f) : func(f) {}
T eval()
{
return func();
}
std::function<T()> func;
};
As you see this is actually a little less code than the bool version, however it does lack some convenience: The compiler is no longer able to infer all template parameters for us. We have to provide T. This means, using it is a bit more cumbersome:
void lazyEval()
{
LazyEval<int> i = [&]() {return 21; };
auto theValue = i.eval();
}
Luckily, with what we have in mind, this does not concern us, as we’ll never be confronted with it directly.
Building the builder
With the lazy evalutation sorted out, we can now start working on the builder. Let’s define our interface first:
template<class T>
class ResultBuilder
{
public:
ResultBuilder<T> ifTrue(LazyBool value, LazyEval<T> failValue);
ResultBuilder<T> ifEqual(LazyEval<T> valA, LazyEval<T> valB, LazyEval<T> failValue);
T yield(LazyEval<T> val);
};
In this version our builder will have three methods:
- ifTrue shall check the returnvalue of a lazily evaluated boolean. If the bool evaluates to true, the builder should continue to evaluate later stages. If the bool evaluates to false, evaluation of later stages shall be omitted and instead a predefined "failureValue" shall be returned.
- ifEqual shall compare the results of two lazy evals and stop evaluating further if the two evals are not the same. In this case a predefined "failureValue" shall be returned.
- yield shall return the result of a lazy evaluation.
The tricky bit
As you’ve noticed, ifTrue and ifEqual both return another ResultBuilder. This represents a bit of a problem, as we actually want the possibility to return our failureValue instead. We’ll have a closer look at how we solve this problem later, but for now, let’s have a look at the simple part. Here is the implementation for ifTrue:
ResultBuilder<T> ifTrue(LazyBool value, LazyEval<T> failValue)
{
if (value.eval())
{
return ResultBuilder<T>();
}
else
{
return ResultBuilder<T>(failValue);
}
}
Pretty much straight forwart: We "eval" the lazy bool, if things go as planned we return another builder (note: we might just as well return *this
here!). If things go wrong we return a builder but pass it the failvalue as c’tor argument. This allows us to pass a failure down to the last builder, where the yield
method is called. Again, we’ll have a look at how we deal with this value a bit later. For now, let’s have a look at the first shot of our yield method:
T yield(LazyEval<T> val)
{
return val.eval();
}
This is still straight forward: If we make it to the yieldmethod, we evaluate val to get the actual returnvalue of our ResultBuilder.
Now, let’s tackle the failure value. The trick here is, to propagate this value all the way down to the last bulider instance in the chain. On this instance the client code will usually call yield
, so here we can return the failure value instead of the actuall passed value. How do we do this? Two steps, first we add a field to our builder to store the current failvalue:
std::optional<LazyEval<T>> failVal;
We’re using an optional here, to allow modeling the absence of a failvalue. Now let’s adapt yield to use the failVal:
T yield(LazyEval<T> val)
{
return failVal.value_or(val).eval();
}
Easy, right? We check if a failVal is present with "value_or". "value_or" will either return the value contained in the optional (our failvalue) or the parameter (the parameter that was originally passed to yield
). This effectively means, that a call to yield will return the failvalue, if present; otherwise it will evaluate the parameter passed to yield and return that.
Passing failure
The last piece of the puzzle that is still missing is, how to get a failValue to the last builder in the chain. I’ve hinted at this in the implementation of ifTrue
: We add an additional c’tor to pass a failvalue:
template<class T>
ResultBuilder(LazyEval<T> failVal_)
{
failVal = std::optional{ failVal_};
}
Now all that remains is making sure, that we stop evaluating stuff, whenever a failure was encountered somewhere. To do this we adjust our ifTrue
Method a bit:
ResultBuilder<T> ifTrue(LazyBool value, LazyEval<T> failValue)
{
if (failVal != std::nullopt)
{
return ResultBuilder<T>(failVal.value());
}
if (value.eval())
{
return ResultBuilder<T>();
}
else
{
return ResultBuilder<T>(failValue);
}
}
In plain english: If a failVal was passed in the c’tor, we don’t evaluate at all and just return a ResultBuilder container the same failVal.
And what does it look like?
Okay, let’s have a look at what we’ve archieved compared to what we set out to do. Here’s a reminder of the pseudocode we started with:
int v1()
{
ResultBuilder B;
return B.ifTrue(foo(), 10)
.ifEqual(1, bar(), 20)
.ifEqual(42, baz(), 48)
.yield(1048);
}
And here’s what it currently looks like:
int v3()
{
ResultBuilder<int> fc;
return fc.ifTrue([]() {return foo(); }, []() {return 10; }).
ifTrue([]() {return 1 == bar(); }, []() {return 20; }).
ifTrue([]() {return 42 != baz(); }, []() {return 48; }).
yield([]() {return 1048; });
}
yikes. Okay, it’s not all bad – this does somewhat resemble our goal, but all these []() for the lambda’s make it hard to comprehend. To get rid of them we’ll introduce two macros:
#define LAZY(x) ([&](){return x;})
#define LAZYF(x) ([&](){x})
These will turn any expression into lambdas and leave us with this:
int v4()
{
ResultBuilder<int> fc;
return fc.ifTrue(LAZY(foo()), LAZY(10)).
ifTrue(LAZY(1 == bar()), LAZY(20)).
ifTrue(LAZY(42 != baz()), LAZY(48)).
yield(LAZYF({
return 1048;
}
));
}
Now this is pretty darn close to our initial goal. Note that we actually still have two returnkeywords in the function, however return 1048
is actually not a returnstatement of the function itself but of the lamba. In the end we now have function with a cyclomatic complexity of 1 where all error handling is hidden away in the builder.
Should I use this?
It depends. While this is an interesting experiment in applying modern C++ features and design patterns in a maybe not quite obvious way the reader will have to grasp quite a few concepts before she understands what’s actually happening here. However: There is little possibility of using it wrong. I’d use it anyday if I had the choice between this solution and writing an abomination of if-statements.
Caveats
- The code presented in the builder is in itself currently not MISRA conform as it uses multiple exitpoints as well. However this is easily fixed and I leave it up to the reader to do that.
- For reasons of brevity I’ve omitted the implementation of isEqual – but this should pose no problem for anyone who has a remote understanding of C++
- The code could be improved by means of Move semantics and by reducing the number of c’tor calls.
Image by Jon Tyson