Static vs. Dynamic Polymorphism

Tags: Coding, C++
Author: Tom Kite
Published: October 7, 2023


Polymorphism is a key concept in C++. It is essential in bringing flexibility into a safe, strongly typed language. By guaranteeing that an object will respond to a certain interface, you can build code that is agnostic to the exact implementation details of that object. It is well known, however, that flexible dynamic behavior incurs a computational overhead, forcing a compromise on speed and latency. Static polymorphism offers an alternative: it can give your code a level of reusability and flexibility without the runtime performance penalty. Instead, the penalty is paid at compile time, which may be a good tradeoff in many applications.

[Image: two buildings representing a vibrant, dynamic and unplanned approach vs. a strict, efficient and planned one. Generated by DALL·E 3.]


The need for polymorphism

When I learned C++ I had already been coding in Python for some years, a situation which I suspect will only become more common over time. I therefore may not be alone in having initially seen polymorphism, quite naively, as just a way of getting around the limitations of strong typing. That was, of course, a simplistic way of seeing an important topic; the subtleties came to me much later.

Static, strong typing is a large part of what gives C++ its impressive speed, and more importantly it makes the language much safer than its dynamically typed alternatives (the fact that Python programmers typically adopt type hints for larger projects is no surprise!). The real beauty of polymorphism, then, is to carry this safety and reliability forward while introducing much needed flexibility, even if some speed is sacrificed.

In the dynamic case this is made possible by accessing an object through a pointer to a base class. The base class amounts to an interface the object must obey, so a function receiving that pointer can rely on certain functionality without knowing exactly what the object is.

In the famous Gang of Four book, the authors start by defining an operation's signature as the operation's name, input types and return type. The set of all signatures an object responds to constitutes the object's interface, and a type is simply a name we give to a particular interface (so one object might be thought of as being two types if its interface includes all operations of both types). Reading this book made me appreciate that the heart of object-oriented programming really lies in designing the types and interfaces with which objects can reliably exchange information. The interface is the only way one object knows how to talk to another.
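
As a minimal sketch of that idea (the class names here are my own invention, not from the book), one object can be viewed as two types by implementing two interfaces:

#include <iostream>

// Two independent interfaces, i.e. two types.
class Printable
{
public:
    virtual ~Printable() = default;
    virtual void print() const = 0;
};

class Resettable
{
public:
    virtual ~Resettable() = default;
    virtual void reset() = 0;
};

// One object whose interface includes all operations of both types.
class Counter : public Printable, public Resettable
{
private:
    int _count = 0;
public:
    void increment() { ++_count; }
    void print() const override { std::cout << _count << std::endl; }
    void reset() override { _count = 0; }
};

int main()
{
    Counter c;
    c.increment();

    Printable& p = c;   // viewed as a Printable
    Resettable& r = c;  // viewed as a Resettable
    p.print();          // prints 1
    r.reset();
    p.print();          // prints 0

    return 0;
}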


Interlude: a ‘toy’ problem to play with

To illustrate the differences between static and dynamic polymorphism I want to use the Collatz conjecture as a toy problem. For a nice introduction to the conjecture I point readers towards Veritasium’s video on the topic. For our purposes the statement is quite simple. We start with any number \(x_0\) and pass it through one of two operations: if it is even, we use \(x_1 = x_0/2\), and if it is odd we use \(x_1 = 3 x_0 + 1\). Stated succinctly, we have

\[ x_{i+1} = \begin{cases} x_i / 2 & \text{if } x_i \equiv 0 \pmod{2} \\ 3 x_i + 1 & \text{if } x_i \not\equiv 0 \pmod{2} \end{cases}. \]

The conjecture states that if this process is repeated enough times, we eventually arrive at \(x_n = 1\). This has not been proven, but it has been verified for all starting values up to \(2^{68}\) as of 2020.
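
Before introducing any polymorphism, the bare iteration can be written directly. This is a minimal sketch (the function name is my own) that simply counts the steps until we hit 1:

#include <iostream>

// Apply the Collatz rule until x reaches 1, counting the steps taken.
long long int collatz_steps(long long int x)
{
    long long int steps = 0;
    while (x > 1)
    {
        if (x % 2 == 0)
            x /= 2;
        else
            x = 3 * x + 1;
        ++steps;
    }
    return steps;
}

int main()
{
    std::cout << collatz_steps(27) << std::endl;  // prints 111
    return 0;
}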

To make this problem a benchmark for polymorphism, we will imagine a software developer who wants to build the ultimate Collatz++ code, in which the two operations can be swapped out for any others. We will have a class named Collatz which calls any two functions, in a way that makes those two functions easy to change: flexible, reusable behavior.


Dynamic polymorphism

We start by defining a pure abstract base class for the callable functions, which in C++ is essentially the specification of a type, or an interface.

// Pure abstract interface: any operation that can be applied to a long long int.
class Callable
{
public:
    virtual ~Callable() = default;

    virtual void call(long long int&) const = 0;
};

We can create concrete classes following this interface as follows:

// The odd-case rule: x -> 3x + 1.
class three_x_plus_one : public Callable
{
public:
    void call(long long int& x) const override
    {
        x = 3 * x + 1;
    }
};

// The even-case rule: x -> x / 2.
class x_divide_two : public Callable
{
public:
    void call(long long int& x) const override
    {
        x /= 2;
    }
};

Notice that call is a const member function (the callable objects themselves are never modified), and that the value is modified in place through a non-const reference. The Collatz class then looks as follows:

#include <memory>

class Collatz
{
private:
    const long long int _initial_x;
    long long int _x;
    // Base class pointers: the concrete types are only resolved at runtime.
    const std::unique_ptr<const Callable> _even_callable;
    const std::unique_ptr<const Callable> _odd_callable;
public:
    Collatz(
        long long int x,
        std::unique_ptr<const Callable> even_callable,
        std::unique_ptr<const Callable> odd_callable
        ):
        _initial_x{x},
        _x{x},
        _even_callable{std::move(even_callable)},
        _odd_callable{std::move(odd_callable)}
        {}

    void call()
    {
        while (_x > 1)
        {
            if (_x % 2 == 0)
                _even_callable->call(_x);
            else
                _odd_callable->call(_x);
        }
    }

    void reset()
    {
        _x = _initial_x;
    }
};

I am using base class pointers of type Callable, and in particular storing them in a std::unique_ptr. These are then called in a loop until _x reaches the expected value of 1. We will reveal the timing later!

The key idea here is that every time the Collatz class calls one of the operations, it does not know about the underlying implementation of the function. All it knows is the signature: “I give it my long long int by reference and it returns nothing.” This is called dynamic because resolving which exact function gets called happens at runtime. Concretely, each virtual call is dispatched through a table of function pointers (the vtable), and that indirection, along with the inlining opportunities it forfeits, incurs an overhead.
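
The payoff of that late resolution is flexibility: the concrete type can be chosen while the program runs. As a sketch of what that looks like (reusing the classes above; the x_plus_one class and the make_odd_callable factory are my own additions for illustration, not part of the benchmark), imagine user input deciding the odd-case rule:

#include <iostream>
#include <memory>

// A hypothetical alternative odd-case rule, x -> x + 1,
// defined only for this sketch (it always terminates).
class x_plus_one : public Callable
{
public:
    void call(long long int& x) const override
    {
        x += 1;
    }
};

// The concrete type behind the pointer is only decided at runtime.
std::unique_ptr<const Callable> make_odd_callable(int choice)
{
    if (choice == 0)
        return std::make_unique<three_x_plus_one>();
    return std::make_unique<x_plus_one>();
}

int main()
{
    int choice = 0;
    std::cin >> choice;

    Collatz caller(
        27,
        std::make_unique<x_divide_two>(),
        make_odd_callable(choice)
        );
    caller.call();

    return 0;
}

A template cannot defer this choice, since each instantiation’s types must be fixed when the program is compiled.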


Static polymorphism

The static polymorphism version of this code is compelling whenever we already know, at the time we compile the code, exactly which functions we want to use, unlike the imagined example above where runtime user input decides on the functions. Writing the code in the following way allows the compiler to see the exact objects used and therefore optimize the procedure.

template <class even, class odd>
class T_Collatz
{
private:
    const long long int _initial_x;
    long long int _x;
    // Stored by value: the exact types are known at compile time.
    const even _even_callable;
    const odd _odd_callable;

public:
    T_Collatz(
        long long int x,
        even even_callable,
        odd odd_callable):
        _initial_x{x},
        _x{x},
        _even_callable{even_callable},
        _odd_callable{odd_callable}
        {}

    void call()
    {
        while (_x > 1)
        {
            if (_x % 2 == 0)
                _even_callable.call(_x);
            else
                _odd_callable.call(_x);
        }
    }

    void reset()
    {
        _x = _initial_x;
    }
};

The class name is now preceded by T_ for template, and the class takes two template arguments, one for each function. This kind of template still allows code reuse, but every version of the template class is worked out at compile time. It also means that if the function signature of a type you pass in does not match, the compiler will warn you about it (with varying degrees of clarity…).
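
One consequence worth spelling out (my own illustration, not part of the original benchmark): the template version never mentions Callable at all. Any type with a matching call method will do, even a plain struct that inherits from nothing:

// No inheritance needed: the template only requires a matching
// call(long long int&) const member.
struct halve
{
    void call(long long int& x) const
    {
        x /= 2;
    }
};

// Compiles fine with the template class...
T_Collatz<halve, three_x_plus_one> caller(27, halve(), three_x_plus_one());

// ...but halve could not be passed to the dynamic Collatz class,
// which requires a std::unique_ptr<const Callable>.

This structural, duck-typed flavor of polymorphism is part of what makes templates so reusable.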


The main function & timing results

We use each of the Collatz classes as follows:

#include <iostream>
#include <memory>

int main()
{
    long long int x = 1'234'567'890'123'456'789;

    Collatz dynamic_caller(
        x,
        std::make_unique<x_divide_two>(),
        std::make_unique<three_x_plus_one>()
        );

    T_Collatz<x_divide_two, three_x_plus_one> static_caller(
        x,
        x_divide_two(),
        three_x_plus_one()
        );

    auto dynamic_poly = [&dynamic_caller]() {
        for (int i = 0; i < 100'000; ++i)
        {
            dynamic_caller.call();
            dynamic_caller.reset();
        }
    };

    auto static_poly = [&static_caller]() {
        for (int i = 0; i < 100'000; ++i)
        {
            static_caller.call();
            static_caller.reset();
        }
    };

    time_function(dynamic_poly);
    time_function(dynamic_poly);
    time_function(dynamic_poly);

    std::cout << std::endl;

    time_function(static_poly);
    time_function(static_poly);
    time_function(static_poly);

    return 0;
}

Notice that to use the dynamic Collatz class we need to create unique pointers, which are implicitly converted to base class pointers (Callable) inside the class. The template class, meanwhile, takes the objects themselves.

The timing function simply takes a function with no arguments and tests its speed. We therefore wrap each class’s call in a lambda to facilitate that process. Each call iterates until \(x \rightarrow 1\), and we then repeat that 100 thousand times. The starting \(x\) is one quintillion two hundred thirty-four quadrillion five hundred sixty-seven trillion eight hundred ninety billion one hundred twenty-three million four hundred fifty-six thousand seven hundred eighty-nine (honestly, I was surprised at how fast this code was given the starting value!).

The result of this code was

Time: 247 ms
Time: 207 ms
Time: 214 ms

Time: 120 ms
Time: 116 ms
Time: 114 ms

The static code was around 2x faster than its dynamic counterpart. As explained above, this is because all objects are known at compile time, and it could make a huge difference in key applications where extreme flexibility is not necessary.
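
As a hedged aside (my own note, not something measured above): optimizers can sometimes claw back part of the virtual call cost by devirtualizing, and marking concrete classes final gives them more to work with, since no further overrides can exist:

// final tells the compiler no class can override call any further,
// which can enable devirtualization when the dynamic type is provable.
class three_x_plus_one final : public Callable
{
public:
    void call(long long int& x) const override
    {
        x = 3 * x + 1;
    }
};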

For completeness, here is the timing function

#include <chrono>
#include <iostream>

template <typename F>
void time_function(const F& f)
{
    auto start = std::chrono::high_resolution_clock::now();
    f();
    auto end = std::chrono::high_resolution_clock::now();
    // duration_cast truncates to whole milliseconds.
    auto time_taken_ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Time: " << time_taken_ms << " ms" << std::endl;
}

Admittedly, I do not know why the first timing run is always slower. Changing the order of the dynamic and static tests showed it was indeed always the first run that lagged. If anyone knows, please feel free to reach out!


Main takeaway

Flexibility and reusability of code components are important in many coding projects. C++ allows for them without compromising on safety or (to a lesser degree) speed. In my experience, especially in scientific computing, the desired flexibility need not be deferred to runtime and can instead be specified at compile time. Furthermore, for many applications runtime performance is absolutely essential, and the extra cost at compile time is small enough that the tradeoff is an obvious one.

The concepts explained in this post allow coders to both have their proverbial cake and eat it (code that is both fast and flexible). To use this approach in larger projects I would recommend diving into C++ concepts and type traits. These compile-time checks help keep compiler error messages clear and can therefore prevent a template approach from slowing development.
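
As a sketch of that suggestion (C++20; the concept name is my own), the template parameters of T_Collatz could be constrained so that a mismatched type fails with a readable error at the point of instantiation:

#include <concepts>

// Satisfied by any type with a call(long long int&) const member returning void.
template <typename T>
concept CollatzCallable = requires(const T t, long long int x)
{
    { t.call(x) } -> std::same_as<void>;
};

template <CollatzCallable even, CollatzCallable odd>
class T_Collatz
{
    // ... identical to the version above ...
};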