Performance

Container definitions
Insertion tests
Processing tests

(Testing program at perf.cpp.)

We ran tests to measure the performance of the containers of Boost.PolyCollection in two scenarios:

Insertion of elements.
Linear traversal and processing with std::for_each and boost::poly_collection::for_each (with and without type restitution).

As a comparison baseline we used containers and facilities from the standard library and Boost (details below). Tests were run on a Windows 10 machine with 8GB RAM and an Intel Core i5-8265U CPU @1.60GHz (base frequency) for the following environments:

Visual Studio 2022 x64: Visual Studio 2022 in 64-bit (x64), release mode
GCC 13.2 x64: GCC 13.2.0 for MinGW, -O3 -DNDEBUG
Clang 13.0 x64: Clang 13.0.1 for Visual Studio (clang-cl) in 64-bit (x64), release mode

Container definitions

boost::base_collection
boost::function_collection
boost::any_collection
boost::variant_collection

`boost::base_collection`

Baseline container: ptr_vector = boost::ptr_vector<base>
Polymorphic collection: base_collection = boost::base_collection<base>
Element types: T1 = derived1, T2 = derived2, T3 = derived3

struct base
{
  virtual ~base()=default;
  virtual int operator()(int)const=0;
};

struct derived1 final:base
{
  derived1(int n):n{n}{}
  virtual int operator()(int)const{return n;}

  int n;
};

struct derived2 final:base
{
  derived2(int n):n{n}{}
  virtual int operator()(int x)const{return x*n;}

  int unused,n;
};

struct derived3 final:base
{
  derived3(int n):n{n}{}
  virtual int operator()(int x)const{return x*x*n;}

  int unused,n;
};

`boost::function_collection`

Baseline container: func_vector = std::vector<std::function<int(int)>>
Polymorphic collection: function_collection = boost::function_collection<int(int)>
Element types: T1 = concrete1, T2 = concrete2, T3 = concrete3

struct concrete1
{
  concrete1(int n):n{n}{}
  int operator()(int)const{return n;}

  int n;
};

struct concrete2
{
  concrete2(int n):n{n}{}
  int operator()(int x)const{return x*n;}

  int unused,n;
};

struct concrete3
{
  concrete3(int n):n{n}{}
  int operator()(int x)const{return x*x*n;}

  int unused,n;
};

`boost::any_collection`

Baseline container: any_vector = std::vector<boost::type_erasure::any<concept_>>
Polymorphic collection: any_collection = boost::any_collection<concept_>
Element types: T1 = int, T2 = double, T3 = char

using concept_=boost::mpl::vector<
  boost::type_erasure::copy_constructible<>,
  boost::type_erasure::relaxed,
  boost::type_erasure::typeid_<>,
  boost::type_erasure::incrementable<>
>;

`boost::variant_collection`

Baseline container: variant_vector = std::vector<std::variant<int,double,char>>
Polymorphic collection: variant_collection = boost::variant_collection<boost::mp11::mp_list<int,double,char>>
Element types: T1 = int, T2 = double, T3 = char

Insertion tests

Results for boost::base_collection
Results for boost::function_collection
Results for boost::any_collection
Results for boost::variant_collection

Tests measure the time taken to insert n elements (n between 10² and 10⁷) from a source of values with types randomly selected from T1, T2 and T3 with equal probability.

No reserve operation is done before insertion. The figures show resulting times in nanoseconds/element. The horizontal axis is logarithmic.

Results for `boost::base_collection`

Visual Studio 2022 x64
GCC 13.2 x64
Clang 13.0 x64

Visual Studio 2022 x64

Insertion, Visual Studio 2022 x64

GCC 13.2 x64

Insertion, GCC 13.2 x64

Clang 13.0 x64

Insertion, Clang 13.0 x64

Results for `boost::function_collection`

Visual Studio 2022 x64
GCC 13.2 x64
Clang 13.0 x64

Visual Studio 2022 x64

Insertion, Visual Studio 2022 x64

GCC 13.2 x64

Insertion, GCC 13.2 x64

Clang 13.0 x64

Insertion, Clang 13.0 x64

Results for `boost::any_collection`

Visual Studio 2022 x64
GCC 13.2 x64
Clang 13.0 x64

Visual Studio 2022 x64

Insertion, Visual Studio 2022 x64

GCC 13.2 x64

Insertion, GCC 13.2 x64

Clang 13.0 x64

Insertion, Clang 13.0 x64

Results for `boost::variant_collection`

Visual Studio 2022 x64
GCC 13.2 x64
Clang 13.0 x64

Visual Studio 2022 x64

Insertion, Visual Studio 2022 x64

GCC 13.2 x64

Insertion, GCC 13.2 x64

Clang 13.0 x64

Insertion, Clang 13.0 x64

Processing tests

Results for boost::base_collection
Results for boost::function_collection
Results for boost::any_collection
Results for boost::variant_collection

Tests measure the time taken to traverse a container of size n (n between 10² and 10⁷) and execute an operation on each of its elements. The operation for boost::base_collection and boost::function_collection (and the associated baseline containers) is defined as

struct for_each_callable
{
  for_each_callable():res{0}{}

  template<typename T>
  void operator()(T& x){
    res+=x(2);
  }

  int res;
};

whereas for boost::any_collection we use

struct for_each_incrementable
{
  for_each_incrementable():res{0}{}

  template<typename T>
  void operator()(T& x){
    ++x;
    ++res;
  }

  int res;
};

and for boost::variant_collection we have

struct for_each_alternative
{
  for_each_alternative():res{0}{}

  template<template<typename...> class V,typename... Ts>
  void operator()(V<Ts...>& x){
    visit(*this,x);
  }

  template<typename T>
  void operator()(T& x){
    ++x;
    ++res;
  }

  int res;
};

The baseline container is tested with three different setups:

Directly as initialized by the process described for the insertion tests. As elements are ordered according to their construction time, certain degree of memory contiguity is expected.
With an extra post-insertion stage by which elements are sorted according to their type. This increases branch prediction efficiency at the expense of having worse cache friendliness.
With an extra post-insertion stage that randomly shuffles all the elements in the container. This is the worst possible scenario both in terms of caching and branch prediction. We don't exercise this setup for variant_vector as it is at equivalent to the direct case (std::variant does not allocate dynamic memory).

As for the polymorphic collection, three variations are measured:

With std::for_each (the same as the baseline container).
Using boost::poly_collection::for_each.
Using boost::poly_collection::for_each with type restitution of T1, T2 and T3.