...one of the most highly regarded and expertly designed C++ library projects in the world.
— Herb Sutter and Andrei Alexandrescu, C++ Coding Standards
The STL and most other containers value-initialize new elements in common operations like vector::resize(size_type n) or explicit vector::vector(size_type n).

In some performance-sensitive environments, where vectors are used as a replacement for variable-size buffers for file or network operations, value initialization is a non-negligible cost, because the new elements are going to be overwritten by an external source shortly after they are added to the container.

Boost.Container offers two new members for vector, static_vector and stable_vector: explicit container::container(size_type n, default_init_t) and container::resize(size_type n, default_init_t), where new elements are constructed using default initialization.
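For example, a buffer that is filled by a device read immediately after resizing does not need value initialization. The following is a minimal sketch; read_from_device is a hypothetical stand-in for a real file or socket read:

#include <boost/container/vector.hpp>
#include <cstddef>
#include <cstring>

//Hypothetical stand-in for a file or socket read that overwrites the buffer
std::size_t read_from_device(unsigned char *buffer, std::size_t size)
{  std::memset(buffer, 0xFF, size); return size;  }

int main()
{
   using namespace boost::container;
   vector<unsigned char> buffer;
   //New elements are default-initialized (left uninitialized for trivial
   //types), so the container does not pay for zeroing 4096 bytes...
   buffer.resize(4096, default_init);
   //...that are overwritten immediately afterwards
   read_from_device(buffer.data(), buffer.size());
   return 0;
}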
When filling associative containers, big performance gains can be achieved if the input range to be inserted is guaranteed by the user to be ordered according to the predicate. This can happen when inserting values from a set into a multiset or between different associative container families ([multi]set/map vs. flat_[multi]set/map).

Boost.Container offers overloads for constructors and insertions taking an ordered_unique_range_t or an ordered_range_t tag parameter as the first argument. When an ordered_unique_range_t overload is used, the user notifies the container that the input range is ordered according to the container predicate and has no duplicates. When an ordered_range_t overload is used, the user notifies the container that the input range is ordered according to the container predicate but might contain duplicates. With this information, the container can avoid multiple predicate calls and improve insertion times.
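For example, a flat_set can be constructed and extended from input that the caller already knows to be sorted and duplicate-free. A minimal sketch:

#include <boost/container/flat_set.hpp>
#include <cassert>

int main()
{
   using namespace boost::container;
   //Input data already sorted according to std::less<int> and without duplicates
   const int sorted_unique[] = { 1, 2, 3, 4, 5 };

   //The tag tells the container it can skip sorting and duplicate checks
   flat_set<int> fset(ordered_unique_range, sorted_unique, sorted_unique + 5);
   assert(fset.size() == 5);

   //Insertion overloads taking the tag are also available
   const int more_sorted_unique[] = { 6, 7, 8 };
   fset.insert(ordered_unique_range, more_sorted_unique, more_sorted_unique + 3);
   assert(fset.size() == 8);
   return 0;
}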
set, multiset, map and multimap associative containers are implemented as binary search trees, which offer the complexity and stability guarantees required by the C++ standard for associative containers.

Boost.Container offers the possibility to configure at compile time some parameters of the binary search tree implementation. This configuration is passed as the last template parameter and is defined using the utility class tree_assoc_options.
The following parameters can be configured:

- The tree type (tree_type). By default these containers use a red-black tree, but the user can choose other tree types, such as the AVL and splay trees shown in the example below.
- The node size optimization (optimize_size). By default this option is activated and it is only meaningful for red-black and AVL trees (in other cases the option is ignored). It tries to store the rebalancing metadata inside the "parent" pointer of the node if the pointer type has enough alignment. Usually, due to alignment issues, the metadata takes the size of a pointer, yielding a four-pointer size overhead per node, whereas activating this option usually leads to a three-pointer size overhead. Although some mask operations must be performed to extract data from this special "parent" pointer, on several systems this option also improves performance thanks to the improved cache usage produced by the node size reduction.

See the following example showing how tree_assoc_options can be used to customize these containers:
#include <boost/container/set.hpp>
#include <cassert>

int main ()
{
   using namespace boost::container;
   //First define several options
   //

   //This option specifies an AVL tree based associative container
   typedef tree_assoc_options< tree_type<avl_tree> >::type AVLTree;

   //This option specifies an AVL tree based associative container
   //disabling node size optimization.
   typedef tree_assoc_options< tree_type<avl_tree>
                             , optimize_size<false> >::type AVLTreeNoSizeOpt;

   //This option specifies a Splay tree based associative container
   typedef tree_assoc_options< tree_type<splay_tree> >::type SplayTree;

   //Now define new tree-based associative containers
   //

   //AVLTree based set container
   typedef set<int, std::less<int>, std::allocator<int>, AVLTree> AvlSet;

   //AVLTree based set container without size optimization
   typedef set<int, std::less<int>, std::allocator<int>, AVLTreeNoSizeOpt> AvlSetNoSizeOpt;

   //Splay tree based multiset container
   typedef multiset<int, std::less<int>, std::allocator<int>, SplayTree> SplayMultiset;

   //Use them
   //
   AvlSet avl_set;
   avl_set.insert(0);
   assert(avl_set.find(0) != avl_set.end());

   AvlSetNoSizeOpt avl_set_no_szopt;
   avl_set_no_szopt.insert(1);
   avl_set_no_szopt.insert(1);
   assert(avl_set_no_szopt.count(1) == 1);

   SplayMultiset splay_mset;
   splay_mset.insert(2);
   splay_mset.insert(2);
   assert(splay_mset.count(2) == 2);
   return 0;
}
In the first C++ standard, list::size() was not required to be constant-time, and that caused some controversy in the C++ community. Quoting Howard Hinnant's On List Size paper:

There is a considerable debate on whether std::list<T>::size() should be O(1) or O(N). The usual argument notes that it is a tradeoff with:
splice(iterator position, list& x, iterator first, iterator last);
If size() is O(1) and this != &x, then this method must perform a linear operation so that it can adjust the size member in each list
C++11 definitively required size() to be O(1), so range splice became O(N). However, Howard Hinnant's paper proposed a new splice overload so that even O(1) list::size() implementations could achieve O(1) range splice when the range size was known to the caller:
void splice(iterator position, list& x, iterator first, iterator last, size_type n);
Effects: Inserts elements in the range [first, last) before position and removes the elements from x.
Requires: [first, last) is a valid range in x. The result is undefined if position is an iterator in the range [first, last). Invalidates only the iterators and references to the spliced elements. n == distance(first, last).
Throws: Nothing.
Complexity: Constant time.
This new splice signature allows the client to pass the distance of the input range in. This information is often available at the call site. If it is passed in, then the operation is constant time, even with an O(1) size.
Boost.Container implements this overload for list and a modified version of it for slist (as slist::size() is also O(1)).
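A minimal sketch of this extended overload with boost::container::list, where the caller already knows the length of the spliced range:

#include <boost/container/list.hpp>
#include <cassert>
#include <iterator>

int main()
{
   using namespace boost::container;
   list<int> destination, source;
   for(int i = 0; i != 10; ++i)
      source.push_back(i);

   //The caller already knows that [first, last) holds 5 elements, so the
   //splice is constant time even though list::size() is also constant time
   list<int>::iterator first = source.begin();
   list<int>::iterator last  = first;
   std::advance(last, 5);
   destination.splice(destination.begin(), source, first, last, 5);

   assert(destination.size() == 5 && source.size() == 5);
   return 0;
}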
Many C++ programmers have wondered where good old realloc fits in C++, and that's a good question. Could we improve vector performance by using memory expansion mechanisms to avoid too many copies? And vector is not the only container that could benefit from an improved allocator interface: inserting multiple elements in list could take advantage of a burst allocation mechanism that amortizes costs (mutex locks, free memory searches...) which can't be amortized when using single-node allocation strategies.
These improvements require extending the STL allocator interface and making use of a new general-purpose allocator, since new and delete don't offer expansion and burst capabilities.

Boost.Container provides these capabilities through the allocator, adaptive_pool and node_allocator classes:
- allocator: this extended allocator offers expansion, shrink-in-place and burst allocation capabilities implemented as a thin wrapper around the modified DLMalloc. It can be used with all containers and should be the default choice when the programmer wants to use extended allocator capabilities.

- node_allocator: a Simple Segregated Storage allocator, similar to Boost.Pool, that takes advantage of the modified DLMalloc burst interface. It does not return memory to the DLMalloc allocator (and thus, to the system) unless explicitly requested. It has a very small memory overhead, so it's suitable for node containers (list, slist, set...) that allocate very small value_types, and it offers improved node allocation times for single-node allocations with respect to allocator.

- adaptive_pool: a low-overhead node allocator that can return memory to the system. The overhead can be very low (< 5% for small nodes) and it's nearly as fast as node_allocator. It's also suitable for node containers.
Use them by simply specifying the new allocator in the corresponding template argument of your favourite container:
#include <boost/container/vector.hpp>
#include <boost/container/flat_set.hpp>
#include <boost/container/list.hpp>
#include <boost/container/set.hpp>

//"allocator" is a general purpose allocator that can reallocate
//memory, something useful for vector and flat associative containers
#include <boost/container/allocator.hpp>

//"adaptive_pool" is a node allocator, specially suited for
//node-based containers
#include <boost/container/adaptive_pool.hpp>

int main ()
{
   using namespace boost::container;

   //A vector that can reallocate memory to implement faster insertions
   vector<int, allocator<int> > extended_alloc_vector;

   //A flat set that can reallocate memory to implement faster insertions
   flat_set<int, std::less<int>, allocator<int> > extended_alloc_flat_set;

   //A list that can manage nodes to implement faster
   //range insertions and deletions
   list<int, adaptive_pool<int> > extended_alloc_list;

   //A set that can recycle nodes to implement faster
   //range insertions and deletions
   set<int, std::less<int>, adaptive_pool<int> > extended_alloc_set;

   //Now use them as always
   extended_alloc_vector.push_back(0);
   extended_alloc_flat_set.insert(0);
   extended_alloc_list.push_back(0);
   extended_alloc_set.insert(0);

   //...
   return 0;
}
The document C++ Extensions for Library Fundamentals (Final draft: N4480) includes classes that provide allocator type erasure and runtime polymorphism. As Pablo Halpern, the author of the proposal, explains in the paper (N3916 Polymorphic Memory Resources (r2)):
“A significant impediment to effective memory management in C++ has been the inability to use allocators in non-generic contexts. In large software systems, most of the application program consists of non-generic procedural or object-oriented code that is compiled once and linked many times.”
“Allocators in C++, however, have historically relied solely on compile-time polymorphism, and therefore have not been suitable for use in vocabulary types, which are passed through interfaces between separately-compiled modules, because the allocator type necessarily affects the type of the object that uses it. This proposal builds upon the improvements made to allocators in C++11 and describes a set of facilities for runtime polymorphic memory resources that interoperate with the existing compile-time polymorphic allocators.”
Boost.Container implements nearly all classes of the proposal under the namespace boost::container::pmr. There are two groups:

Header-only utilities (these don't require the separately compiled library):
- polymorphic_allocator.

Utilities that require the separately compiled library:
- monotonic_buffer_resource.
- unsynchronized_pool_resource.
- synchronized_pool_resource.
- Global resource functions: get_default_resource/set_default_resource/new_delete_resource/null_memory_resource.
- Aliases for containers using the polymorphic allocator (pmr::vector, etc.)
Boost.Container's polymorphic resource library is usable from C++03 containers, and offers some alternative utilities if the required C++11 features of the Library Fundamentals specification are not available.
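As a minimal sketch of the global resource functions listed above (linking the separately compiled Boost.Container library is assumed):

#include <boost/container/pmr/global_resource.hpp>
#include <boost/container/pmr/memory_resource.hpp>
#include <cassert>

int main()
{
   using namespace boost::container;
   //Unless changed, the default resource allocates with new/delete
   pmr::memory_resource *previous = pmr::get_default_resource();
   assert(previous == pmr::new_delete_resource());

   //null_memory_resource() throws bad_alloc on any allocation, which is
   //useful to verify that a component never touches the default resource
   pmr::set_default_resource(pmr::null_memory_resource());
   assert(pmr::get_default_resource() == pmr::null_memory_resource());

   //Restore the previous default resource
   pmr::set_default_resource(previous);
   return 0;
}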
Let's review the usage example given in N3916 and see how it can be implemented using Boost.Container: Suppose we are processing a series of shopping lists, where a shopping list is a container of strings, and storing them in a collection (a list) of shopping lists. Each shopping list being processed uses a bounded amount of memory that is needed for a short period of time, while the collection of shopping lists uses an unbounded amount of memory and will exist for a longer period of time. For efficiency, we can use a more time-efficient memory allocator based on a finite buffer for the temporary shopping lists.
Let's see how ShoppingList can be defined to support a polymorphic memory resource that can allocate memory from different underlying mechanisms. The most important details are:

- ShoppingList declares an allocator_type typedef. This allocator_type will be of type memory_resource*, which is a base class for polymorphic resources.
- ShoppingList has constructors taking memory_resource* as the last argument (the suffix option), or
- ShoppingList has constructors taking allocator_arg_t as the first argument and memory_resource* as the second argument (the prefix option).
Note: in C++03 compilers, the programmer is required to specialize constructible_with_allocator_suffix or constructible_with_allocator_prefix as true, because in C++03 there is no way to automatically detect the chosen option at compile time. If no specialization is done, Boost.Container assumes the suffix option.
//ShoppingList.hpp

#include <boost/container/pmr/vector.hpp>
#include <boost/container/pmr/string.hpp>

class ShoppingList
{
   // A vector of strings using polymorphic allocators. Every element
   // of the vector will use the same allocator as the vector itself.
   boost::container::pmr::vector_of
      <boost::container::pmr::string>::type m_strvec;
   //Alternatively in compilers that support template aliases:
   //    boost::container::pmr::vector<boost::container::pmr::string> m_strvec;

   public:

   // This makes uses_allocator<ShoppingList, memory_resource*>::value true
   typedef boost::container::pmr::memory_resource* allocator_type;

   // If the allocator is not specified, "m_strvec" uses pmr::get_default_resource().
   explicit ShoppingList(allocator_type alloc = 0)
      : m_strvec(alloc) {}

   // Copy constructor. As allocator is not specified,
   // "m_strvec" uses pmr::get_default_resource().
   ShoppingList(const ShoppingList& other)
      : m_strvec(other.m_strvec) {}

   // Copy construct using the given memory_resource.
   ShoppingList(const ShoppingList& other, allocator_type a)
      : m_strvec(other.m_strvec, a) {}

   allocator_type get_allocator() const
   { return m_strvec.get_allocator().resource(); }

   void add_item(const char *item)
   { m_strvec.emplace_back(item); }

   //...
};
However, this time-efficient allocator is not appropriate for the longer lived collection of shopping lists. This example shows how those temporary shopping lists, using a time-efficient allocator, can be used to populate the long lived collection of shopping lists, using a general purpose allocator, something that would be annoyingly difficult without the polymorphic allocators.
In Boost.Container, for the time-efficient allocation we can use monotonic_buffer_resource, providing an external buffer that will be used until it's exhausted. In the default configuration, when the buffer is exhausted, the default memory resource is used instead.
#include "ShoppingList.hpp" #include <cassert> #include <boost/container/pmr/list.hpp> #include <boost/container/pmr/monotonic_buffer_resource.hpp> void processShoppingList(const ShoppingList&) { /**/ } int main() { using namespace boost::container; //All memory needed by folder and its contained objects will //be allocated from the default memory resource (usually new/delete) pmr::list_of<ShoppingList>::type folder; // Default allocator resource //Alternatively in compilers that support template aliases: // boost::container::pmr::list<ShoppingList> folder; { char buffer[1024]; pmr::monotonic_buffer_resource buf_rsrc(&buffer, 1024); //All memory needed by temporaryShoppingList will be allocated //from the local buffer (speeds up "processShoppingList") ShoppingList temporaryShoppingList(&buf_rsrc); assert(&buf_rsrc == temporaryShoppingList.get_allocator()); //list nodes, and strings "salt" and "pepper" will be allocated //in the stack thanks to "monotonic_buffer_resource". temporaryShoppingList.add_item("salt"); temporaryShoppingList.add_item("pepper"); //... //All modifications and additions to "temporaryShoppingList" //will use memory from "buffer" until it's exhausted. processShoppingList(temporaryShoppingList); //Processing done, now insert it in "folder", //which uses the default memory resource folder.push_back(temporaryShoppingList); assert(pmr::get_default_resource() == folder.back().get_allocator()); //temporaryShoppingList, buf_rsrc, and buffer go out of scope } return 0; }
Notice that the shopping lists within folder use the default allocator resource, whereas the shopping list temporaryShoppingList uses the short-lived but very fast buf_rsrc. Despite using different allocators, you can insert temporaryShoppingList into folder because they have the same ShoppingList type. Also, while ShoppingList uses memory_resource directly, pmr::list, pmr::vector and pmr::string all use polymorphic_allocator.

The resource passed to the ShoppingList constructor is propagated to the vector and each string within that ShoppingList. Similarly, the resource used to construct folder is propagated to the constructors of the ShoppingLists that are inserted into the list (and to the strings within those ShoppingLists).
The polymorphic_allocator template is designed to be almost interchangeable with a pointer to memory_resource, thus producing a bridge between the template-policy style of allocator and the polymorphic-base-class style of allocator.
This example shows how easy it is to use Boost.Container to write type-erased, allocator-capable classes, even in C++03 compilers.
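To illustrate that bridge, here is a minimal sketch in which a single pmr container type is used with two different memory resources; only the allocation mechanism differs, not the container type:

#include <boost/container/pmr/vector.hpp>
#include <boost/container/pmr/monotonic_buffer_resource.hpp>
#include <cassert>

int main()
{
   using namespace boost::container;
   char buffer[256];
   pmr::monotonic_buffer_resource pool(buffer, sizeof(buffer));

   //Same type, different underlying memory resources
   pmr::vector_of<int>::type from_pool(&pool);   //uses the local buffer
   pmr::vector_of<int>::type from_default;       //uses pmr::get_default_resource()

   from_pool.push_back(1);
   from_default.push_back(2);

   //Because both vectors have the same type, they can be compared or passed
   //to the same non-generic interface despite allocating differently
   assert(from_pool != from_default);
   return 0;
}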