...one of the most highly
regarded and expertly designed C++ library projects in the
world.
— Herb Sutter and Andrei
Alexandrescu, C++
Coding Standards
The implementations are implementations of well-known data structures. The queue is based on Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms by Michael Scott and Maged Michael, the stack is based on Systems programming: coping with parallelism by R. K. Treiber and the spsc_queue is considered as 'folklore' and is implemented in several open-source projects including the linux kernel. All data structures are discussed in detail in "The Art of Multiprocessor Programming" by Herlihy & Shavit.
The lock-free boost::lockfree::queue
and boost::lockfree::stack
classes are node-based data structures, based on a linked list. Memory management
of lock-free data structures is a non-trivial problem, because we need to
avoid that one thread frees an internal node, while another thread still
uses it. boost.lockfree
uses a simple approach not returning
any memory to the operating system. Instead they maintain a free-list
in order to reuse them later. This is done for two reasons: first, depending
on the implementation of the memory allocator freeing the memory may block
(so the implementation would not be lock-free anymore), and second, most
memory reclamation algorithms are patented.
The ABA problem is a common problem when implementing lock-free data structures.
The problem occurs when updating an atomic variable using a compare_exchange
operation: if the value A was read, thread 1 changes it to say C and tries
to update the variable, it uses compare_exchange
to write
C, only if the current value is A. This might be a problem if in the meanwhile
thread 2 changes the value from A to B and back to A, because thread 1 does
not observe the change of the state. The common way to avoid the ABA problem
is to associate a version counter with the value and change both atomically.
boost.lockfree
uses a tagged_ptr
helper
class which associates a pointer with an integer tag. This usually requires
a double-width compare_exchange
, which is not available
on all platforms. IA32 did not provide the cmpxchg8b
opcode
before the pentium processor and it is also lacking on many RISC architectures
like PPC. Early X86-64 processors also did not provide a cmpxchg16b
instruction. On 64bit platforms one can work around this issue, because often
not the full 64bit address space is used. On X86_64 for example, only 48bit
are used for the address, so we can use the remaining 16bit for the ABA prevention
tag. For details please consult the implementation of the boost::lockfree::detail::tagged_ptr
class.
For lock-free operations on 32bit platforms without double-width compare_exchange
,
we support a third approach: by using a fixed-sized array to store the internal
nodes we can avoid the use of 32bit pointers, but instead 16bit indices into
the array are sufficient. However this is only possible for fixed-sized data
structures, that have an upper bound of internal nodes.
The boost.lockfree
data structures have basic support
for Boost.Interprocess.
The only problem is the blocking emulation of lock-free atomics, which in
the current implementation is not guaranteed to be interprocess-safe.