All instances v of type variant<T1,T2,...,TN> guarantee that v has constructed content of one of the types Ti, even if an operation on v has previously failed. This implies that variant may be viewed precisely as a union of exactly its bounded types. This "never-empty" property insulates the user from the possibility of undefined variant content and the significant additional complexity-of-use attendant with such a possibility.
While the never-empty guarantee might at first seem "obvious," it is in fact far from straightforward to implement in general (i.e., without unreasonably restrictive additional requirements on bounded types).
The central difficulty emerges in the details of variant assignment. Given two instances v1 and v2 of some concrete variant type, there are two distinct, fundamental cases we must consider for the assignment v1 = v2.
First consider the case that v1 and v2 each contains a value of the same type. Call this type T. In this situation, assignment is perfectly straightforward: use T::operator=.
However, we must also consider the case that v1 and v2 contain values of distinct types. Call these types T and U. At this point, since variant manages its content on the stack, the left-hand side of the assignment (i.e., v1) must destroy its content so as to permit in-place copy-construction of the content of the right-hand side (i.e., v2). In the end, whereas v1 began with content of type T, it ends with content of type U, namely a copy of the content of v2.
The crux of the problem, then, is this: in the event that copy-construction of the content of v2 fails, how can v1 maintain its "never-empty" guarantee?
By the time copy-construction from v2 is attempted, v1 has already destroyed its content!
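The failure mode can be reproduced with a deliberately naive toy container. This is only a sketch under stated assumptions: `NaiveVariant`, `ThrowsOnCopy`, `throwing_tag`, and the extra `Empty` state are hypothetical names invented for illustration and are not part of the library, and the toy models only the destroy-then-construct path (it never uses T::operator=).

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical bounded type whose copy constructor always throws.
struct ThrowsOnCopy {
    ThrowsOnCopy() {}
    ThrowsOnCopy(const ThrowsOnCopy&) { throw std::runtime_error("copy failed"); }
};

// Toy container of int or ThrowsOnCopy. NOT boost::variant.
class NaiveVariant {
public:
    enum class Which { Empty, Int, Throwing };
    struct throwing_tag {};  // selects in-place default construction

    explicit NaiveVariant(int v) : which_(Which::Int) { new (&s_.i) int(v); }
    explicit NaiveVariant(throwing_tag) : which_(Which::Throwing) {
        new (&s_.t) ThrowsOnCopy();
    }
    NaiveVariant(const NaiveVariant&) = delete;
    ~NaiveVariant() { destroy(); }

    NaiveVariant& operator=(const NaiveVariant& rhs) {
        if (this == &rhs) return *this;
        destroy();  // the old content is gone from here on
        if (rhs.which_ == Which::Int) {
            new (&s_.i) int(rhs.s_.i);
        } else if (rhs.which_ == Which::Throwing) {
            new (&s_.t) ThrowsOnCopy(rhs.s_.t);  // throws: nothing is left to fall back on
        }
        which_ = rhs.which_;
        return *this;
    }

    Which which() const { return which_; }

private:
    union Storage {
        int i;
        ThrowsOnCopy t;
        Storage() {}
        ~Storage() {}
    };

    void destroy() {
        if (which_ == Which::Throwing) s_.t.~ThrowsOnCopy();
        which_ = Which::Empty;
    }

    Storage s_;
    Which which_;
};

// Returns true when the failed assignment left the left-hand side empty.
bool naive_assignment_can_leave_empty() {
    NaiveVariant v1(42);
    NaiveVariant v2{NaiveVariant::throwing_tag{}};
    assert(v1.which() == NaiveVariant::Which::Int);
    try {
        v1 = v2;
    } catch (const std::runtime_error&) {
        // swallowed so the resulting state can be inspected
    }
    return v1.which() == NaiveVariant::Which::Empty;
}
```

The toy needs an explicit `Empty` state precisely because it cannot keep the guarantee; the schemes discussed in the rest of this section exist to make such a state unnecessary.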
Upon learning of this dilemma, clever individuals may propose the following scheme hoping to solve the problem:
[...] Copy the memory (e.g., using memcpy) of the storage of the left-hand side to the backup storage. [...]
While complicated, it appears such a scheme could provide the desired safety in a relatively efficient manner. In fact, several early iterations of the library implemented this very approach.
Unfortunately, as Dave Abrahams first noted, the scheme results in undefined behavior:
"That's a lot of code to read through, but if it's doing what I think it's doing, it's undefined behavior.
"Is the trick to move the bits for an existing object into a buffer so we can tentatively construct a new object in that memory, and later move the old bits back temporarily to destroy the old object?
"The standard does not give license to do that: only one object may have a given address at a time. See 3.8, and particularly paragraph 4."
Additionally, as close examination quickly reveals, the scheme has the potential to create irreconcilable race-conditions in concurrent environments.
Ultimately, even if the above scheme could be made to work on certain platforms with particular compilers, it is still necessary to find a portable solution.
Upon learning of the infeasibility of the above scheme, Anthony Williams proposed in [Wil02] a scheme that served as the basis for a portable solution in some pre-release implementations of variant.
The essential idea to this scheme, which shall be referred to as the "double storage" scheme, is to provide enough space within a variant to hold two separate values of any of the bounded types.
With the secondary storage, a copy of the right-hand side can be attempted without first destroying the content of the left-hand side; accordingly, the content of the left-hand side remains available in the event of an exception.
Thus, with this scheme, the variant implementation needs only to keep track of which storage contains the content -- and dispatch any visitation requests, queries, etc. accordingly.
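A minimal sketch of the double storage idea follows, assuming a toy container with two slots and an index recording which slot is live. `DoubleStorageVariant` and `MaybeThrows` are hypothetical names; this is an illustration of the technique, not the pre-release implementation.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical bounded type whose copy constructor throws on request.
struct MaybeThrows {
    int value;
    bool throw_on_copy;
    MaybeThrows(int v, bool t) : value(v), throw_on_copy(t) {}
    MaybeThrows(const MaybeThrows& o) : value(o.value), throw_on_copy(o.throw_on_copy) {
        if (o.throw_on_copy) throw std::runtime_error("copy failed");
    }
};

// Toy container of int or MaybeThrows with two slots. NOT boost::variant.
class DoubleStorageVariant {
public:
    enum class Which { Int, Maybe };

    explicit DoubleStorageVariant(int v) : active_(0), which_(Which::Int) {
        new (&slot_[0].i) int(v);
    }
    DoubleStorageVariant(int v, bool throw_on_copy) : active_(0), which_(Which::Maybe) {
        new (&slot_[0].m) MaybeThrows(v, throw_on_copy);
    }
    DoubleStorageVariant(const DoubleStorageVariant&) = delete;
    ~DoubleStorageVariant() { destroy(); }

    DoubleStorageVariant& operator=(const DoubleStorageVariant& rhs) {
        if (this == &rhs) return *this;
        const int spare = 1 - active_;
        // Copy into the spare slot first. If this throws, the live slot is untouched.
        if (rhs.which_ == Which::Int) {
            new (&slot_[spare].i) int(rhs.slot_[rhs.active_].i);
        } else {
            new (&slot_[spare].m) MaybeThrows(rhs.slot_[rhs.active_].m);
        }
        destroy();        // only now give up the old content
        active_ = spare;  // track which storage holds the content
        which_ = rhs.which_;
        return *this;
    }

    Which which() const { return which_; }
    int value() const {
        return which_ == Which::Int ? slot_[active_].i : slot_[active_].m.value;
    }

private:
    union Slot {
        int i;
        MaybeThrows m;
        Slot() {}
        ~Slot() {}
    };

    void destroy() {
        if (which_ == Which::Maybe) slot_[active_].m.~MaybeThrows();
    }

    Slot slot_[2];  // the space overhead: room for two values at all times
    int active_;
    Which which_;
};

// Returns true when a failed assignment leaves the old content intact.
bool double_storage_keeps_old_content() {
    DoubleStorageVariant v1(42);
    DoubleStorageVariant bad(7, /*throw_on_copy=*/true);
    assert(v1.value() == 42);
    try {
        v1 = bad;
    } catch (const std::runtime_error&) {
        // swallowed so the resulting state can be inspected
    }
    return v1.which() == DoubleStorageVariant::Which::Int && v1.value() == 42;
}
```

Note that `slot_[2]` is paid for by every instance, whether or not it is ever assigned to.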
The most obvious flaw to this approach is the space overhead incurred. Though some optimizations could be applied in special cases to eliminate the need for double storage -- for certain bounded types or in some cases entirely (see the section called “Enabling Optimizations” for more details) -- many users on the Boost mailing list strongly objected to the use of double storage. In particular, it was noted that the overhead of double storage would be at play at all times -- even if assignment to variant never occurred. For this reason and others, a new approach was developed.
Despite the many objections to the double storage solution, it was realized that no replacement would be without drawbacks. Thus, a compromise was desired.
To this end, Dave Abrahams suggested including the following in the behavior specification for variant assignment: "variant assignment from one type to another may incur dynamic allocation." That is, while variant would continue to store its content in situ after construction and after assignment involving identical contained types, variant would store its content on the heap after assignment involving distinct contained types.
The algorithm for assignment would proceed as follows:
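1) Copy-construct the content of the right-hand side on the heap; call the pointer to this data p.
2) Destroy the content of the left-hand side.
3) Copy p to the left-hand side storage.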
Since all operations on pointers are nothrow, this scheme would allow variant to meet its never-empty guarantee.
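The pointer-based assignment can be sketched as follows: copy the right-hand side to the heap first, destroy the old content only after that copy has succeeded, then store the pointer. `HeapOnAssignVariant` and `MaybeThrows` are hypothetical names, and for brevity the toy moves content to the heap on every assignment rather than only on assignments between distinct types.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical bounded type whose copy constructor throws on request.
struct MaybeThrows {
    int value;
    bool throw_on_copy;
    MaybeThrows(int v, bool t) : value(v), throw_on_copy(t) {}
    MaybeThrows(const MaybeThrows& o) : value(o.value), throw_on_copy(o.throw_on_copy) {
        if (o.throw_on_copy) throw std::runtime_error("copy failed");
    }
};

// Toy container of int or MaybeThrows. NOT boost::variant.
class HeapOnAssignVariant {
public:
    enum class Which { Int, Maybe };

    explicit HeapOnAssignVariant(int v) : which_(Which::Int), on_heap_(false) {
        new (&s_.i) int(v);
    }
    HeapOnAssignVariant(int v, bool t) : which_(Which::Maybe), on_heap_(false) {
        new (&s_.m) MaybeThrows(v, t);
    }
    HeapOnAssignVariant(const HeapOnAssignVariant&) = delete;
    ~HeapOnAssignVariant() { destroy(); }

    HeapOnAssignVariant& operator=(const HeapOnAssignVariant& rhs) {
        if (this == &rhs) return *this;
        // Step 1: copy the right-hand side content to the heap.
        // If this throws, the left-hand side has not been touched.
        void* p = (rhs.which_ == Which::Int)
                      ? static_cast<void*>(new int(rhs.as_int()))
                      : static_cast<void*>(new MaybeThrows(rhs.as_maybe()));
        destroy();    // Step 2: destroy the old content.
        s_.heap = p;  // Step 3: store the pointer (cannot throw).
        on_heap_ = true;
        which_ = rhs.which_;
        return *this;
    }

    Which which() const { return which_; }
    bool on_heap() const { return on_heap_; }
    int value() const { return which_ == Which::Int ? as_int() : as_maybe().value; }

private:
    union Storage {
        int i;
        MaybeThrows m;
        void* heap;
        Storage() {}
        ~Storage() {}
    };

    const int& as_int() const {
        return on_heap_ ? *static_cast<const int*>(s_.heap) : s_.i;
    }
    const MaybeThrows& as_maybe() const {
        return on_heap_ ? *static_cast<const MaybeThrows*>(s_.heap) : s_.m;
    }

    void destroy() {
        if (on_heap_) {
            if (which_ == Which::Int) delete static_cast<int*>(s_.heap);
            else delete static_cast<MaybeThrows*>(s_.heap);
        } else if (which_ == Which::Maybe) {
            s_.m.~MaybeThrows();
        }
    }

    Storage s_;
    Which which_;
    bool on_heap_;
};

// Returns true when a failed assignment leaves the previous content intact.
bool heap_scheme_keeps_old_content() {
    HeapOnAssignVariant v1(42);
    HeapOnAssignVariant ok(7, false), bad(9, true);
    v1 = ok;  // succeeds; the content now lives on the heap
    assert(v1.on_heap() && v1.value() == 7);
    try {
        v1 = bad;
    } catch (const std::runtime_error&) {
        // swallowed so the resulting state can be inspected
    }
    return v1.which() == HeapOnAssignVariant::Which::Maybe && v1.value() == 7;
}
```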
The most obvious concern with this approach is that while it certainly eliminates the space overhead of double storage, it introduces the overhead of dynamic allocation to variant assignment -- not just in terms of the initial allocation but also as a result of the continued storage of the content on the heap. While the former problem is unavoidable, the latter problem may be avoided with the following "temporary heap backup" technique:
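1) Copy-construct the content of the left-hand side on the heap; call the pointer to this data backup.
2) Destroy the content of the left-hand side.
3) Copy-construct the content of the right-hand side in the (now-empty) storage of the left-hand side.
4) In the event of failure, copy backup to the left-hand side storage.
5) In the event of success, deallocate the data pointed to by backup.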
With this technique: 1) only a single storage is used; 2) allocation is on the heap in the long-term only if the assignment fails; and 3) after any successful assignment, storage within the variant is guaranteed. For the purposes of the initial release of the library, these characteristics were deemed a satisfactory compromise solution.
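A sketch of the temporary heap backup technique, under the same caveats as any toy: `HeapBackupVariant` and `MaybeThrows` are hypothetical names, and the code only illustrates the order of operations (back up to the heap, destroy, construct in place, then either keep or free the backup). It is not the library's implementation.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical bounded type whose copy constructor throws on request.
struct MaybeThrows {
    int value;
    bool throw_on_copy;
    MaybeThrows(int v, bool t) : value(v), throw_on_copy(t) {}
    MaybeThrows(const MaybeThrows& o) : value(o.value), throw_on_copy(o.throw_on_copy) {
        if (o.throw_on_copy) throw std::runtime_error("copy failed");
    }
};

// Toy container of int or MaybeThrows. NOT boost::variant.
class HeapBackupVariant {
public:
    enum class Which { Int, Maybe };

    explicit HeapBackupVariant(int v) : which_(Which::Int), on_heap_(false) { new (&s_.i) int(v); }
    HeapBackupVariant(int v, bool t) : which_(Which::Maybe), on_heap_(false) {
        new (&s_.m) MaybeThrows(v, t);
    }
    HeapBackupVariant(const HeapBackupVariant&) = delete;
    ~HeapBackupVariant() { destroy(); }

    HeapBackupVariant& operator=(const HeapBackupVariant& rhs) {
        if (this == &rhs) return *this;
        // Back up the current content on the heap. May throw, but nothing has changed yet.
        void* backup = (which_ == Which::Int)
                           ? static_cast<void*>(new int(as_int()))
                           : static_cast<void*>(new MaybeThrows(as_maybe()));
        const Which old_which = which_;
        destroy();  // the in-variant storage is now free
        try {
            // Copy-construct the right-hand side content in place.
            if (rhs.which_ == Which::Int) new (&s_.i) int(rhs.as_int());
            else new (&s_.m) MaybeThrows(rhs.as_maybe());
        } catch (...) {
            // Failure: the heap backup becomes the content (pointer copy, cannot throw).
            s_.heap = backup;
            on_heap_ = true;
            which_ = old_which;
            throw;
        }
        // Success: content is stored in place again; release the backup.
        which_ = rhs.which_;
        on_heap_ = false;
        if (old_which == Which::Int) delete static_cast<int*>(backup);
        else delete static_cast<MaybeThrows*>(backup);
        return *this;
    }

    Which which() const { return which_; }
    bool on_heap() const { return on_heap_; }
    int value() const { return which_ == Which::Int ? as_int() : as_maybe().value; }

private:
    union Storage {
        int i;
        MaybeThrows m;
        void* heap;
        Storage() {}
        ~Storage() {}
    };

    const int& as_int() const { return on_heap_ ? *static_cast<const int*>(s_.heap) : s_.i; }
    const MaybeThrows& as_maybe() const {
        return on_heap_ ? *static_cast<const MaybeThrows*>(s_.heap) : s_.m;
    }

    void destroy() {
        if (on_heap_) {
            if (which_ == Which::Int) delete static_cast<int*>(s_.heap);
            else delete static_cast<MaybeThrows*>(s_.heap);
        } else if (which_ == Which::Maybe) {
            s_.m.~MaybeThrows();
        }
    }

    Storage s_;
    Which which_;
    bool on_heap_;
};

// Returns true when failure falls back to the heap copy and a later success is stored in place.
bool heap_backup_restores_on_failure() {
    HeapBackupVariant v1(42);
    HeapBackupVariant ok(7, false), bad(9, true);
    try {
        v1 = bad;
    } catch (const std::runtime_error&) {
        // swallowed so the resulting state can be inspected
    }
    assert(v1.on_heap() && v1.value() == 42);  // old value survives, now on the heap
    v1 = ok;
    return !v1.on_heap() && v1.which() == HeapBackupVariant::Which::Maybe && v1.value() == 7;
}
```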
There remain notable shortcomings, however. In particular, there may be some users for whom heap allocation must be avoided at all costs; for other users, any allocation may need to occur via a user-supplied allocator. These issues will be addressed in the future (see the section called “Future Direction: Policy-based Implementation”). For now, though, the library treats storage of its content as an implementation detail. Nonetheless, as described in the next section, there are certain things the user can do to ensure the greatest efficiency for variant instances (see the section called “Enabling Optimizations” for details).
As described in the section called “The Implementation Problem”, the central difficulty in implementing the never-empty guarantee is the possibility of failed copy-construction during variant assignment. Yet types with nothrow copy constructors clearly never face this possibility. Similarly, if one of the bounded types of the variant is nothrow default-constructible, then such a type could be used as a safe "fallback" type in the event of failed copy construction.
Accordingly, variant is designed to enable the following optimizations when the corresponding criteria on its bounded types are met:
For each bounded type T that is nothrow copy-constructible (as indicated by boost::has_nothrow_copy), the library guarantees variant will use only single storage and in-place construction for T.
If any bounded type is nothrow default-constructible (as indicated by boost::has_nothrow_constructor), the library guarantees variant will use only single storage and in-place construction for every bounded type in the variant. Note, however, that in the event of assignment failure, an unspecified nothrow default-constructible bounded type will be default-constructed in the left-hand side operand so as to preserve the never-empty guarantee.
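The two criteria can be checked at compile time. The sketch below uses the standard library traits std::is_nothrow_copy_constructible and std::is_nothrow_default_constructible as stand-ins for the Boost traits named above; that substitution is an assumption made so the example needs no Boost headers, and the helper names (`single_storage_for`, `any_nothrow_default`, `Blank`, `ThrowingCopy`) are hypothetical.

```cpp
#include <type_traits>

// Hypothetical stand-in for an empty "blank" alternative.
struct Blank {};

// Hypothetical type whose constructors are allowed to throw.
struct ThrowingCopy {
    ThrowingCopy() {}                     // user-provided, not noexcept
    ThrowingCopy(const ThrowingCopy&) {}  // user-provided, not noexcept
};

// First criterion, checked per type: is T nothrow copy-constructible?
template <typename T>
struct single_storage_for : std::is_nothrow_copy_constructible<T> {};

// Second criterion: is ANY bounded type nothrow default-constructible?
template <typename... Ts>
struct any_nothrow_default : std::false_type {};

template <typename T, typename... Ts>
struct any_nothrow_default<T, Ts...>
    : std::integral_constant<bool,
                             std::is_nothrow_default_constructible<T>::value ||
                                 any_nothrow_default<Ts...>::value> {};

static_assert(single_storage_for<int>::value, "copying an int cannot throw");
static_assert(!single_storage_for<ThrowingCopy>::value, "this copy may throw");
static_assert(any_nothrow_default<ThrowingCopy, Blank>::value,
              "Blank could serve as the fallback type");
static_assert(!any_nothrow_default<ThrowingCopy>::value, "no safe fallback type");
```

Under these assumptions, adding a cheap nothrow default-constructible alternative to the list of bounded types is what flips the second check to true.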
Implementation Note: So as to make the behavior of variant more predictable in the aftermath of an exception, the current implementation prefers to default-construct boost::blank if specified as a bounded type instead of other nothrow default-constructible bounded types. (If this is deemed to be a useful feature, it will become part of the specification for variant; otherwise, it may be obsoleted. Please provide feedback to the Boost mailing list.)
As the previous sections have demonstrated, much effort has been expended in an attempt to provide a balance between performance, data size, and heap usage. Further, significant optimizations may be enabled in variant on the basis of certain traits of its bounded types.
However, there will be some users for whom the chosen compromise is unsatisfactory (e.g., heap allocation must be avoided at all costs; if heap allocation is used, custom allocators must be used; etc.). For this reason, a future version of the library will support a policy-based implementation of variant. While this will not eliminate the problems described in the previous sections, it will allow the decisions regarding tradeoffs to be made by the user rather than the library designers.