...one of the most highly
regarded and expertly designed C++ library projects in the
world.
— Herb Sutter and Andrei
Alexandrescu, C++
Coding Standards
Behavior Change: proto::and_<>
In Boost 1.44, the behavior of proto::and_<>
as a transform changed. Previously, it only applied the transform associated
with the last grammar in the set. Now, it applies all the transforms but
only returns the result of the last. That makes it behave like C++'s comma
operator. For example, a grammar such as:
proto::and_< G0, G1, G2 >
when evaluated with an expression e
now behaves like this:
(G0()(e), G1()(e), G2()(e))
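The comma-operator analogy can be tried out without Proto. The following is only a sketch: G0, G1 and G2 here are hypothetical plain function objects standing in for the transforms of three grammars, and apply_like_and is a made-up helper, not part of Proto.

```cpp
#include <cassert>
#include <string>

namespace sketch
{
    // Hypothetical stand-ins for the transforms of three grammars. Each one
    // appends its name to a log so the evaluation order is visible.
    struct G0 { int operator()(std::string &log) const { log += "G0 "; return 0; } };
    struct G1 { int operator()(std::string &log) const { log += "G1 "; return 1; } };
    struct G2 { int operator()(std::string &log) const { log += "G2 "; return 2; } };

    // Mimics the Boost 1.44 behavior: apply every transform in order and
    // return only the result of the last one.
    inline int apply_like_and(std::string &log)
    {
        return (G0()(log), G1()(log), G2()(log));
    }

    inline void demo()
    {
        std::string log;
        int const result = apply_like_and(log);
        assert(result == 2);          // only the last result is returned
        assert(log == "G0 G1 G2 ");   // but every stand-in ran, left to right
    }
}
```

Before Boost 1.44, the equivalent of this sketch would have run only the last stand-in.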
Behavior Change: proto::as_expr() and proto::as_child()
The functions proto::as_expr()
and proto::as_child()
are used to guarantee that an object is a Proto expression by turning it
into one if it is not already, using an optionally specified domain. In previous
releases, when these functions were passed a Proto expression in a domain
different to the one specified, they would apply the specified domain's generator,
resulting in a twice-wrapped expression. This behavior was surprising to
some users.
The new behavior of these two functions is to always leave Proto expressions alone, regardless of the expressions' domains.
Behavior Change: proto::(pod_)generator<> and proto::basic_expr<>
Users familiar with Proto's extension mechanism have probably used either
proto::generator<>
or proto::pod_generator<>
with a wrapper template when defining their domain. In the past, Proto would
instantiate your wrapper template with instances of proto::expr<>
.
In Boost 1.44, Proto now instantiates your wrapper template with instances
of a new type: proto::basic_expr<>
.
For instance:
// An expression wrapper
template<class Expr>
struct my_expr_wrapper;

// A domain
struct my_domain
  : proto::domain< proto::generator< my_expr_wrapper > >
{};

template<class Expr>
struct my_expr_wrapper
  : proto::extends<Expr, my_expr_wrapper<Expr>, my_domain>
{
    // Before 1.44, Expr was an instance of proto::expr<>
    // In 1.44, Expr is an instance of proto::basic_expr<>
};
The motivation for this change was to improve compile times. proto::expr<>
is an expensive type to instantiate because it defines a host of member functions.
When defining your own expression wrapper, the instance of proto::expr<>
sits as a hidden data member in your wrapper, and the member functions of
proto::expr<> go unused. The cost of instantiating those member functions
is therefore wasted. In contrast, proto::basic_expr<>
is a very lightweight type with no member functions at all.
The vast majority of programs should recompile without any source changes.
However, if somewhere you are assuming that you will be given instances specifically
of proto::expr<>, your code will break.
New Feature: Sub-domains
In Boost 1.44, Proto introduces an important new feature called "sub-domains".
This gives you a way to specify that one domain is compatible with another
such that expressions in one domain can be freely mixed with expressions
in another. You can define one domain to be the sub-domain of another by
using the third template parameter of proto::domain<>.
For instance:
// Not shown: define some expression
// generators genA and genB

struct A
  : proto::domain< genA, proto::_ >
{};

// Define a domain B that is the sub-domain
// of domain A.
struct B
  : proto::domain< genB, proto::_, A >
{};
Expressions in domains A and B can have different
wrappers (hence, different interfaces), but they can be combined into larger
expressions. Without a sub-domain relationship, this would have been an error.
The domain of the resulting expression in this case would be A.
The complete description of sub-domains can be found in the reference sections
for proto::domain<>
and proto::deduce_domain
.
New Feature: Domain-specific as_expr() and as_child()
Proto has always allowed users to customize expressions post-hoc by specifying a Generator when defining their domain. But it has never allowed users to control how Proto assembles sub-expressions in the first place. As of Boost 1.44, users now have this power.
Users defining their own domain can now specify how proto::as_expr()
and proto::as_child()
work in their domain. They
can do this easily by defining nested class templates named as_expr
and/or as_child
within their domain class.
For example:
struct my_domain
  : proto::domain< my_generator >
{
    typedef proto::domain< my_generator > base_domain;

    // For my_domain, as_child does the same as
    // what as_expr does by default.
    template<class T>
    struct as_child
      : base_domain::as_expr<T>
    {};
};
In the above example, my_domain::as_child<>
simply defers to proto::domain::as_expr<>
. This has the nice effect of causing
all terminals to be captured by value instead of by reference, and to likewise
store child expressions by value. The result is that expressions in my_domain
are safe to store in auto
variables because they will not have
dangling references to intermediate temporary expressions. (Naturally, it
also means that expression construction has extra runtime overhead of copying
that the compiler may or may not be able to optimize away.)
In Boost 1.43, the recommended usage of proto::extends<>
changed slightly. The new usage looks like this:
// my_expr is an expression extension of the Expr parameter
template<typename Expr>
struct my_expr
  : proto::extends<Expr, my_expr<Expr>, my_domain>
{
    my_expr(Expr const &expr = Expr())
      : proto::extends<Expr, my_expr, my_domain>(expr)
    {}

    // NEW: use the following macro to bring
    // proto::extends::operator= into scope.
    BOOST_PROTO_EXTENDS_USING_ASSIGN(my_expr)
};
The new thing is the use of the BOOST_PROTO_EXTENDS_USING_ASSIGN()
macro. To allow assignment operators to build expression trees, proto::extends<>
overloads the assignment operator. However, for the my_expr
template, the compiler generates a default copy assignment operator that
hides the ones in proto::extends<>. This is often not desired
(although it depends on the syntax you want to allow).
Previously, the recommended usage was to do this:
// my_expr is an expression extension of the Expr parameter
template<typename Expr>
struct my_expr
  : proto::extends<Expr, my_expr<Expr>, my_domain>
{
    my_expr(Expr const &expr = Expr())
      : proto::extends<Expr, my_expr, my_domain>(expr)
    {}

    // OLD: don't do it like this anymore.
    using proto::extends<Expr, my_expr, my_domain>::operator=;
};
While this works in the majority of cases, it still doesn't suppress the
implicit generation of the default assignment operator. As a result, expressions
of the form a = b could either build an expression
template or do a copy assignment depending on whether the types of a and b
happen to be the same. That can lead to subtle bugs, so the behavior was
changed.
The BOOST_PROTO_EXTENDS_USING_ASSIGN() macro brings into scope the assignment
operators defined in proto::extends<> and also suppresses the generation of
the copy assignment operator.
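The interaction between a using-declaration and the implicitly generated copy assignment operator can be reproduced in plain C++, without Proto. The sketch below makes several assumptions: tree, extends_like, old_style and new_style are hypothetical names, extends_like only imitates the fact that assignment builds a tree instead of copying, and the hand-written copy assignment in new_style is just one way to get the effect described above, not necessarily what the macro expands to.

```cpp
#include <cassert>

namespace sketch
{
    struct tree {};  // stand-in for the expression tree an assignment builds

    // Hypothetical stand-in for proto::extends<>: assignment builds a tree
    // and leaves the left-hand side untouched.
    struct extends_like
    {
        template<typename A> tree operator=(A &)       { return tree(); }
        template<typename A> tree operator=(A const &) { return tree(); }
    };

    // OLD style: the using-declaration exposes the base overloads, but the
    // compiler still generates old_style::operator=(old_style const &).
    struct old_style : extends_like
    {
        explicit old_style(int v) : value(v) {}
        using extends_like::operator=;
        int value;
    };

    // Declaring the copy assignment operator ourselves and forwarding to
    // the base means no implicit copy assignment is generated.
    struct new_style : extends_like
    {
        explicit new_style(int v) : value(v) {}
        using extends_like::operator=;
        tree operator=(new_style const &that)
        {
            return extends_like::operator=(that);
        }
        int value;
    };

    inline void demo()
    {
        old_style a(1), b(2);
        old_style const c(3);
        a = b;                  // template overload wins: builds a tree
        assert(a.value == 1);
        a = c;                  // implicit copy assignment wins: a is overwritten
        assert(a.value == 3);

        new_style x(1), y(2);
        new_style const z(3);
        x = y;                  // builds a tree
        x = z;                  // also builds a tree; no copy happens
        assert(x.value == 1);
    }
}
```

In old_style, whether a copy happens depends only on the constness of the right-hand side, which is the kind of subtle difference the macro is meant to remove.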
Also note that the proto::literal<> class template, which
uses proto::extends<>, has been changed to use BOOST_PROTO_EXTENDS_USING_ASSIGN().
The implications are highlighted in the sample code below:
proto::literal<int> a(1), b(2); // two non-const proto literals
proto::literal<int> const c(3); // a const proto literal

a = b; // No-op. Builds an expression tree and discards it.
       // Same behavior in 1.42 and 1.43.

a = c; // CHANGE! In 1.42, this performed copy assignment, causing
       // a's value to change to 3. In 1.43, the behavior is now
       // the same as above: build and discard an expression tree.
Boost 1.44: Proto gets sub-domains and per-domain control of proto::as_expr()
and proto::as_child() to meet the needs of Phoenix3.
Proto v4 is merged to Boost trunk with more powerful transform protocol.
Proto is accepted into Boost.
Proto's Boost review begins.
Boost.Proto v3 brings separation of grammars and transforms and a "round" lambda syntax for defining transforms in-place.
Boost.Xpressive is ported from Proto compilers to Proto transforms. Support for old Proto compilers is dropped.
Preliminary submission of Proto to Boost.
The idea for transforms that decorate grammar rules is born in a private email discussion with Joel de Guzman and Hartmut Kaiser. The first transforms are committed to CVS 5 days later on December 16.
The idea for proto::matches<>
and the whole grammar facility
is hatched during a discussion with Hartmut Kaiser on the spirit-devel
list. The first version of proto::matches<>
is checked into CVS 3 days
later. Message is here.
Proto is reborn, this time with uniform expression types that are POD. Announcement is here.
Proto is born as a major refactorization of Boost.Xpressive's meta-programming. Proto offers expression types, operator overloads and "compilers", an early formulation of what later became transforms. Announcement is here.
Proto expression types are PODs (Plain Old Data), and do not have constructors. They are brace-initialized, as follows:
terminal<int>::type const _i = {1};
The reason is so that expression objects like _i
above can be statically initialized. Why is static
initialization important? The terminals of many domain-specific embedded
languages are likely to be global const objects, like _1 and _2
from the Boost Lambda Library. Were these objects to require run-time
initialization, it might be possible to use them before they are initialized.
That would be bad. Statically initialized objects cannot be misused that way.
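The static-initialization property can be checked with a small stand-in type. This is only a sketch under stated assumptions: int_terminal is a hypothetical aggregate, not the real terminal<int>::type, and the code uses C++11 features (static_assert, constexpr, <type_traits>) to make the guarantee visible to the compiler.

```cpp
#include <type_traits>

namespace sketch
{
    // Hypothetical stand-in for terminal<int>::type: an aggregate with no
    // constructors, so it can be brace-initialized.
    struct int_terminal
    {
        int value;
    };

    static_assert(std::is_trivial<int_terminal>::value &&
                  std::is_standard_layout<int_terminal>::value,
                  "int_terminal is a POD type");

    // Initialized from a constant expression, so the object is set up
    // before any dynamic initialization in the program runs.
    int_terminal const _i = {1};

    // constexpr makes the compiler reject the declaration unless the
    // initialization really is static.
    constexpr int_terminal _j = {2};
    static_assert(_j.value == 2, "usable at compile time");
}
```

Had int_terminal needed a non-constexpr constructor, the constexpr declaration would fail to compile, and the const object would be initialized dynamically, at a point whose order relative to other translation units is unspecified.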
Anyone who has peeked at Proto's source code has probably wondered, "Why all the dirty preprocessor gunk? Couldn't this all have been implemented cleanly on top of libraries like MPL and Fusion?" The answer is that Proto could have been implemented this way, and in fact was at one point. The problem is that template metaprogramming (TMP) makes for longer compile times. As a foundation upon which other TMP-heavy libraries will be built, Proto itself should be as lightweight as possible. That is achieved by preferring preprocessor metaprogramming to template metaprogramming. Expanding a macro is far more efficient than instantiating a template. In some cases, the "clean" version takes 10x longer to compile than the "dirty" version.
The "clean and slow" version of Proto can still be found at http://svn.boost.org/svn/boost/branches/proto/v3. Anyone who is interested can download it and verify that it is, in fact, unusably slow to compile. Note that this branch's development was abandoned, and it does not conform exactly with Proto's current interface.
Much has already been written about dispatching on type traits using SFINAE
(Substitution Failure Is Not An Error) techniques in C++. There is a Boost
library, Boost.Enable_if, to make the technique idiomatic. Proto dispatches
on type traits extensively, but it doesn't use enable_if<>
very often. Rather, it dispatches
based on the presence or absence of nested types, often typedefs for void.
Consider the implementation of is_expr<>
. It could have been written as
something like this:
template<typename T>
struct is_expr
  : is_base_and_derived<proto::some_expr_base, T>
{};
Rather, it is implemented as this:
template<typename T, typename Void = void>
struct is_expr
  : mpl::false_
{};

template<typename T>
struct is_expr<T, typename T::proto_is_expr_>
  : mpl::true_
{};
This relies on the fact that the specialization will be preferred if T
has a nested proto_is_expr_ that is a typedef for void.
All Proto expression types have such a nested typedef.
Why does Proto do it this way? After running extensive
benchmarks while trying to improve compile times, I have found that this
approach compiles faster. It requires exactly one template instantiation.
The other approach requires at least two: is_expr<>
and is_base_and_derived<>, plus whatever templates is_base_and_derived<>
may instantiate.
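The nested-typedef dispatch can be exercised on its own. The sketch below assumes C++11 and substitutes std::true_type and std::false_type for the MPL equivalents; my_expression, not_an_expression and wrong_typedef are hypothetical test types.

```cpp
#include <type_traits>

namespace sketch
{
    // Primary template: used when the specialization below does not match.
    template<typename T, typename Void = void>
    struct is_expr : std::false_type {};

    // Matches only when T::proto_is_expr_ exists and names void, because
    // the second template argument defaults to void.
    template<typename T>
    struct is_expr<T, typename T::proto_is_expr_> : std::true_type {};

    struct my_expression     { typedef void proto_is_expr_; };
    struct not_an_expression {};
    struct wrong_typedef     { typedef int proto_is_expr_; };

    static_assert(is_expr<my_expression>::value, "nested void typedef found");
    static_assert(!is_expr<not_an_expression>::value, "no nested typedef");
    static_assert(!is_expr<int>::value, "non-class types are rejected quietly");
    static_assert(!is_expr<wrong_typedef>::value, "typedef must name void");
}
```

The wrong_typedef case shows why the typedef must be for void: the specialization becomes is_expr<T, int>, which is never what is_expr<T> asks for.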
In several places, Proto needs to know whether or not a function object
Fun
can be called with
certain parameters and take a fallback action if not. This happens in
proto::callable_context<>
and in the proto::call<>
transform. How does
Proto know? It involves some tricky metaprogramming. Here's how.
Another way of framing the question is by trying to implement the following
can_be_called<>
Boolean metafunction, which checks to see if a function object Fun
can be called with parameters of
type A
and B
:
template<typename Fun, typename A, typename B> struct can_be_called;
First, we define the following dont_care
struct, which has an implicit conversion from anything. And not just any
implicit conversion; it has an ellipsis conversion, which is the worst possible
conversion for the purposes of overload resolution:
struct dont_care { dont_care(...); };
We also need some private type known only to us with an overloaded comma operator (!), and some functions that detect the presence of this type and return types with different sizes, as follows:
struct private_type
{
    private_type const &operator,(int) const;
};

typedef char yes_type;      // sizeof(yes_type) == 1
typedef char (&no_type)[2]; // sizeof(no_type) == 2

template<typename T>
no_type is_private_type(T const &);

yes_type is_private_type(private_type const &);
Next, we implement a binary function object wrapper with a very strange conversion operator, whose meaning will become clear later.
template<typename Fun>
struct funwrap2 : Fun
{
    funwrap2();
    typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
    operator pointer_to_function() const;
};
With all of these bits and pieces, we can implement can_be_called<>
as follows:
template<typename Fun, typename A, typename B>
struct can_be_called
{
    static funwrap2<Fun> &fun;
    static A &a;
    static B &b;

    static bool const value = (
        sizeof(no_type) == sizeof(is_private_type( (fun(a,b), 0) ))
    );

    typedef mpl::bool_<value> type;
};
The idea is to make it so that fun(a,b)
will
always compile by adding our own binary function overload, but doing it
in such a way that we can detect whether our overload was selected or not.
And we rig it so that our overload is selected if there is really no better
option. What follows is a description of how can_be_called<>
works.
We wrap Fun
in a type that
has an implicit conversion to a pointer to a binary function. An object
fun
of class type can be
invoked as fun(a, b)
if it has such a conversion operator,
but since it involves a user-defined conversion operator, it is less preferred
than an overloaded operator()
, which requires no such conversion.
The function pointer can accept any two arguments by virtue of the dont_care
type. The conversion sequence
for each argument is guaranteed to be the worst possible conversion sequence:
an implicit conversion through an ellipsis, and a user-defined conversion
to dont_care
. In total,
it means that funwrap2<Fun>()(a, b)
will always compile, but it will select our overload only if there really
is no better option.
If there is a better option, for example if Fun
has an overloaded function call operator such as void operator()(A a, B b),
then fun(a, b) will resolve to that one instead. The
question now is how to detect which function got picked by overload resolution.
Notice how fun(a, b) appears in can_be_called<>: (fun(a, b), 0).
Why do we use the comma operator there? The reason is that we are using
this expression as the argument to a function. If the return type of fun(a, b)
is void, it cannot legally be used as an argument to a function. The comma operator
sidesteps the issue.
This should also make plain the purpose of the overloaded comma operator
in private_type. The return type of the pointer to function is private_type.
If overload resolution selects our overload, then the type of (fun(a, b), 0)
is private_type. Otherwise, it is int. That fact is used
to dispatch to either overload of is_private_type(), which encodes its answer
in the size of its return type.
That's how it works with binary functions. Now repeat the above process for functions up to some predefined function arity, and you're done.
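Putting the pieces together gives a complete binary version that can be compiled and tested in isolation. This is a sketch assembled from the fragments above, with a few assumptions: it drops the mpl::bool_ typedef to avoid a Boost dependency, it uses C++11 static_assert for the checks, and adder, returns_void, unary_only and tag are hypothetical test types.

```cpp
namespace sketch
{
    struct dont_care { dont_care(...); };

    struct private_type
    {
        private_type const &operator,(int) const;
    };

    typedef char yes_type;      // sizeof(yes_type) == 1
    typedef char (&no_type)[2]; // sizeof(no_type) == 2

    template<typename T>
    no_type is_private_type(T const &);
    yes_type is_private_type(private_type const &);

    // Adds a fallback "overload" to Fun via a conversion to function pointer.
    template<typename Fun>
    struct funwrap2 : Fun
    {
        funwrap2();
        typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
        operator pointer_to_function() const;
    };

    template<typename Fun, typename A, typename B>
    struct can_be_called
    {
        // Declared only; they are used in unevaluated sizeof operands.
        static funwrap2<Fun> &fun;
        static A &a;
        static B &b;

        static bool const value =
            sizeof(no_type) == sizeof(is_private_type((fun(a, b), 0)));
    };

    // Hypothetical function objects used to exercise the metafunction.
    struct adder        { int operator()(int, int) const; };
    struct returns_void { void operator()(int, int); };
    struct unary_only   { int operator()(int) const; };
    struct tag {};

    static_assert(can_be_called<adder, int, int>::value, "callable");
    static_assert(can_be_called<returns_void, int, int>::value, "void is fine");
    static_assert(!can_be_called<unary_only, int, int>::value, "wrong arity");
    static_assert(!can_be_called<adder, tag, tag>::value, "wrong argument types");
}
```

The returns_void case is the one that needs the comma operator: without the trailing ", 0" the void result could not be passed to is_private_type().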
I'd like to thank Joel de Guzman and Hartmut Kaiser for being willing to take a chance on using Proto for their work on Spirit-2 and Karma when Proto was little more than a vision. Their requirements and feedback have been indispensable.
Thanks to Daniel James for providing a patch to remove the dependence on deprecated configuration macros for C++0x features.
Thanks to Dave Abrahams for an especially detailed review, and for making a VM with msvc-7.1 available so I could track down portability issues on that compiler.
Many thanks to Daniel Wallin who first implemented the code used to find the common domain among a set, accounting for super- and sub-domains. Thanks also to Jeremiah Willcock, John Bytheway and Krishna Achuthan who offered alternate solutions to this tricky programming problem.
Thanks also to the developers of PETE. I found many good ideas there.