The status of reflection in C++

When the C++ committee met in Jacksonville two months ago, something big happened: the reflection study group, SG7, decided what the basic “language” of reflected C++ should look like. What does that mean? Why should you care? Let me, the co-author of the only “blessed proposal”, explain:

Almost everyone agrees that C++ needs a facility to query C++ code itself: types, functions, data members etc. Most also agree that this should be a compile-time facility, at least as a start. But what should it look like?

Several proposals were on the table over the few years that SG7 has existed; in Jacksonville those were N4428, P0194 and P0255. Here are the main distinguishing features, and SG7's recommendation:

How to get reflection data

Two major paths to query an entity (a base-level construct) were proposed: operators or templates. Templates must obey the one-definition rule (ODR): any repeated use must yield exactly the same result as the previous "invocations". They therefore cannot test for "progress" within a translation unit: do we have a definition? Do we have a definition now? And now? For template-based reflection, the answer must always be the same.

But even more importantly, C++ only allows certain kinds of entities to be passed as template arguments. Namespaces, for instance, are not among them. There must also be no visible difference between passing a typedef and passing its underlying type as a template argument. Template-based reflection thus either cannot reflect namespaces or typedefs, or requires language changes just for the sake of reflection.
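The following minimal sketch illustrates both limitations with today's language. The names reflect_via_template, rank_type and geometry are hypothetical and only serve the illustration; they are not part of any proposal.

```cpp
#include <cstddef>
#include <type_traits>

using rank_type = std::size_t;  // an alias, as a developer might write it

// Hypothetical template-based "reflection" entry point (illustration only).
template <typename T>
struct reflect_via_template {};

// The alias and its underlying type select the very same specialization,
// so nothing inside the template can tell which spelling the caller used.
static_assert(std::is_same<reflect_via_template<rank_type>,
                           reflect_via_template<std::size_t>>::value,
              "alias and underlying type are indistinguishable");

namespace geometry {}  // hypothetical namespace

// The next line does not compile if uncommented: a namespace is not a
// valid template argument.
// reflect_via_template<geometry> no_such_thing;
```

An operator does not go through template argument rules, which is why it can accept a namespace or keep an alias apart from its underlying type.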

Operators, on the other hand, are a natural way to extend the language. They do not suffer from any such limitation. Additionally, they signal clearly that the code is reflected, making code review simpler.

Traits versus aggregates

How should reflection data be served? Some proposals were based on structure-like entities. Code could use members on them to drill into the reflection data.

This means that the compiler needs to generate these types for each access. Because the objects could be passed around, they would need to have associated storage, at least at compile time.

The alternative is an extension of the traits system. Here, the compiler needs to generate only the data that is actually queried. It was also deemed simpler to extend, once reflection needs to support a more complete feature set or to cover new language features.
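To make the contrast concrete, here is a rough sketch that uses only existing type traits. The names type_record, describe, size_trait and point are made up for this illustration and do not come from any of the proposals.

```cpp
#include <cstddef>
#include <type_traits>

// Aggregate style (hypothetical): one object that carries every property,
// so all of them are computed as soon as the object is created.
struct type_record {
  std::size_t size;
  bool is_class;
  bool is_enum;
};

template <typename T>
constexpr type_record describe() {
  return type_record{sizeof(T), std::is_class<T>::value, std::is_enum<T>::value};
}

// Trait style (hypothetical): one template per property. A specialization
// is only instantiated when code actually names its value.
template <typename T>
struct size_trait {
  static constexpr std::size_t value = sizeof(T);
};

struct point { int x; int y; };

constexpr type_record point_record = describe<point>();  // all fields filled in
static_assert(point_record.is_class, "point is a class type");
static_assert(size_trait<point>::value == sizeof(point), "only the size was queried");
```

Adding a new property to the trait style means adding one more template; the aggregate style would have to grow the record type itself.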

Traits on meta or traits on code?

These traits can be applied on the C++ code itself, as done for the regular C++ type traits, possibly with filters to specify query details. Or, and that is the main distinguishing feature of P0194, an operator can "lift" you onto the meta-level, and reflection traits operate only on that meta level.

P0194

Meta-objects are of a meta-type that describes the available interfaces (meta-functions). All of that can be mapped into regular C++ these days, with some definition of “these days”: meta-objects are types; they are unnamed and cannot be constructed; they are generated by the reflection operator, for instance reflexpr(std::string). Meta-functions are templates that "take" a meta-object and "return" a constexpr value or a different meta-object, for instance get_scope. And the big step for the Jacksonville revision P0194R0 of the proposal happened for the meta-types: they are now mapped to C++ concepts! That is obvious, natural and makes the proposal even simpler and even more beautiful.
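Since reflexpr is not available in standard compilers, the shape of this design can only be imitated. The sketch below uses hand-written stand-ins for what the compiler would generate; every name prefixed with mock_ is an assumption made for this illustration and is not the interface of P0194.

```cpp
#include <type_traits>

// Hand-written stand-ins for the unnamed types that a reflection operator
// would generate for the global scope, for std and for std::string.
struct mock_meta_global {
  mock_meta_global() = delete;  // meta-objects cannot be constructed
};
struct mock_meta_std {
  mock_meta_std() = delete;
  using scope = mock_meta_global;
  static constexpr const char* name() { return "std"; }
};
struct mock_meta_string {
  mock_meta_string() = delete;
  using scope = mock_meta_std;
  static constexpr const char* name() { return "string"; }
};

// A meta-function that "returns" another meta-object (a type).
template <typename M>
struct mock_get_scope {
  using type = typename M::scope;
};
template <typename M>
using mock_get_scope_t = typename mock_get_scope<M>::type;

// A meta-function that "returns" a constexpr value.
template <typename M>
struct mock_get_name {
  static constexpr const char* value = M::name();
};

// Walking from std::string up to the global scope, entirely in types.
static_assert(std::is_same<mock_get_scope_t<mock_meta_string>,
                           mock_meta_std>::value,
              "the scope of string is std");
static_assert(std::is_same<mock_get_scope_t<mock_get_scope_t<mock_meta_string>>,
                           mock_meta_global>::value,
              "the scope of std is the global scope");
```

No object is ever created here: all queries are answered by template instantiation at compile time.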

Reflection-types described by concepts

You can query, for instance, the type property of a meta-object using get_type. But not all meta-objects have a type; it would not make sense to call that on the meta-object of a namespace. The meta-object (remember, a type) must be of a certain kind: it must implement the requirements of the meta::Typed concept. The type returned by reflexpr(std) does not satisfy these requirements. Easy. For each meta-type (concept) there exists a test whether a meta-object (all of which satisfy the meta::Object concept, by definition) is of that meta-type, i.e. satisfies the concept. For instance, get_type is only valid on those meta-objects for which has_typed_v<meta::Object> is true.
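The idea of "only some meta-objects are Typed" can be imitated without compiler support for concepts, using the classic detection idiom. This is a sketch under that substitution: all demo_ names are hypothetical, and with real concepts the check would be expressed as a constraint instead of a trait.

```cpp
#include <type_traits>

// Hand-written stand-ins for meta-objects (illustration only).
struct demo_meta_member {       // plays the meta-object of an int data member
  using reflected_type = int;
};
struct demo_meta_namespace {};  // plays the meta-object of a namespace: no type

// Helper that maps any list of valid types to void.
template <typename...>
struct demo_void {
  using type = void;
};

// Emulates "M satisfies Typed": true only if M exposes a reflected type.
template <typename M, typename = void>
struct demo_is_typed : std::false_type {};
template <typename M>
struct demo_is_typed<M, typename demo_void<typename M::reflected_type>::type>
    : std::true_type {};

// Emulates get_type: rejected with a readable message on untyped meta-objects.
template <typename M>
struct demo_get_type {
  static_assert(demo_is_typed<M>::value, "get_type needs a Typed meta-object");
  using type = typename M::reflected_type;
};

static_assert(demo_is_typed<demo_meta_member>::value, "a data member has a type");
static_assert(!demo_is_typed<demo_meta_namespace>::value, "a namespace has none");
static_assert(std::is_same<demo_get_type<demo_meta_member>::type, int>::value,
              "get_type yields the reflected type");
```

Instantiating demo_get_type<demo_meta_namespace> would fail to compile, which mirrors calling get_type on a meta-object that does not satisfy the concept.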

Reflection language versus Reflection library

P0194 proposes the basic ingredients to query reflection in C++. You might find it too basic or too complex. We use it to lay the first few miles of the train track, to agree on the design and specify the “language” used. Once we have that, extending it to become a full C++ reflection library is much simpler than providing a complete feature set and defending the design against ten other proposals in parallel. Matus, the original author, has already shown that P0194 is extensible. Like mad.

And now?

Jacksonville was a big step: SG7 agrees on the recommended design. Now we need to agree on the content. For instance, should reflection distinguish between typedefs and their underlying types? Take

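#include <cstddef>

struct ArrayRef {
  using index_type = std::size_t;  // alias used for element indices
  using rank_type = std::size_t;   // different name, same underlying type
  rank_type rank_;                 // declared through the alias
};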

Should reflection report the type of rank_ as unsigned long (what size_t resolves to on many 64-bit platforms) or as rank_type? The former is how the compiler understands the code (“semantic” reflection), the latter is what the developer wrote (“syntactic” reflection). We are collecting arguments; I know of lots of smart people with convincing arguments for each of these options.
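For comparison, this is what the language can observe today without reflection; it corresponds to the semantic view. The struct is repeated so that the snippet stands alone.

```cpp
#include <cstddef>
#include <type_traits>

struct ArrayRef {
  using index_type = std::size_t;
  using rank_type = std::size_t;
  rank_type rank_;
};

// decltype reports the underlying type; the spelling rank_type is gone.
static_assert(std::is_same<decltype(ArrayRef::rank_), std::size_t>::value,
              "rank_ is seen as size_t");

// For the same reason the two aliases cannot be told apart.
static_assert(std::is_same<ArrayRef::rank_type, ArrayRef::index_type>::value,
              "rank_type and index_type are the same type");
```

Syntactic reflection would be the first facility able to answer "this member was declared as rank_type, not index_type".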

Matus is currently writing the next revision. He will split the paper: a short one with the wording, and a discussion paper that explains the design decisions of SG7 - a sort of log, collecting the arguments for those who want to know why C++ reflection ends up the way P0194 proposes. The design paper will also contain examples of use cases, for instance a JSON serializer and likely a hash generator. Can you implement your favorite reflection use case with P0194’s interfaces?

Cheers, Axel.

Comments

Mikhail,

short answer; this version very probably won't. It was slapped together very quickly and it has several shortcomings. If nobody else picks this up, the plan is that I'll probably start writing a new implementation from scratch during the summer.

Hello Matus and Axel!

Do you guys have any news to share about reflection? How is the new implementation going? Has it been started?

Thanks!

Hi Mikhail,

That's here

I don't think he expects this to be merged. It was meant to serve as a demonstration that the proposal is feasible implementation-wise. A reality-check. I remember Matus saying that clang should be able to do a much better (i.e. efficient) job.

Cheers, Axel.

Think of typedefs seen vs. not seen as a parameter/option. Think of it as the most obvious example of "lowering".

Reflection is a much needed functionality in C++ on my personal wishlist. Other than that I can name only default operator== (already proposed AFAIK) and enforced "override" keyword (already implemented as a warning in clang).

We use C++ reflection for data persistence. We currently use circa 2003 solution based on Microsoft SBR format and SBR SDK. Needless to say, it does not work in XCode or Qt/NDK. So now there is hope we can get a standard way to reflect on struct data member names/types and list struct base classes..

In the ArrayRef example, I think the type of rank_ should be returned by two functions: one, say get_type(), which returns the type as written by the developer, i.e. rank_type; and a second, say get_underlying_type(), which returns unsigned long as understood by the compiler.

Hi Muhammad,

I think that's fairly close to what Bjarne suggests. The main point here is that both of you believe that it should be possible to identify rank_type whereas others (in the committee) do not want reflection to be able to see a typedef. Intentionally. The argument I heard most often is that detecting a typedef will turn into distinct entities what C++ treats as identical. (My counter-argument so far is "yes, and?")

Cheers, Axel.

Yes and I'd like to do that for the same reason that I want to be able to distinguish between different types of Enums. The alternative in some cases is probably to start using Hungarian Notation and prefixing variables with the type again, please don't make me do that. :(
If I'm accessing an external system which has a money type but I want to simply access and display that data, being able to distinguish between the money type and a plain double would save needing some other way to identify the user type. That's a trivial case, but I'm sure there are others. I haven't used reflection in C++ since MFC, although it's heavily used in other languages.

Ralph

If that's true, the committee seems to be making assumptions about how people would use reflection (a common issue w/ design-by-committee).

Reflection has uses beyond semantic analysis (which would be the absolute minimum one would expect in a reflection API, but most certainly not the peak).

It's just as likely people will NEED syntactic analysis. One very basic use case that comes to mind (assuming compile time reflection is constexpr - which it needs to be) would be implementing custom compile time errors (linting) of domain specific rules w/ reflection and static assertions at the syntactic level that INDEED treat some typedefs as distinct entities.

Example: In some code bases typedefs are absolutely intended to be used as distinct entities (not just an alias) and will break if that typedef changes in another configuration (as is often the intention, else why typedef?), hence static assertion on the syntax is just as important (if not more so, due to the domain specific knowledge often encoded in syntax).

One could argue in such cases that typedef should be an actual type, however it's very common in C++ for people to typedef primitives (int/float/etc) and use them as if they're a distinct entity (writing code in ways that would break if the underlying type that entity ever changed, potentially without compilation failure (due to implicit casting - hence the need for linting))

At the moment it's `reflexpr(rank_type)` = Meta-Typedef vs. `get_aliased_t<reflexpr(rank_type)>` = Meta-Type. To me, adding a separate operator for the second case looks like overkill.

When it comes to what's returned as the result for typedefs, and classes, I would imagine the same functionality being used on class deduction so inheritance should be a major player. If you have an object that would satisfy the usual diamond problem examples, some sort of structured result would have to be returned.

something as simple as
struct typeinfo { std::type_index id; /* other properties... */ std::vector<typeinfo> nodes; };  // needs <typeindex> and <vector>

If just returning a single type, it seems you'd have to always lean toward the front-most class unless compiler's context is clearly referencing a base/baser class/type. Otherwise you may have ambiguous types at the same level.

Is there any particular reason one shouldn't be able to discover the typedef AND the underlying format, perhaps by a second query on the definition of the typedef? The distinction may make little difference now, but unless it never will, offering the option (along with future portability advice) should cover all concerns, unless the cost is inordinate. For example, which would best support a really universal yet lightweight serialization library?

Hi,

I personally agree. But playing the devil's advocate, "because we can" is not a good reason to offer a feature to the world. So what we really need are good use cases that motivate the need. That's what I was fishing for :-)

Cheers, Axel.

i like the idea of re-querying for underlying type. you could recursively get down to the primary types and it would be simple to handle multiple inheritance with 1D array return.

I think it is unquestionable that getting the backing type of a typedef is useful and probably what you'd want to see a significant amount of time when reflecting. I'd rather avoid libraries of meta functions that bake things down into whether or not a specific type resolves to an int or not, as an example.

The other case (i.e. getting the forward type of the typedef), I believe is also useful, and something I wish that templates could do as well. I was recently working on a system to gather up the fields of various data structures and present corresponding UI to the user so they could edit the fields of those structures easily. One of the things I would have liked to do was give the data structure designers ways to annotate the fields to make things easier to edit on the user side. For example, say I have an int and I want to be able to annotate a lower and upper bound so that the corresponding UI is a slider instead of a text input. I want to do something like this

template <int lower, int upper>
using r_bounded_int = int;

r_bounded_int<0, 50> m_value_that_goes_from_0_to_50;

It would be nice to be able to glean this information from the type instead of doing silly things like wrapping primitive types in classes or side loading the annotation through some other variadic template mechanism.

I looked through Matus's evolving proposals, and it's definitely moving in the right direction. The most recent is simple and powerful.

My only concern is that it's now linked with concepts.

You should definitely look at:

Static reflection Rationale, design and evolution

www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0385r0.pdf

The proposal for runtime access to reflection information looks promising... but what I really want is the ability to have the compiler externalize (export) the reflection database so that external tools can easily consume it (from a standardized format). All sorts of code generators could benefit from this information greatly, and doing it externally allows for using superior tools than trying to build impossible-to-understand template/etc based C++ machinery to do it. This can be used to generate language bindings, serdes for data structures, etc.
