Clarifying the value categories of expressions

Question

In 2010, Bjarne Stroustrup, the creator of C++, wrote the paper “New” Value Terminology in which he explains the value categories of expressions introduced in the C++11 standard* (lvalue, xvalue, and prvalue, and their generalizations glvalue and rvalue):

There were only two independent properties:

“has identity” – i.e. and address, a pointer, the user can determine whether two copies are identical, etc.

“can be moved from” – i.e. we are allowed to leave to source of a “copy” in some indeterminate, but valid state

This led me to the conclusion that there are exactly three kinds of values (using the regex notational trick of using a capital letter to indicate a negative – I was in a hurry):

iM: has identity and cannot be moved from

im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)

Im: does not have identity and can be moved from

The fourth possibility (“IM”: doesn’t have identity and cannot be moved) is not useful in C++ (or, I think) in any other language. In addition to these three fundamental classifications of values, we have two obvious generalizations that correspond to the two independent properties:

i: has identity

m: can be moved from
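
The three combinations and the two generalizations can be observed at compile time. The following is a minimal sketch, not taken from the paper: it relies on the language rule that decltype((e)) yields T& for an lvalue, T&& for an xvalue, and T for a prvalue. The macro names and the variable v are illustrative assumptions.

```cpp
#include <type_traits>
#include <utility>

// decltype((e)) is T& for an lvalue, T&& for an xvalue, and T for a prvalue.
#define IS_LVALUE(e)  std::is_lvalue_reference<decltype((e))>::value
#define IS_XVALUE(e)  std::is_rvalue_reference<decltype((e))>::value
#define IS_PRVALUE(e) (!std::is_reference<decltype((e))>::value)

// The two generalizations: "i" (glvalue) and "m" (rvalue).
#define IS_GLVALUE(e) (IS_LVALUE(e) || IS_XVALUE(e))
#define IS_RVALUE(e)  (IS_XVALUE(e) || IS_PRVALUE(e))

int v = 0;  // hypothetical variable used only for illustration

static_assert(IS_LVALUE(v), "iM: identity, not movable");
static_assert(IS_XVALUE(std::move(v)), "im: identity, movable");
static_assert(IS_PRVALUE(v + 1), "Im: no identity, movable");
static_assert(IS_GLVALUE(v) && IS_GLVALUE(std::move(v)), "i: has identity");
static_assert(IS_RVALUE(std::move(v)) && IS_RVALUE(v + 1), "m: can be moved from");
```

All checks are static_assert declarations, so the file compiling is the whole test; nothing runs at run time.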

In 2015, Richard Smith, then the C++ standard editor, wrote the paper Guaranteed copy elision through simplified value categories in which he explains the rewording of the value categories of expressions introduced in the C++17 standard**:

However, these rules are hard to internalize and confusing -- for instance, an expression that creates a temporary object designates an object, so why is it not an lvalue? Why is NonMoveable().arr an xvalue rather than a prvalue? This paper suggests a rewording of these rules to clarify their intent. In particular, we suggest the following definitions for glvalue and prvalue:

A glvalue is an expression whose evaluation computes the location of an object, bit-field, or function.

A prvalue is an expression whose evaluation initializes an object, bit-field, or operand of an operator, as specified by the context in which it appears.

That is: prvalues perform initialization, glvalues produce locations.

Denotationally, we have:

glvalue :: Environment -> (Environment, Location)

prvalue :: (Environment, Location) -> Environment

So far, this is not a functional change to C++; it does not change the classification of any existing expression. However, it makes it simpler to reason about why expressions are classified as they are:

    struct X { int n; };
    extern X x;
    X{4};   // prvalue: represents initialization of an X object
    x.n;    // glvalue: represents the location of x's member n
    X{4}.n; // glvalue: represents the location of X{4}'s member n;
            // in particular, xvalue, as member is expiring

Basically, Smith only reworded Stroustrup’s definition of a prvalue from ‘does not have identity’ to ‘performs initialization’.

I am still unclear about the following things (so these are my questions):

1. The meaning of Smith’s notations ‘glvalue :: Environment -> (Environment, Location)’ and ‘prvalue :: (Environment, Location) -> Environment’.
2. Why Smith’s expression X{4}.n is not a prvalue under the C++17 standard**, given that evaluating it initializes the complete object X{4} (called ‘temporary object materialization’) and in particular its subobject n.
3. Why Smith’s expression X{4}.n is not a prvalue under the C++11 standard*, given that it denotes a subobject of a temporary object.

Notes

* The value categories of expressions in the C++11 standard, [basic.lval/1] (bold emphasis mine):

An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [ Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue. — end example ]

An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references ([dcl.ref]). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. — end example ]

A glvalue (“generalized” lvalue) is an lvalue or an xvalue.

An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a temporary object ([class.temporary]) or subobject thereof, or a value that is not associated with an object.

A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a reference is a prvalue. The value of a literal such as 12, 7.3e5, or true is also a prvalue. — end example ]

** The value categories of expressions in the C++17 standard, [basic.lval/1] (bold emphasis mine):

A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.

A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of the operand of an operator, as specified by the context in which it appears.

An xvalue is a glvalue that denotes an object or bit-field whose resources can be reused (usually because it is near the end of its lifetime). [ Example: Certain kinds of expressions involving rvalue references yield xvalues, such as a call to a function whose return type is an rvalue reference or a cast to an rvalue reference type. — end example ]

An lvalue is a glvalue that is not an xvalue.

An rvalue is a prvalue or an xvalue.
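
One way to see the grouping "an rvalue is a prvalue or an xvalue" in practice is overload resolution: an rvalue reference parameter accepts both xvalues and prvalues, while a non-const lvalue reference parameter accepts only lvalues. The sketch below is an illustration, not part of the standard text; Bound, bound_as, and demo are hypothetical names.

```cpp
#include <cassert>
#include <utility>

enum class Bound { Lvalue, Rvalue };

// Hypothetical overload set: int& binds to lvalues only,
// int&& binds to rvalues (xvalues and prvalues alike).
Bound bound_as(int&)  { return Bound::Lvalue; }
Bound bound_as(int&&) { return Bound::Rvalue; }

void demo() {
    int v = 0;
    assert(bound_as(v) == Bound::Lvalue);             // v is an lvalue
    assert(bound_as(std::move(v)) == Bound::Rvalue);  // std::move(v) is an xvalue
    assert(bound_as(v + 1) == Bound::Rvalue);         // v + 1 is a prvalue
}
```

The overloads cannot tell an xvalue from a prvalue, which matches the wording quoted above: both are rvalues.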

[basic.lval]/1, while normative in form, is not really normative in its content and can be replaced with just an enumeration of the "primary" value categories and how they combine into "secondary" ones, with zero impact on the rest of the standard. «denotes an object or bit-field whose resources can be reused» etc. is just fluff giving you an approximate idea of how such expressions are informally treated, but it can't really be used to draw any normative conclusions.
– Language Lawyer, Commented Sep 9, 2021 at 22:37
Re. 1, that looks like Haskell syntax. My wishy-washy interpretation: a glvalue is something that takes the Environment, adds a new object to it, and returns the new Environment and the Location of that object; a prvalue is something that takes the Environment and the Location, initializes the Location (which modifies the Environment), and returns the new Environment.
– danadam, Commented Sep 9, 2021 at 22:58
@danadam A glvalue doesn't add a new object; it extracts an existing object's location from the environment.
– Language Lawyer, Commented Sep 9, 2021 at 23:00
To focus on your example: X{4} is a prvalue by definition from expr.type.conv/2, x.n is an lvalue by definition from expr.ref/6.2, and X{4}.n is an xvalue by the same expr.ref/6.2 sentence. basic.lval plays no role.
– Cubbi, Commented Sep 10, 2021 at 2:51
@Cubbi Thanks a lot for the precise references. I have just added a little more detail to the post (although the questions remain the same).
– Géry Ogam, Commented Sep 10, 2021 at 8:02
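
The classifications of the three expressions in Smith's example can be checked with a compiler. This is a minimal sketch that uses decltype((e)), which yields T for a prvalue, T& for an lvalue, and T&& for an xvalue. The paper only declares x with extern; it is defined here (an assumption for the sake of a complete file).

```cpp
#include <type_traits>

struct X { int n; };
X x{0};  // the paper writes `extern X x;`; defined here so the file is complete

// X{4}: prvalue, so decltype((X{4})) is the non-reference type X.
static_assert(std::is_same<decltype((X{4})), X>::value, "X{4} is a prvalue");

// x.n: member of an lvalue, so it is an lvalue.
static_assert(std::is_same<decltype((x.n)), int&>::value, "x.n is an lvalue");

// X{4}.n: member of a (materialized) temporary, so it is an xvalue.
static_assert(std::is_same<decltype((X{4}.n)), int&&>::value, "X{4}.n is an xvalue");
```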

Davis Herring · Accepted Answer · 2021-09-10 02:38:43Z

This has been largely answered in the comments, but to elaborate:

1. The semantics of any imperative system can be expressed without side effects by considering the state of “the world” (starting with all of RAM) as an argument to a function and as (part of) its return value. This notation indicates that evaluating a glvalue selects an address (the identity of an object) from that environment (and possibly alters it) whereas evaluating a prvalue requires such a location and alters the environment to contain an initialized object there (possibly with other side effects).
2. X{4}.n doesn’t initialize n (with what, itself?); it allows access to (i.e., identifies) the value established by just X{4} (which is materialized so as to have a particular n to identify).
3. You’re right about its temporary status, but that just makes it an rvalue; a prvalue is an rvalue that is not also an xvalue.
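
The state-passing reading of Smith's two signatures can be modeled with a toy program. This is only a sketch under loud assumptions: Environment is reduced to a map from named locations to int values, and every name below (Location, Environment, eval_glvalue_x_n, eval_prvalue_4, demo) is hypothetical and comes neither from the paper nor from the standard.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model: the "world" maps locations to the values stored there.
using Location = std::string;
using Environment = std::map<Location, int>;

// glvalue :: Environment -> (Environment, Location)
// Evaluating the glvalue `x.n` picks out the location of an existing object.
// The environment is threaded through so that side effects could be recorded.
std::pair<Environment, Location> eval_glvalue_x_n(Environment env) {
    return std::make_pair(std::move(env), Location("x.n"));
}

// prvalue :: (Environment, Location) -> Environment
// Evaluating the prvalue `4` needs a target location supplied by the context
// and returns an environment holding an initialized object at that location.
Environment eval_prvalue_4(Environment env, const Location& target) {
    env[target] = 4;
    return env;
}

int demo() {
    Environment env{{"x.n", 1}};
    // Models `int y = 4;`: the declaration supplies "y", the prvalue fills it.
    env = eval_prvalue_4(std::move(env), "y");
    // Models naming `x.n`: the result is a location already in the environment.
    std::pair<Environment, Location> r = eval_glvalue_x_n(std::move(env));
    return r.first.at(r.second) + r.first.at("y");
}
```

In this model a prvalue has no location of its own: the caller decides where the object goes, which is the property that guaranteed copy elision builds on.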

3. Note that X{4}.n was a prvalue in C++11, till DR616 turned it into an xvalue.
@Maggyero: n is initialized as one of the effects of evaluating X{4}.n, but it’s not initialized by the expression as a whole; it would be silly to say that std::cout << 0; is a prvalue because it initializes a sentry object. That X{4} can be seen as describing an identity is one of the main points of confusion Richard resolved: it doesn’t anymore because you can write X x=X(X{4}); and still get only one object that is conceived by the definition, not any part of the initializer.
@Maggyero: In C++11, it denotes the last of several temporaries created, one from another. In C++17, it’s all one prvalue that has yet to materialize anything.
@Maggyero: True, but that doesn’t make the (unconverted) expression an xvalue any more than 'a'+0 makes 'a' itself an int, and (in case it’s not clear) the converted xvalue isn’t used to initialize anything.
@Maggyero prvalues do not create objects at all. They initialize them. And can initialize both temporary and non-temporary objects.
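
The claim that X x=X(X{4}); yields only one object can be checked with a type that can be neither copied nor moved. The sketch below assumes a C++17 (or later) compiler; it is rejected under C++11 or C++14 because the deleted move constructor would be required. Pinned and constructions_for_nested_init are hypothetical names, and Pinned stands in for the X of the example so that constructor calls can be counted.

```cpp
#include <cassert>

// Hypothetical type: neither copyable nor movable; counts constructor runs.
struct Pinned {
    static int constructions;
    int n;
    Pinned(int v) : n(v) { ++constructions; }
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};
int Pinned::constructions = 0;

int constructions_for_nested_init() {
    Pinned::constructions = 0;
    // C++17: the nested prvalues do not materialize temporaries;
    // together they initialize x directly, so no copy or move is needed.
    Pinned x = Pinned(Pinned{4});
    assert(x.n == 4);
    return Pinned::constructions;
}
```

Under the C++17 rules the function returns 1: the only object ever constructed is x itself.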
