2024-12-01
integration into IS ISO/IEC 9899:202y
document number | date | comment |
---|---|---|
n3392 | 202411 | Original proposal |
n3415 | 202412 | Detailed argumentation, add recommended practice |
CC BY, see https://creativecommons.org/licenses/by/4.0
To determine whether an array is a VLA, the status of the array length expression in the array declarator is decisive. If it is an integer constant expression, the array is not a VLA and all its properties are determined at translation time. Otherwise, it is a VLA and the type expression is evaluated whenever size information is needed. In particular, a type expression of a VM type whose array length expression has side effects may (or may not) be evaluated each time it is reached during execution, and thus the side effects may or may not take place.
We think that this feature is actually a bug in the specification. It seems that real-world examples where side effects appear in array length expressions are rare and most often erroneous. One particular code pattern may add hidden modifications to state: the array length expression is a macro invocation or a function call, and the extent to which side effects occur in these is not properly understood.
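To make the concern concrete, here is a minimal sketch of a side effect hidden behind a function call in an array length expression. The names `call_count`, `next_dimension` and `buffer_size` are hypothetical and serve only as illustration; the sketch assumes an implementation that supports VLA objects with automatic storage duration.

```c
#include <stddef.h>

/* Hypothetical global state that the length expression silently modifies. */
static size_t call_count = 0;

/* Hypothetical helper: reads like a plain size query, but it also
   increments a counter on every call. */
static size_t next_dimension(void) {
    return ++call_count + 1;
}

/* Each time the declaration is reached, the length expression is evaluated
   and the hidden side effect takes place, so the array size differs from
   one call to the next. */
size_t buffer_size(void) {
    unsigned char buffer[next_dimension()]; /* VLA: not an integer constant expression */
    return sizeof buffer;
}
```

Nothing in the declaration of `buffer` hints that two consecutive calls to `buffer_size` yield different sizes.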
Additionally, array length expressions with side effects have a specific caveat: evaluation order. In a declaration where `++i` appears in two array length expressions, the evaluations of the two `++i` expressions are unsequenced, and thus the behavior of that declaration is undefined. If such a side effect is hidden in a macro or function call, not much can be said about the array by inspecting the declaration without prior knowledge of `dimension`: if `dimension` is a macro, something like `dimension(1)` could resolve to an integer constant expression, and the array is then of known constant size. Otherwise it is a VLA.

Unfortunately, the following features are often confused
Note that to determine if a type is a VM type or not, in general the distinction between link time and execution is not relevant, so in the following we will not distinguish these two cases. We are left with the following cascade of properties of array length expressions:
The goal of this paper is to constrain the definition of array declarators such that the last case never happens, but exactly where and how to make the change has yet to be determined.
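The ambiguity around `dimension` discussed above can be sketched as follows. `DIMENSION_MACRO` and `dimension_function` are hypothetical stand-ins for the two possible meanings of `dimension`; they are not taken from the proposal. The sketch assumes C11 or later for `_Static_assert` and an implementation with VLA support.

```c
#include <stddef.h>

/* Hypothetical macro variant: expands to an integer constant expression. */
#define DIMENSION_MACRO(x) ((x) + 1)

/* Hypothetical function variant: the value is only known during execution. */
size_t dimension_function(size_t x) {
    return x + 1;
}

/* Not a VLA: the length is an integer constant expression, so the size
   can already be checked at translation time. */
size_t constant_array_size(void) {
    int a[DIMENSION_MACRO(1)];
    _Static_assert(sizeof a == 2 * sizeof(int), "size known at translation time");
    return sizeof a;
}

/* A VLA: the declaration looks the same, but the length now depends on a
   function call and sizeof is evaluated during execution. */
size_t vla_size(void) {
    int a[dimension_function(1)];
    return sizeof a;
}
```

Both functions return the same value here, but only the first array has a size that is known at translation time.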
The proposed changes are normative.
The idea is to restrict possible array length expressions syntactically, as far as that is possible. The technique is similar to the one already used for “constant expression”. Namely, the term is derived from “conditional expression” and then constrained further as necessary:
`volatile` lvalue, so ban them, too.

Only the last point cannot always be detected at translation time; the called function may be the result of the evaluation of a modifiable function pointer, and may thus change each time the function call expression is reached. Thus this last requirement cannot easily be expressed as a constraint and possibly leads to undefined behavior if we just ban “functions with side effects”.
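The following sketch illustrates why a call through a modifiable function pointer escapes translation-time checking. The pointer `current_dimension`, the two functions it may refer to and the helper `select_dimension` are all hypothetical; VLA support is assumed.

```c
#include <stddef.h>

static size_t hidden_state = 0;

/* Hypothetical callee without side effects. */
static size_t pure_dimension(void) {
    return 4;
}

/* Hypothetical callee that modifies state on each call. */
static size_t impure_dimension(void) {
    return 4 + hidden_state++;
}

/* A modifiable function pointer: the callee may change between evaluations. */
static size_t (*current_dimension)(void) = pure_dimension;

void select_dimension(int use_impure) {
    current_dimension = use_impure ? impure_dimension : pure_dimension;
}

size_t array_size(void) {
    /* The declaration reads the same whichever function is called, so a
       translator cannot in general tell whether a side effect occurs. */
    unsigned char a[current_dimension()];
    return sizeof a;
}
```

After `select_dimension(1)`, successive calls to `array_size` return growing values, although the source of `array_size` has not changed.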
In the following we will assume that we would reach consensus for the first points above, and that we only have to solve the problem of whether we want to allow function calls and, if so, whether we want to restrict these function calls in any way, syntactically or semantically.
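For the first points, code that currently writes side effects directly into array length expressions could be adapted by hoisting them into separate statements. The sketch below assumes a problematic declaration of the shape `double grid[++i][++i];`, which is a hypothetical rendering of the unsequenced pattern; the function name `grid_bytes` is likewise hypothetical.

```c
#include <stddef.h>

/* A declaration such as
       double grid[++i][++i];
   has two unsequenced increments. Hoisting them into separate statements
   gives each one a defined position in the evaluation order. */
size_t grid_bytes(size_t i) {
    size_t const rows = ++i;     /* first increment */
    size_t const columns = ++i;  /* second increment, sequenced after the first */
    double grid[rows][columns];  /* length expressions without side effects */
    return sizeof grid;
}
```

With `i` equal to 1 on entry, the array has 2 rows and 3 columns, and the length expressions themselves no longer modify any state.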
There is a multitude of choices that could be made to improve the situation.
The first option has the following properties:

`volatile` lvalues.

`dimension` above a macro or a function?

In essence this is an implementation-friendly but user-hostile option.
For the decision between the other possibilities, we note that none of these are exclusive in time: variant 3. could later be strengthened to variant 2., and a semantic feature could later be captured by syntax.
For this proposal, we choose the option that is the least restrictive for both implementations and users. Namely, we choose variant 3, allowing function calls that are reproducible, but leaving the responsibility to check them where it is now, namely on the user side. In addition to banning direct side effects, the difference from the previous situation is that implementations have guidance on the behavior that is expected, and that they may start to diagnose suspicious behavior where they can.
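As an illustration of the kind of function call this option has in mind, consider the following sketch. The structure `config` and the functions `cell_count` and `table_bytes` are hypothetical. The `REPRODUCIBLE` macro is also an assumption of this sketch: it expands to the C23 attribute only where the implementation reports support for it, and to nothing otherwise.

```c
#include <stddef.h>

/* Use the C23 attribute where the implementation knows it; elsewhere the
   macro expands to nothing and the property is only documented. */
#if defined(__has_c_attribute)
#  if __has_c_attribute(reproducible)
#    define REPRODUCIBLE [[reproducible]]
#  endif
#endif
#ifndef REPRODUCIBLE
#  define REPRODUCIBLE
#endif

/* Hypothetical configuration record. */
struct config {
    size_t rows;
    size_t columns;
};

/* Hypothetical helper: effectless and idempotent, and the pointed-to
   object is only read (const-qualified base type). */
size_t cell_count(struct config const *cfg) REPRODUCIBLE {
    return cfg->rows * cfg->columns;
}

size_t table_bytes(struct config const *cfg) {
    /* Evaluating the length expression any number of times gives the same
       result and leaves the program state unchanged. */
    double table[cell_count(cfg)];
    return sizeof table;
}
```

Whether `cell_count` really has these properties remains the responsibility of the programmer; the attribute only gives the implementation a chance to diagnose violations.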
So we restrict the possible function calls to the properties that are collected for the `[[reproducible]]` attribute, namely being effectless, idempotent and not having pointer parameters that give access to mutable state. This is the minimum combination of features that is necessary for the desired properties:

`const`-qualified base type ensures that not even modifications to state that is passed into the function may occur.

New text is underlined green, removed text is struck-out red.
Add a new clause 6.5’ before 6.6 (Constant expressions)
6.5’ Reproducible expressions
Syntax
reproducible-expression: conditional-expression
Description
A reproducible expression can be evaluated in any place without changing the observable program state.
Constraints
A reproducible expression shall not be or contain
- an assignment operator,
- an increment operator,
- a decrement operator,
- a conversion of an lvalue with `volatile` type.
Semantics
If a reproducible expression is evaluated and contains a function call expression, the called function shall be effectless and idempotent and no object that is pointed to by an argument of the call shall be modified.FNT)
FNT) That is, the function pointer expression of the call can be converted in place to a function pointer type with a `[[reproducible]]` attribute and where all pointer parameters, if any, are `restrict`-qualified and have a `const`-qualified base type without changing the semantics of the program.
Recommended practice
In contexts that require reproducible expressions with function calls, it is recommended to use functions that are annotated with `[[reproducible]]` and that have pointer parameters with `const`-qualified base types. Where this is possible, it is recommended that implementations diagnose if a function call in a reproducible expression is not effectless, not idempotent or if it modifies an object referred to by one of the arguments to the call.
Replace the grammar term assignment-expression
used in
by reproducible-expression.
If n3414 is accepted concurrently, the additions of the word “assignment” there should instead read “reproducible”.