constexpr coroutines

This paper is proposing making coroutines functional during constant evaluation. Even when most of use-cases for coroutines are based on I/O and event based, coroutines are still useful for compile-time based computation, eg. std::generator.

Changes

Timeline

EWG in Wroclaw

Poll: P3367 constexpr coroutines send to CWG and LEWG for inclusion for C++26
SFFNASA
412300

Result: Consensus

LEWG in Wroclaw

Poll: Forward P3367R1 to LWG for C++26
SFFNASA
125000

Result: Unanimous consent

CWG in Wroclaw

Multiple implementers expressed concerns about resources needed for the implementation of this feature, where the resources might be better spent implementing C++ features where market demand is higher.

EWG is requested to reconsider putting this feature into the standard.

Note: CWG did review of the wording and only question was about what to do with allocation / deallocation functions. Wording was updated to elide these calls during constant evaluation of a coroutine. Concern about resources was reflected by updating the paper to contain informations about alternative implementation strategies.

Based on anecdotal evidence people don't use coroutines for two reasons: one is missing standard library support (which is partially resolved with std::generator) and second is mutual exclusivity with much more popular constant evaluated code. Having constant evaluation of couroutines will help alleviate this and I do expect seeing libraries implementing lazy coroutine based parsers.

Quote

Well, you just told me coroutines are the best way to solve some problems, so wouldn't I also want to use the Best Way at compile time? (quote from Jason Turner, co-author of "constexpr all the things" talk)

Simple example

When this paper is merged into the standard, users will be able to use std::generator-like coroutines to generate or calculate data.

template <typename T> constexpr auto fib() -> std::generator<T> {
    T a = 0;
    T b = 1;
    co_yield a;
    do {
        co_yield b;
        auto tmp = b;
        b += a;
        a = tmp;
    } while (true);
}

template <typename T, size_t N> constexpr auto calculate_fibonnaci() {
    auto res = std::array<T, N>{};
    std::ranges::copy(fib<T>() | std::views::take(N), res.begin());
    return res;
}

constexpr auto cached_fibonnaci = calculate_fibonnaci<unsigned, 20>();

Evaluation of constexpr coroutines

Implementation must make sure to avoid stack exhaustion and store evaluation state and coroutine's local variables in a way to avoid it. By simply resuming coroutine and evaluating it on system stack (in case of AST walking implementations) will lead to it.

A B C E suspend D Y X F X Y A B suspend D ? ? F suspend non-coroutine code suspended coroutine code non-coroutine code coroutine code coroutine systemstack (unrelated) suspend ret ret ret suspend suspend

This is because coroutines can be interlieved and their state must be maintained to be resumed. The stack then contains mixed coroutines and normal code together. To avoid this situation the easiest way to model a coroutine is to use coroutine (or coroutine-like functionality) to store the state (represented with values on stack) somewhere else. Because AST walk is unbounded, obvious first choice is a stackfull coroutine (fiber, not a thread!).

Lifetime of coroutine is bounded to be only within constant evaluation similary as memory allocation. Any coroutine leaking outside boundaries of constant evaluation means whole constant evalution failed.

suspend X Y ? ? D F suspend A B suspend non-coroutine code non-coroutine code switch stack systemstack stackfullcoroutine (unrelated)

Implementation experience

Partially implemented in clang available on my github, implementation should be ready for its presentation at Wroclaw meeting, and also will be soon available on compiler explorer (thanks Matt!).

Most of functionality needed was already present in Clang, it was mostly about removing checks in the parser.

Another part was implementing the functionality in Clang's interpreter and there I needed to add fibers (stackfull coroutines) as the interpreter recursive walks over AST. Ability to save interpreter's stack content did minimize impact of the change to only resuming, suspending, and variable storage and life-time management.

At the end of evaluation the interpret needs to check objects holding fibers if there is still any coroutine not released, if there is it report similar error as when there is an unreleased memory allocation.

Hardest problem was implementing local "stack", as createLocal function was designed around idea of having only one branch of evaluation. This I solved by providing context of currently evaluated coroutine in EvalInfo and switching it on every suspension / resume of a coroutine.

Alternative implementation approaches

The implementation is using fibers (not threads!) to create a storage for execution of coroutines outside of main stack. Main requirement here is to make sure jumping between coroutines won't run out of the system stack.

Following subpart of paper shows alternative approaches which can model coroutines. All of them have same expressive ability and model coroutines well. These implementation has various advantages and costs. For every model of constant evaluation should be possible found good and usable match.

Byte-code interpreter

Clang implementation is starting to use a byte-code virtual machine based interpreter. In such implementation the implementation approach is identical for actually compiling coroutines to a native code. Only thing you need is ability to change current stack / or reference coroutine variables based on offset to coroutine frame.

Such interpret has many advantages over AST walking one mostly in speed and security. I was told initial measurements in the byte-code interpreter in clang are three orders of magnitude faster over recursive AST walking. Security is inherit to fact of not using system stack for the evaluation at all.

C++ coroutines

For compilers which are using C++20 code itself another implementation strategy is to not use the system stack for walking AST, but change all necessory functions into coroutines which would model ordinary functions with only one return point to its callee. But these functions can be suspended and evaluation can switch to evaluation of a constexpr coroutine. Because all coroutines will be allocated on heap, this also hardens implementation's constexpr evaluator.

(unused)systemstack non-coroutine code a s y m m e t r i c t r a n s f e r evalation order coroutine code symmetric transfer X Y ? ? D F A B suspend suspend

AST transformation / split

Last feasible approach is to implement a transformation of coroutine function, into multiple trees, representing all possible paths which can be taken from each suspension point. Everytime such splitted coroutine is suspended, constexpr evaluator can unroll all its recursive functions. Implementation then needs to implement a storage for coroutine local variables which survive suspension. Hardest part is to implement special AST nodes to represent expressions split for suspension in middle.

A B C E D F A B C F E D ret ret ret ret ret ret suspend suspend suspend suspend suspend suspend suspend coroutine transform

Note: similar split is already done in compiler (for Clang at LLVM level, GCC just before leaving frontend, for Swift in its frontend)

Comparison

ImplementationStorage forArea needed to be changed
VariablesState
Stackfull
coroutines
overlay over local variablesfiber (alternative stack on a heap)builtins to control coroutine flow, execution state to keep track of allocations
Byte-codecoroutine frameinstruction pointercontrol of instruction pointer
Stackless
coroutines
overlay over local variablessuspended coroutinesconvert all AST walking functions into coroutines
AST transformationcoroutine framesplitted AST treesadd special AST nodes to glue expressions across suspensions

Impact on existing code

None, this is a pure extension, it allows code to be constexpr which wasn't case before.

Intention for wording changes

Remove all obstacles blocking coroutines from being constant evaluatable. Make sure all coroutines are destroyed at end of constant evaluation.

Proposed wording changes

7.7 Constant expressions [expr.const]

An expression E is a core constant expression unless the evaluation of E, following the rules of the abstract machine ([intro.execution]), would evaluate one of the following:

9.2.6 The constexpr and consteval specifiers [dcl.constexpr]

The constexpr specifier shall be applied only to the definition of a variable or variable template or the declaration of a function or function template.
The consteval specifier shall be applied only to the declaration of a function or function template.
A function or static data member declared with the constexpr or consteval specifier on its first declaration is implicitly an inline function or variable ([dcl.inline]).
If any declaration of a function or function template has a constexpr or consteval specifier, then all its declarations shall contain the same specifier.
[Note 1: 
An explicit specialization can differ from the template declaration with respect to the constexpr or consteval specifier.
— end note]
[Note 2: 
Function parameters cannot be declared constexpr.
— end note]
[Example 1: constexpr void square(int &x); // OK, declaration constexpr int bufsz = 1024; // OK, definition constexpr struct pixel { // error: pixel is a type int x; int y; constexpr pixel(int); // OK, declaration }; constexpr pixel::pixel(int a) : x(a), y(x) // OK, definition { square(x); } constexpr pixel small(2); // error: square not defined, so small(2) // not constant ([expr.const]) so constexpr not satisfied constexpr void square(int &x) { // OK, definition x *= x; } constexpr pixel large(4); // OK, square defined int next(constexpr int x) { // error: not for parameters return x + 1; } extern constexpr int memsz; // error: not a definition — end example]
A constexpr or consteval specifier used in the declaration of a function declares that function to be a constexpr function.
[Note 3: 
A function or constructor declared with the consteval specifier is an immediate function ([expr.const]).
— end note]
A destructor, an allocation function, or a deallocation function shall not be declared with the consteval specifier.
A function is constexpr-suitable unless if
  • it is not a coroutine ([dcl.fct.def.coroutine]), and
  • if the function it is a constructor or destructor whose class has its class does not have any virtual base classes.
Except for instantiated constexpr functions, non-templated constexpr functions shall be constexpr-suitable.
[Example 2: constexpr int square(int x) { return x * x; } // OK constexpr long long_max() { return 2147483647; } // OK constexpr int abs(int x) { if (x < 0) x = -x; return x; // OK } constexpr int constant_non_42(int n) { // OK if (n == 42) { static int value = n; return value; } return n; } constexpr int uninit() { struct { int a; } s; return s.a; // error: uninitialized read of s.a } constexpr int prev(int x) { return --x; } // OK constexpr int g(int x, int n) { // OK int r = 1; while (--n > 0) r *= x; return r; } — end example]
An invocation of a constexpr function in a given context produces the same result as an invocation of an equivalent non-constexpr function in the same context in all respects except that
[Note 4: 
Declaring a function constexpr can change whether an expression is a constant expression.
This can indirectly cause calls to std​::​is_constant_evaluated within an invocation of the function to produce a different value.
— end note]
[Note 5: 
It is possible to write a constexpr function for which no invocation satisfies the requirements of a core constant expression.
— end note]
The constexpr and consteval specifiers have no effect on the type of a constexpr function.
[Example 3: constexpr int bar(int x, int y) // OK { return x + y + x*y; } // ... int bar(int x, int y) // error: redefinition of bar { return x * 2 + 3 * y; } — end example]
A constexpr specifier used in an object declaration declares the object as const.
Such an object shall have literal type and shall be initialized.
In any constexpr variable declaration, the full-expression of the initialization shall be a constant expression ([expr.const]).
A constexpr variable that is an object, as well as any temporary to which a constexpr reference is bound, shall have constant destruction.
[Example 4: struct pixel { int x, y; }; constexpr pixel ur = { 1294, 1024 }; // OK constexpr pixel origin; // error: initializer missing — end example]

9.5.4 Coroutine definitions [dcl.fct.def.coroutine]

An implementation may need to allocate additional storage for a coroutine.
This storage is known as the coroutine state and is obtained by calling a non-array allocation function ([basic.stc.dynamic.allocation]).
The allocation function's name is looked up by searching for it in the scope of the promise type.
  • If the search finds any declarations, overload resolution is performed on a function call created by assembling an argument list.
    The first argument is the amount of space requested, and is a prvalue of type std​::​size_t.
    The lvalues are the successive arguments.
    If no viable function is found ([over.match.viable]), overload resolution is performed again on a function call created by passing just the amount of space required as a prvalue of type std​::​size_t.
  • If the search finds no declarations, a search is performed in the global scope.
    Overload resolution is performed on a function call created by passing the amount of space required as a prvalue of type std​::​size_t.
If a search for the name get_return_object_on_allocation_failure in the scope of the promise type ([class.member.lookup]) finds any declarations, then the result of a call to an allocation function used to obtain storage for the coroutine state is assumed to return nullptr if it fails to obtain storage, and if a global allocation function is selected, the ​::​operator new(size_t, nothrow_t) form is used.
The allocation function used in this case shall have a non-throwing noexcept-specifier.
If the allocation function returns nullptr, the coroutine transfers control to the caller of the coroutine and the return value is obtained by a call to T​::​get_return_object_on_allocation_failure(), where T is the promise type.
[Example 2: #include <iostream> #include <coroutine> // ​::​operator new(size_t, nothrow_t) will be used if allocation is needed struct generator { struct promise_type; using handle = std::coroutine_handle<promise_type>; struct promise_type { int current_value; static auto get_return_object_on_allocation_failure() { return generator{nullptr}; } auto get_return_object() { return generator{handle::from_promise(*this)}; } auto initial_suspend() { return std::suspend_always{}; } auto final_suspend() noexcept { return std::suspend_always{}; } void unhandled_exception() { std::terminate(); } void return_void() {} auto yield_value(int value) { current_value = value; return std::suspend_always{}; } }; bool move_next() { return coro ? (coro.resume(), !coro.done()) : false; } int current_value() { return coro.promise().current_value; } generator(generator const&) = delete; generator(generator && rhs) : coro(rhs.coro) { rhs.coro = nullptr; } ~generator() { if (coro) coro.destroy(); } private: generator(handle h) : coro(h) {} handle coro; }; generator f() { co_yield 1; co_yield 2; } int main() { auto g = f(); while (g.move_next()) std::cout << g.current_value() << std::endl; } — end example]
The coroutine state is destroyed when control flows off the end of the coroutine or the destroy member function ([coroutine.handle.resumption]) of a coroutine handle ([coroutine.handle]) that refers to the coroutine is invoked.
In the latter case, control in the coroutine is considered to be transferred out of the function ([stmt.dcl]).
The storage for the coroutine state is released by calling a non-array deallocation function ([basic.stc.dynamic.deallocation]).
If destroy is called for a coroutine that is not suspended, the program has undefined behavior.
The deallocation function's name is looked up by searching for it in the scope of the promise type.
If nothing is found, a search is performed in the global scope.
If both a usual deallocation function with only a pointer parameter and a usual deallocation function with both a pointer parameter and a size parameter are found, then the selected deallocation function shall be the one with two parameters.
Otherwise, the selected deallocation function shall be the function with one parameter.
If no usual deallocation function is found, the program is ill-formed.
The selected deallocation function shall be called with the address of the block of storage to be reclaimed as its first argument.
If a deallocation function with a parameter of type std​::​size_t is used, the size of the block is passed as the corresponding argument.
During constant evaluation all calls to allocation and deallocation functions found to provide the coroutine state are elided.

17.12 Coroutines [support.coroutine]

17.12.1 General [support.coroutine.general]

The header <coroutine> defines several types providing compile and run-time support for coroutines in a C++ program.

17.12.2 Header <coroutine> synopsis [coroutine.syn]

// all freestanding #include <compare> // see [compare.syn] namespace std { // [coroutine.traits], coroutine traits template<class R, class... ArgTypes> struct coroutine_traits; // [coroutine.handle], coroutine handle template<class Promise = void> struct coroutine_handle; // [coroutine.handle.compare], comparison operators constexpr bool operator==(coroutine_handle<> x, coroutine_handle<> y) noexcept; constexpr strong_ordering operator<=>(coroutine_handle<> x, coroutine_handle<> y) noexcept; // [coroutine.handle.hash], hash support template<class T> struct hash; template<class P> struct hash<coroutine_handle<P>>; // [coroutine.noop], no-op coroutines struct noop_coroutine_promise; template<> struct coroutine_handle<noop_coroutine_promise>; using noop_coroutine_handle = coroutine_handle<noop_coroutine_promise>; constexpr noop_coroutine_handle noop_coroutine() noexcept; // [coroutine.trivial.awaitables], trivial awaitables struct suspend_never; struct suspend_always; }

17.12.3 Coroutine traits [coroutine.traits]

17.12.3.1 General [coroutine.traits.general]

Subclause [coroutine.traits] defines requirements on classes representing coroutine traits, and defines the class template coroutine_traits that meets those requirements.

17.12.3.2 Class template coroutine_traits [coroutine.traits.primary]

The header <coroutine> defines the primary template coroutine_traits such that if ArgTypes is a parameter pack of types and if the qualified-id R​::​promise_type is valid and denotes a type ([temp.deduct]), then coroutine_traits<R, ArgTypes...> has the following publicly accessible member: using promise_type = typename R::promise_type;
Otherwise, coroutine_traits<R, ArgTypes...> has no members.
Program-defined specializations of this template shall define a publicly accessible nested type named promise_type.

17.12.4 Class template coroutine_handle [coroutine.handle]

17.12.4.1 General [coroutine.handle.general]

namespace std { template<> struct coroutine_handle<void> { // [coroutine.handle.con], construct/reset constexpr coroutine_handle() noexcept; constexpr coroutine_handle(nullptr_t) noexcept; constexpr coroutine_handle& operator=(nullptr_t) noexcept; // [coroutine.handle.export.import], export/import constexpr void* address() const noexcept; static constexpr coroutine_handle from_address(void* addr); // [coroutine.handle.observers], observers constexpr explicit operator bool() const noexcept; constexpr bool done() const; // [coroutine.handle.resumption], resumption constexpr void operator()() const; constexpr void resume() const; constexpr void destroy() const; private: void* ptr; // exposition only }; template<class Promise> struct coroutine_handle { // [coroutine.handle.con], construct/reset constexpr coroutine_handle() noexcept; constexpr coroutine_handle(nullptr_t) noexcept; static constexpr coroutine_handle from_promise(Promise&); constexpr coroutine_handle& operator=(nullptr_t) noexcept; // [coroutine.handle.export.import], export/import constexpr void* address() const noexcept; static constexpr coroutine_handle from_address(void* addr); // [coroutine.handle.conv], conversion constexpr operator coroutine_handle<>() const noexcept; // [coroutine.handle.observers], observers constexpr explicit operator bool() const noexcept; constexpr bool done() const; // [coroutine.handle.resumption], resumption constexpr void operator()() const; constexpr void resume() const; constexpr void destroy() const; // [coroutine.handle.promise], promise access constexpr Promise& promise() const; private: void* ptr; // exposition only }; }
An object of type coroutine_handle<T> is called a coroutine handle and can be used to refer to a suspended or executing coroutine.
A coroutine_handle object whose member address() returns a null pointer value does not refer to any coroutine.
Two coroutine_handle objects refer to the same coroutine if and only if their member address() returns the same non-null value.
If a program declares an explicit or partial specialization of coroutine_handle, the behavior is undefined.

17.12.4.2 Construct/reset [coroutine.handle.con]

constexpr coroutine_handle() noexcept; constexpr coroutine_handle(nullptr_t) noexcept;
Postconditions: address() == nullptr.
static constexpr coroutine_handle from_promise(Promise& p);
Preconditions: p is a reference to a promise object of a coroutine.
Postconditions: addressof(h.promise()) == addressof(p).
Returns: A coroutine handle h referring to the coroutine.
constexpr coroutine_handle& operator=(nullptr_t) noexcept;
Postconditions: address() == nullptr.
Returns: *this.

17.12.4.3 Conversion [coroutine.handle.conv]

constexpr operator coroutine_handle<>() const noexcept;
Effects: Equivalent to: return coroutine_handle<>​::​from_address(address());

17.12.4.4 Export/import [coroutine.handle.export.import]

constexpr void* address() const noexcept;
Returns: ptr.
static constexpr coroutine_handle<> coroutine_handle<>::from_address(void* addr);
Preconditions: addr was obtained via a prior call to address on an object whose type is a specialization of coroutine_handle.
Postconditions: from_address(address()) == *this.
static constexpr coroutine_handle<Promise> coroutine_handle<Promise>::from_address(void* addr);
Preconditions: addr was obtained via a prior call to address on an object of type cv coroutine_handle<Promise>.
Postconditions: from_address(address()) == *this.

17.12.4.5 Observers [coroutine.handle.observers]

constexpr explicit operator bool() const noexcept;
Returns: address() != nullptr.
constexpr bool done() const;
Preconditions: *this refers to a suspended coroutine.
Returns: true if the coroutine is suspended at its final suspend point, otherwise false.

17.12.4.6 Resumption [coroutine.handle.resumption]

Resuming a coroutine via resume, operator(), or destroy on an execution agent other than the one on which it was suspended has implementation-defined behavior unless each execution agent either is an instance of std​::​thread or std​::​jthread, or is the thread that executes main.
[Note 1: 
A coroutine that is resumed on a different execution agent should avoid relying on consistent thread identity throughout, such as holding a mutex object across a suspend point.
— end note]
[Note 2: 
A concurrent resumption of the coroutine can result in a data race.
— end note]
constexpr void operator()() const; constexpr void resume() const;
Preconditions: *this refers to a suspended coroutine.
The coroutine is not suspended at its final suspend point.
Effects: Resumes the execution of the coroutine.
constexpr void destroy() const;
Preconditions: *this refers to a suspended coroutine.
Effects: Destroys the coroutine ([dcl.fct.def.coroutine]).

17.12.4.7 Promise access [coroutine.handle.promise]

constexpr Promise& promise() const;
Preconditions: *this refers to a coroutine.
Returns: A reference to the promise of the coroutine.

17.12.4.8 Comparison operators [coroutine.handle.compare]

constexpr bool operator==(coroutine_handle<> x, coroutine_handle<> y) noexcept;
Returns: x.address() == y.address().
constexpr strong_ordering operator<=>(coroutine_handle<> x, coroutine_handle<> y) noexcept;
Returns: compare_three_way()(x.address(), y.address()).

17.12.4.9 Hash support [coroutine.handle.hash]

template<class P> struct hash<coroutine_handle<P>>;
The specialization is enabled ([unord.hash]).

17.12.5 No-op coroutines [coroutine.noop]

17.12.5.1 Class noop_coroutine_promise [coroutine.promise.noop]

struct noop_coroutine_promise {};
The class noop_coroutine_promise defines the promise type for the coroutine referred to by noop_coroutine_handle ([coroutine.syn]).

17.12.5.2 Class coroutine_handle<noop_coroutine_promise> [coroutine.handle.noop]

17.12.5.2.1 General [coroutine.handle.noop.general]

namespace std { template<> struct coroutine_handle<noop_coroutine_promise> { // [coroutine.handle.noop.conv], conversion constexpr operator coroutine_handle<>() const noexcept; // [coroutine.handle.noop.observers], observers constexpr explicit operator bool() const noexcept; constexpr bool done() const noexcept; // [coroutine.handle.noop.resumption], resumption constexpr void operator()() const noexcept; constexpr void resume() const noexcept; constexpr void destroy() const noexcept; // [coroutine.handle.noop.promise], promise access constexpr noop_coroutine_promise& promise() const noexcept; // [coroutine.handle.noop.address], address constexpr void* address() const noexcept; private: constexpr coroutine_handle(unspecified); void* ptr; // exposition only }; }

17.12.5.2.2 Conversion [coroutine.handle.noop.conv]

constexpr operator coroutine_handle<>() const noexcept;
Effects: Equivalent to: return coroutine_handle<>​::​from_address(address());

17.12.5.2.3 Observers [coroutine.handle.noop.observers]

constexpr explicit operator bool() const noexcept;
Returns: true.
constexpr bool done() const noexcept;
Returns: false.

17.12.5.2.4 Resumption [coroutine.handle.noop.resumption]

constexpr void operator()() const noexcept; constexpr void resume() const noexcept; constexpr void destroy() const noexcept;
Effects: None.
Remarks: If noop_coroutine_handle is converted to coroutine_handle<>, calls to operator(), resume and destroy on that handle will also have no observable effects.

17.12.5.2.5 Promise access [coroutine.handle.noop.promise]

constexpr noop_coroutine_promise& promise() const noexcept;
Returns: A reference to the promise object associated with this coroutine handle.

17.12.5.2.6 Address [coroutine.handle.noop.address]

constexpr void* address() const noexcept;
Returns: ptr.
Remarks: A noop_coroutine_handle's ptr is always a non-null pointer value.

17.12.5.3 Function noop_coroutine [coroutine.noop.coroutine]

constexpr noop_coroutine_handle noop_coroutine() noexcept;
Returns: A handle to a coroutine that has no observable effects when resumed or destroyed.
Remarks: A handle returned from noop_coroutine may or may not compare equal to a handle returned from another invocation of noop_coroutine.

17.12.6 Trivial awaitables [coroutine.trivial.awaitables]

namespace std { struct suspend_never { constexpr bool await_ready() const noexcept { return true; } constexpr void await_suspend(coroutine_handle<>) const noexcept {} constexpr void await_resume() const noexcept {} }; struct suspend_always { constexpr bool await_ready() const noexcept { return false; } constexpr void await_suspend(coroutine_handle<>) const noexcept {} constexpr void await_resume() const noexcept {} }; }
[Note 1: 
The types suspend_never and suspend_always can be used to indicate that an await-expression either never suspends or always suspends, and in either case does not produce a value.
— end note]

Feature test macros

15.11 Predefined macro names [cpp.predefined]

__cpp_constexpr_coroutines 2024??L

17.3.2 Header <version> synopsis [version.syn]

#define __cpp_lib_constexpr_coroutines 2024??L // also in <coroutine>