Document #: | P3530R0 |
Date: | 2024-12-17 |
Project: | Programming Language C++ |
Audience: |
EWG, Core, LWG |
Reply-to: |
Boleyn Su <boleyn.su@gmail.com> Gašper Ažman <gasper.azman@gmail.com> |
With current standard, it is impossible to safely read uninitialized memory. Thus, data structures or algorithms relying on reading uninitialized memory cannot be implemented in standard C++.
We therefore propose to add a new intrinsic to allow reading uninitialized memory.
The adoption of erroneous behavior [P2795R5] defined the behaviour of reading uninitialized memory to a greater extent than in C++23.
While this is a good thing in the vast majority of cases, it breaks rare legitimate usecases for algorithms that rely on reading uninitialized memory for their performance guarantees (cited below), with no recourse.
We need an intrinsic to mark reading uninitialized memory as intended, not erroneous.
There is a well-known data structure to present sparse sets [Sparse] as shown below.
template <int n>
class SparseSet {
// Invariants:
// - index_of[elements[i]] == i for all 0<=i<size
// - elements[0..size-1] contains all elements in the set.
// These invariants guarantee the correctness.
int elements[n];
int index_of[n];
int size;
public:
() : size(0) {} // we do not initialize elements and index_of
SparseSetvoid clear() { size = 0; }
bool find(int x) {
// assume x in [0, n)
// There is a chance we read index_of[x] before writing to it.
int i = index_of[x];
// The algorithm is correct for an arbitrary value of index_of[x],
// as long as the read itself is not undefined behavior,
// because the invariants guarantee x is in the set if and only if
// the below condition holds.
return 0 <= i && i < size && elements[i] == x;
}
void insert(int x) {
// assume x in [0, n)
if (find(x)) { return; }
// The invariants are maintained.
[x] = size;
index_of[size] = x;
elements++;
size}
void remove(int x) {
// assume x in [0, n)
if (!find(x)) { return; }
// The invariants are maintained.
--;
sizeint i = index_of[x];
[i] = elements[size];
elements[elements[size]] = i;
index_of}
};
The read index_of[x]
in find
above may read uninitialized
memory, which C++26 makes erroneous, without recourse.
It is impossible to implement a data structure supporting the same set of operations with a worst-case time complexity of O(1) for each operation without relying on reading uninitialized memory, to the author’s best knowledge.
The LLVM project already supports
freeze
instruction which is the
lower level equivalent of our proposal [Freeze].
We propose to add a magic function
template <typename T>
::read_maybe_uninitialized(const T& v) noexcept; T std
where T
must be an
implicit-lifetime type.
read_maybe_uninitialized(v)
returns the value of v
if
v
is initialized.
Otherwise, it returns an unspecified value. (This is how LLVM’s
freeze
works.)
This approach is modeled after
start_lifetime_as_array
, but also
marks the storage as having been initialized to an unspecified but valid
value.
template <typename T>
* start_lifetime_as_array_uninitialized(void* v, size_t n) noexcept; T
Implicitly creates an array with element type
T
and length
n
, and the storage is considered
initialized even if no write occured to the storage previously.
The rest as
start_lifetime_as_array
.
A companion function
template <typename T>
* start_lifetime_as_uninitialized(void*); T
is also provided, with the same semantics.
Will be provided when the alternative is chosen.