2026-07-05
integration into IS ISO/IEC 9899:202y
| document number | date | comment |
|---|---|---|
| n3917 | 202607 | this paper |
In many circumstances program termination in the presence of multiple threads (in the C11 thread model) has undefined behavior. The aim of this paper is to replace that undefined behavior by a set of choices that cover current practice in the field. Thereby we provide application programmers with a clearer view of the problem spots that they should take care of when programming multi-threaded programs in a portable way.
When C introduced threads in C11, many interactions of threads with other components of a program execution were left open. This concerns in particular the interaction with signals, see N3872, and with program termination. For the latter, an execution that has multiple executing threads still has undefined behavior today, simply by omission of a definition for the required behavior.
Clearly, undefined behavior here acts as an extension point to the standard, and existing implementations cope relatively well with the situation. In fact, implementations widely agree on how termination of threads under program termination is managed: threads other than the terminating one (in the following called orphaned threads) continue execution relatively far into the termination and are then cut off when control is returned to the host environment.
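The commonly observed behavior can be made visible with a small experiment. The following is a minimal sketch, assuming an implementation with C11 threads on a POSIX host; fork and waitpid are used only by a hypothetical harness, and the names spin, handler and orphan_demo_status are ours. A detached thread keeps counting while an atexit handler runs in the terminating thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX host, needed for fork/waitpid */
#endif
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <threads.h>
#include <time.h>
#include <unistd.h>

static atomic_ulong ticks;   /* progress made by the thread that becomes orphaned */

/* Never returns on its own; it is cut off when the program terminates. */
static int spin(void *arg) {
    (void)arg;
    for (;;) {
        atomic_fetch_add(&ticks, 1);
        thrd_yield();
    }
    return 0;
}

/* Runs in the terminating thread; the other thread may still be running. */
static void handler(void) {
    unsigned long before = atomic_load(&ticks);
    thrd_sleep(&(struct timespec){ .tv_nsec = 10000000 }, NULL);   /* 10 ms */
    unsigned long after = atomic_load(&ticks);
    printf("orphaned thread advanced by %lu ticks during the handler\n",
           after - before);
}

/* Hypothetical harness: runs the scenario in a child process and
   returns the exit status of that child, or -1 on error. */
int orphan_demo_status(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        thrd_t t;
        if (thrd_create(&t, spin, NULL) != thrd_success) _Exit(99);
        thrd_detach(t);
        atexit(handler);
        exit(5);   /* from here on, t is an orphaned thread */
    }
    int st = 0;
    if (waitpid(pid, &st, 0) != pid) return -1;
    return WIFEXITED(st) ? WEXITSTATUS(st) : -1;
}
```

How many ticks the handler reports is not portable; the only thing the sketch relies on is that the child terminates with the status passed to exit, regardless of the thread that is still spinning.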
The only difference between implementations that we observed seems to
be whether or not the threads execute thread-specific storage
destructors. The intent of the standard is expressed in 7.30.6.1 p4 for
the tss_create function:
4 Destructors associated with thread-specific storage are not invoked at program termination.
This text seems to be relatively clear for orphaned threads since
these are still executing their usual thread function. And indeed, since
tss_delete might be called by atexit or at_quick_exit handlers, it is important that
orphaned threads don’t try to access the tss_t mechanism at
all.
For the terminating thread there seems to be some difference in
interpretation. Some implementations execute thread-specific destructors
if main returns by normal function return. The idea here is probably
that first the thread-specific code terminates normally with a return,
and then the special case for main, namely that this is also thread
termination, kicks in. Other implementations also execute
thread-specific destructors when exit is called directly.
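For contrast, the case that is not in dispute is normal thread termination, where the destructor of a thread-specific storage key runs for a thread that has set a non-null value. The following is a minimal sketch of that baseline, assuming an implementation that provides <threads.h>; the names count_and_free, worker and dtor_calls_after_worker are ours.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <threads.h>

static tss_t key;
static atomic_int dtor_calls;   /* number of destructor invocations observed */

/* Destructor registered with tss_create. */
static void count_and_free(void *p) {
    atomic_fetch_add(&dtor_calls, 1);
    free(p);
}

static int worker(void *arg) {
    (void)arg;
    int *slot = malloc(sizeof *slot);
    if (!slot) return 1;
    *slot = 42;
    if (tss_set(key, slot) != thrd_success) {
        free(slot);
        return 1;
    }
    return 0;   /* normal thread termination: the destructor runs for this thread */
}

/* Returns the number of destructor calls seen after one worker has
   terminated normally and has been joined, or -1 on error. */
int dtor_calls_after_worker(void) {
    atomic_store(&dtor_calls, 0);
    if (tss_create(&key, count_and_free) != thrd_success) return -1;
    thrd_t t;
    if (thrd_create(&t, worker, NULL) != thrd_success) {
        tss_delete(key);
        return -1;
    }
    int res = 1;
    if (thrd_join(t, &res) != thrd_success) res = 1;
    int n = atomic_load(&dtor_calls);
    tss_delete(key);
    return res == 0 ? n : -1;
}
```

If the same key were set in the thread of main, whether count_and_free runs when main returns or when exit is called is exactly the point on which the implementations discussed here differ.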
Obviously, under such a model of termination errors can happen. For example:
But all these possible mishaps are not very specific to program termination. They occur equally in other situations and as such need not be specified further.
The aim of this paper is to fill the gap between the current specification and the existing practice by providing a relatively loose definition for program termination in a hosted environment. Thereby we move the problem from undefined behavior to unspecified behavior, namely we provide requirements for a program execution and termination that chooses opportunistically between different possible behaviors.
We propose a relatively simple model for the thread termination under program termination:
There is one dedicated thread, the terminating thread, that triggers (by calling a terminating function) or is triggered (by receiving a terminating signal) for program termination. That thread continues execution, but by running the cleanup code as required for the different models of program termination. (That is, it runs different forms of application handlers, closes files …) At the end this thread switches control back to the host environment and stops.
Any other thread that is running when a program termination is triggered is orphaned. It continues its normal execution until it is stopped by the host environment. How long this continuation into the program termination lasts is unspecified; the only portable assumption is that each orphaned thread has stopped when the host environment regains control.
No specific synchronization is foreseen between orphaned threads or with the termination thread. If no other precaution is taken:
_Exit and quick_exit.
In the following we visit the major implementations we found that seem to implement C11 threads or come close to doing so.
Not covered are mostly freestanding environments, in particular embedded devices, which in general do not have multiple threads or for which program termination is not defined by the C standard.
For the specification of POSIX see
POSIX has an additional _exit
function that is assumed to be functionally equivalent to C’s _Exit.
POSIX does not prescribe much about termination of orphaned threads,
only that _Exit (resp. _exit) and quick_exit do not invoke their
thread-specific storage destructors.
Note also that POSIX in its latest version basically requires that C11 threads are provided.
Linux is an open-source POSIX system with a large distribution that runs on an abundant number of architectures, on smart devices, phones, network devices, personal computers, mainframes and petascale clusters. On a functioning system, in addition to the OS kernel, one of several C libraries and runtime environments is provided.
The Linux kernel has three different groups of system calls to
terminate threads. Note that normal thread termination as described for
thrd_exit has to wrap such a system
call to ensure for example that side effects are synchronized and that
thread-specific storage destructors are invoked.
- SYS_tkill and SYS_tgkill with signal SIGABRT terminate the current thread or a group of threads abnormally
- SYS_exit terminates the current thread normally
- SYS_exit_group terminates all current threads of the program execution normally.

All these run without invoking application cleanup handlers of any sort; these are run by the function interfaces that are provided by the C library that is in use.
For a description of the different levels of program termination under Linux see for example
https://linuxvox.com/blog/what-is-the-difference-between-exit-and-exit-group/.
Musl is an open source C library that is POSIX conforming for platforms with the Linux OS kernel. It is quite light-weight and mostly used for small Linux devices with storage constraints.
Their strategies for terminating threads under program termination are the following:
- abort and a SIGABRT signal do not interact with other threads and go as fast as possible into a system call (SYS_tkill) that terminates the current thread and (because the signal is SIGABRT) its group.
- _Exit first exits the threads of the current execution (SYS_exit_group) with the same exit code as the function argument. If that still does not end the terminating thread, SYS_exit is called in an infinite loop.
- quick_exit first runs the at_quick_exit handlers and only then terminates other threads by calling _Exit
- exit proceeds as follows
  - exit
  - atexit handlers.
  - _Exit.

So all these functions have the property that orphaned threads continue running after program termination has started. These orphaned threads are terminated only shortly before control is handed back to the host environment.
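The strategy described for _Exit can be sketched with raw Linux system calls. This is only an illustration of the two-step scheme, assuming a Linux host; it is not Musl's actual source, and sketch_Exit and sketch_Exit_status are hypothetical names. The harness runs the sketch in a child process so that the resulting status can be observed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: Linux with a libc that declares syscall() */
#endif
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Two-step scheme: end the whole thread group first, then fall back to
   ending the current thread should the first call ever return. */
static _Noreturn void sketch_Exit(int ec) {
    syscall(SYS_exit_group, (long)ec);   /* terminates all threads of the execution */
    for (;;)
        syscall(SYS_exit, (long)ec);     /* fallback: terminates the current thread */
}

/* Hypothetical harness: returns the exit status that a child process
   reports after calling sketch_Exit(ec), or -1 on error. */
int sketch_Exit_status(int ec) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) sketch_Exit(ec);
    int st = 0;
    if (waitpid(pid, &st, 0) != pid) return -1;
    return WIFEXITED(st) ? WEXITSTATUS(st) : -1;
}
```

No handler of any kind runs on this path, which matches the observation that cleanup is the responsibility of the C library functions and not of the system calls.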
The GNU C library is the traditional open source C library that is
used for Linux desktop and server systems. Its scope is larger than
that, since it implements a POSIX-conforming C library for general
platforms and is also used for emulated POSIX user spaces on top of
other OSes. There is a general interface and two specialized ones for
the Linux OS kernel and the Hurd kernel. The approach is mostly similar
to that of Musl, except that for the Hurd OS, _Exit issues a system call named task_terminate that seems to have
properties similar to those of the Linux call SYS_exit_group. For the general,
less specific POSIX interface, all termination falls back to the SIGABRT signal instead.
These forks of the historical BSD system are integrated platforms
that provide an OS kernel and a C library. The system call that
terminates a program execution is the _exit interface of POSIX. No particular
handling of orphaned threads is documented for that call.
The implementation of the other terminating functions follows the
same strategy as Musl, that is, no particular precaution is taken for
orphaned threads. These continue into program termination until _exit is called.
A modified Linux kernel and the Bionic C library are the components of interest in our context of the Android system for phones and other hand-held devices. Bionic is in large parts a fork of the FreeBSD C library and has a behavior for program termination that follows theirs.
This is a POSIX compatible OS that was originally conceived for educational purposes, but has also found its way into the Intel Management Engine for the low-level control of processor chipsets. It does not provide the C11 thread interfaces, but supports POSIX threads.
The userspace C library is similar to those of the BSD family. _exit and _Exit emit a system call PM_EXIT for termination. If that system call
returns, a call to an invalid function pointer (called suicide) is issued, after which (if the
program still executes) an infinite loop is entered.
For other terminating functions, the strategies are as described for the other C libraries operating on POSIX; in general orphaned threads are terminated late, no special cleanup for them is performed by any of these functions.
AIX is IBM’s proprietary UNIX OS. Its thread termination policies are explained here
In general, the same model for terminating orphaned threads as described for the POSIX platforms above seems to be followed.
This is IBM’s family of OS for their Z series of mainframes. Among these, only z/OS UNIX (a POSIX implementation) seems to support multi-threading. Since this is not an open-source system, our knowledge about this system is restricted to the documentation that is accessible online, for example
https://www.ibm.com/docs/en/zos/3.2.0?topic=whdt-thread-termination.
This documentation seems to indicate that for program termination
similar strategies as above are applied for exit, _exit and in extrapolation for _Exit.
In particular, for exit, orphaned
threads continue their execution while the terminating thread performs
the cleanup, and for _exit no
thread-specific storage destructors are called.
This system does not seem to implement quick_exit.
There are two OS functions, ExitProcess and TerminateProcess, that unconditionally cause
a program execution to terminate. Both stop execution of all threads
before returning control to the host environment. Besides MS Windows
specific properties, the difference between the two seems to be that ExitProcess indicates normal program
termination whereas TerminateProcess
indicates abnormal program termination. It seems that _Exit (and a similar _exit function) directly builds on ExitProcess to terminate the program
execution. It does not call destructors of thread-local objects.
Documentation for the relationship with the other C interfaces is
scarce, but it seems that exit calls
ExitProcess after running the atexit handlers. So orphaned threads
continue execution while the handlers run. But different from all other
implementations discussed so far, it seems that exit calls thread-local destructors, but for
the terminating thread only. (The line between thread-local destructors,
which seem to refer to C++ destructors, and C’s thread-specific storage
destructor functions, tss_dtor_t, is not
clear.)
Note that, on some other implementations the thread-specific storage
destructors may be called if main
returns regularly. So there the model is that a regular return from
main first exits the thread and then
does the thread exit cleanup; exit is
then thought to kick in after that cleanup. MS Windows does not seem to
make that conceptual difference.
The specificity of exit versus
quick_exit seems not to be
documented.
The behavior for abort is not
standard conforming, see Microsoft’s
own documentation. In fact, if no signal handler for SIGABRT is registered, abort calls _exit and thus reports back to the host
environment via ExitProcess as normal
instead of abnormal program termination. To emulate standard conforming
behavior, applications have to register a signal handler for SIGABRT that calls TerminateProcess.
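The workaround described above can be sketched as follows. This is a hedged illustration, not a recommendation from Microsoft's documentation: the status value 3 and the names ABNORMAL_STATUS, abort_handler and install_abort_handler are assumptions of ours, and on platforms other than MS Windows the handler deliberately does nothing so that abort proceeds as usual.

```c
#include <signal.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Assumed status value reported to the host; the text does not prescribe one. */
enum { ABNORMAL_STATUS = 3 };

static void abort_handler(int sig) {
    (void)sig;
#ifdef _WIN32
    /* Report abnormal termination to the host instead of letting abort
       fall through to _exit and ExitProcess. */
    TerminateProcess(GetCurrentProcess(), (UINT)ABNORMAL_STATUS);
#endif
    /* On other platforms: returning lets abort continue with its usual
       abnormal program termination. */
}

/* Installs the handler; returns 0 on success and -1 on failure. */
int install_abort_handler(void) {
    return signal(SIGABRT, abort_handler) == SIG_ERR ? -1 : 0;
}
```

An application would call install_abort_handler once, early in main, before any code that might call abort.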
Deletions in the shown standard text are as shown here,
additions, as shown here. These may be rendered differently
according to the style in which the document is shown by your browser
but should always be well distinguishable. In the style provided there
are two visual distinctions:
Close to each other proposed changes resemble
like this.
5.2.2.3.4 Program termination
1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;6) reaching the } that terminates the main function returns a value of 0 EXIT_SUCCESS. Otherwise, terminating the last thread of execution (either by returning from the thread function or by calling thrd_exit) behaves as if the program called the exit function with the status EXIT_SUCCESS at thread termination time; all side effects of the whole program execution are then visible to this last thread. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.
2 Additionally, normal program termination is the effect of calls to the functions exit, quick_exit, or _Exit; abnormal program termination is the effect of the receipt of the signal SIGABRT, of a call abort() or raise(SIGABRT).ABRT) Semantics of these forms of program termination are defined in the respective clauses. If any, other implementation-defined functions and signals that have the effect of normal or abnormal program termination are documented by the implementation. Collectively the functions that have such an effect are called termination functions, signals that have such an effect are called termination signals.
ABRT) Specific rules apply if SIGABRT is handled by a signal handler.
3 An evaluation E in a thread TT terminates the program execution if it is a call to one of the termination functionsEXIT) or if it is interrupted by a termination signal; TT is the terminating thread. TT continues execution by performing the respective cleanup operationsCLEAN) for the specific function call or signal (if any) and then returns control to the host environment; other than the cleanup functionalities required in this document, other implementation-specific functionalities may be executed by TT under program termination. All side effects that happen before E (including those of threads that are known to have terminated execution before E) are visible to the cleanup operations and to the host environment. If the termination happens because E is interrupted by a signal, it is unspecified whether side effects that are not known to have happened before E and that are not otherwise synchronized are visible; specific synchronization semantics concerning signal handlers are described later in this document.
EXIT) This includes the events described previously that are equivalent to calling exit.
CLEAN) Such cleanup operations are for example invocations of signal handlers that are bound to a specific signal, atexit and at_quick_exit handlers, and the completion of input/output operations.
4 Other threads that are not known to have terminated before E, if any, are orphaned threads. An orphaned thread OT continues execution for an unspecified amount of time and is then terminated with an implementation-specific mechanism
- after all evaluations F in OT that happen before E
- before the terminating thread TT returns control to the host environment.

A side effect of OT that is not the termination of OT and that is not synchronized with an evaluation in TT is aberrant; it is unspecified whether any aberrant side effect is visible to the host environment or not. The implementation-specific mechanism that is used to terminate OT causes no other visible side effect than the termination and, possibly, the invocation of thread-specific storage destructors; whether or not such thread-specific storage destructors are invoked is unspecified in general, but restricted for some of the terminating functions (7.25.5).
Recommended practice
5 It is unspecified how far orphaned threads continue into program termination. To avoid data races and deadlocks the following are recommended.
- Applications avoid program termination while there are multiple active threads, and thus avoid the presence of orphaned threads. This can be achieved in different ways, for example:
  - By having main join all other threads by means of thrd_join. The program execution then terminates normally when returning from main.
  - By detaching all threads and terminating the thread of main by calling thrd_exit instead of returning. The program execution then terminates normally with the last thread.
- Applications use as few shared objects in atexit or at_quick_exit handlers and in thread-specific storage destructors as possible.
- Orphaned threads avoid writing and reading from streams.
- Orphaned threads avoid using tss_set and tss_get on keys that are deleted in atexit or at_quick_exit handlers.
- Applications ensure that all locks on mutexes have been released before normal termination of multi-threaded programs.
- If an orphaned thread is
  - waiting to lock a mutex,
  - waiting for a condition variable,
  - joining another thread,
  - yielding execution,
  - or is calling an input or output function,

  the implementation ensures that either the corresponding function never returns to its caller or that the return value indicates that program termination is in progress, if possible.
Forward references: Program semantics (5.2.2.4), Multi-threaded executions and data races (5.2.2.5), Signal handling (7.14), Common definitions (7.22), The atomic_signal_fence function (7.17.4.3), Lock-free atomics (7.17.5), Input/Output (7.24), Communication with the environment (7.25.5), Threads (7.30)
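For illustration only, and not as part of the proposed wording, the first option of the recommended practice (main joins all other threads) can be sketched as below. The sketch assumes an implementation that provides <threads.h>; WORKERS, square and join_all_and_sum are hypothetical names.

```c
#include <threads.h>

enum { WORKERS = 4 };

/* Thread function: returns the square of the int it is given. */
static int square(void *arg) {
    int v = *(int *)arg;
    return v * v;
}

/* Starts WORKERS threads and joins every thread that was started, so
   that no thread is left running when the caller goes on to terminate
   the program. Returns the sum of the results, or -1 on failure. */
int join_all_and_sum(void) {
    thrd_t t[WORKERS];
    int in[WORKERS];
    int started = 0, sum = 0, ok = 1;

    for (int i = 0; i < WORKERS; ++i) {
        in[i] = i + 1;
        if (thrd_create(&t[i], square, &in[i]) != thrd_success) {
            ok = 0;
            break;
        }
        ++started;
    }
    /* Join even on the failure path; otherwise the threads that were
       already started could become orphaned. */
    for (int i = 0; i < started; ++i) {
        int r = 0;
        if (thrd_join(t[i], &r) != thrd_success) ok = 0;
        sum += r;
    }
    return ok ? sum : -1;
}
```

A main function that calls join_all_and_sum and then returns terminates the program while it is the only active thread, so none of the unspecified behavior concerning orphaned threads comes into play.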
exit
Add a new paragraph
5’ NOTE It is unspecified how far orphaned threads continue into the program termination. Thus, if during program termination an orphaned thread and an atexit handler access the same non-atomic object and one of them modifies the object, a data race may occur. Also, if an atexit handler attempts to lock a mutex that is held by an orphaned thread, program termination can be delayed until the lock function returns, if ever.
_Exit
Add a new paragraph
2’ Threads terminated by a call to _Exit do not invoke their thread-specific storage destructors (7.30.6).
quick_exit
Add a new paragraph
5’ Threads terminated by a call to quick_exit do not invoke their thread-specific storage destructors (7.30.6).
5” NOTE It is unspecified how far orphaned threads continue into the program termination. Thus, if during program termination an orphaned thread and an at_quick_exit handler access the same non-atomic object and one of them modifies the object, a data race may occur. Also, if an at_quick_exit handler attempts to lock a mutex that is held by an orphaned thread, program termination can be delayed until the lock function returns, if ever.
Many thanks to Rajan Bhakta for discussions.