Program termination in the presence of multiple threads

Niall Douglas, ned Productions Ltd., Ireland

Jens Gustedt, INRIA and ICube, France

2026-07-05

target

integration into IS ISO/IEC 9899:202y

document history

document number	date	comment
n3917	202607	this paper

License

Creative Commons BY 4.0

Liaison

WG22
Austin Group

Abstract

In many circumstances program termination in the presence of multiple threads (in the C11 thread model) has undefined behavior. The aim of this paper is to replace that undefined behavior by a set of choices that cover current practice in the field. Thereby we provide application programmers with a clearer view of the problem spots that they should take care of when programming multi-threaded programs in a portable way.

1 Introduction

When C introduced threads in C11, many interactions of threads with other components of a program execution were left open. This concerns in particular the interaction with signals, see N3872, and with program termination. For the latter, an execution that has multiple executing threads still has undefined behavior today, simply by omission of a definition for the required behavior.

Clearly, undefined behavior here acts as an extension point to the standard, and existing implementations cope relatively well with the situation. In fact, implementations widely agree on how termination of threads under program termination is managed: threads other than the terminating one (in the following called orphaned threads) continue execution relatively far into the termination and are then cut off when control is returned to the host environment.

The only difference between implementations that we observed seems to be whether or not the threads execute thread-specific storage destructors. The intent of the standard is expressed in 7.30.6.1 p4 for the tss_create function:

4 Destructors associated with thread-specific storage are not invoked at program termination.

This text seems to be relatively clear for orphaned threads since these are still executing their usual thread function. And indeed, since tss_delete might be called by atexit or at_quick_exit handlers, it is important that orphaned threads don’t try to access the tss_t mechanism at all.

For the terminating thread there seems to be some difference in interpretation. Some implementations execute thread-specific destructors if main returns by normal function return. The idea here is probably that the thread-specific code first terminates normally with a return, and that the special case for main, namely that this is also thread termination, then kicks in. Other implementations also execute thread-specific destructors when exit is called directly.
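For contrast, the following minimal sketch shows the one case that is not in dispute: a thread that terminates normally and is joined has its thread-specific storage destructor invoked. The names `key`, `count_dtor`, `worker` and `dtor_calls_after_join` are hypothetical and only serve this illustration. Whether the same destructor would run for a value stored by the thread of main when that thread returns or calls exit is exactly the point on which implementations differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <threads.h>

static tss_t key;               /* hypothetical key for this sketch */
static atomic_int dtor_calls;   /* counts destructor invocations */

/* Destructor bound to the key; only invoked for non-null values. */
static void count_dtor(void *value) {
    (void)value;
    atomic_fetch_add(&dtor_calls, 1);
}

/* Thread function: stores a non-null value, then terminates normally. */
static int worker(void *arg) {
    return tss_set(key, arg) == thrd_success ? 0 : 1;
}

/* Returns the number of destructor calls observed after joining one
   worker, or -1 if the key or the thread could not be created. */
int dtor_calls_after_join(void) {
    static int payload = 42;
    thrd_t t;
    int res = 1;
    atomic_store(&dtor_calls, 0);
    if (tss_create(&key, count_dtor) != thrd_success) return -1;
    if (thrd_create(&t, worker, &payload) != thrd_success) {
        tss_delete(key);
        return -1;
    }
    thrd_join(t, &res);         /* thread exit runs the destructor first */
    assert(res == 0);
    tss_delete(key);
    return atomic_load(&dtor_calls);
}
```

A caller would expect `dtor_calls_after_join()` to report one call; no comparable guarantee exists for values that are still set when the program terminates.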

Obviously, under such a model of termination errors can happen. For example:

Information that is written to objects by orphaned threads may get lost.
Information that is written to files may be lost or mangled.
Race conditions between different orphaned threads or with cleanup code of the terminating thread may occur.
If orphaned threads hold locks or are in the middle of some critical operations, deadlocks or crashes may result.

But all these possible mishaps are not very specific to program termination. They occur equally in other situations and as such need not be specified further.
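The lock hazard from the list above can indeed be reproduced without any program termination. The following sketch, with hypothetical names throughout, lets one thread hold a mutex while another thread, standing in for cleanup code, tries to acquire it with a deadline. The 50 ms timeout is an arbitrary assumption; a plain mtx_lock in the same position would block for as long as the holder keeps the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <threads.h>
#include <time.h>

static mtx_t resource;                /* hypothetical shared resource lock */
static atomic_bool held, release_it;  /* handshake flags */

/* Stand-in for a thread that still owns the lock when cleanup starts. */
static int holder(void *arg) {
    (void)arg;
    int r = mtx_lock(&resource);
    assert(r == thrd_success);
    (void)r;
    atomic_store(&held, true);
    while (!atomic_load(&release_it)) thrd_yield();
    mtx_unlock(&resource);
    return 0;
}

/* Tries to take the lock for about 50 ms while it is held elsewhere and
   returns the result of mtx_timedlock. */
int timed_lock_while_held(void) {
    thrd_t t;
    struct timespec deadline;
    atomic_store(&held, false);
    atomic_store(&release_it, false);
    if (mtx_init(&resource, mtx_timed) != thrd_success) return thrd_error;
    if (thrd_create(&t, holder, NULL) != thrd_success) {
        mtx_destroy(&resource);
        return thrd_error;
    }
    while (!atomic_load(&held)) thrd_yield();  /* wait until lock is taken */

    timespec_get(&deadline, TIME_UTC);
    deadline.tv_nsec += 50000000L;             /* now + 50 ms */
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    int res = mtx_timedlock(&resource, &deadline);
    if (res == thrd_success) mtx_unlock(&resource);

    atomic_store(&release_it, true);           /* let the holder finish */
    thrd_join(t, NULL);
    mtx_destroy(&resource);
    return res;
}
```

Because the holder only releases the mutex after the timed attempt has returned, the attempt is expected to end with `thrd_timedout`.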

The aim of this paper is to fill the gap between the current specification and the existing practice by providing a relatively loose definition for program termination in a hosted environment. Thereby we move the problem from undefined behavior to unspecified behavior, namely we provide requirements for a program execution and termination that chooses opportunistically between different possible behaviors.

2 Proposed model

We propose a relatively simple model for the thread termination under program termination:

There is one dedicated thread, the terminating thread, that triggers program termination (by calling a terminating function) or is triggered for it (by receiving a terminating signal). That thread continues execution, but only by running the cleanup code as required for the different models of program termination (that is, it runs different forms of application handlers, closes files, …). At the end this thread hands control back to the host environment and stops.
Any other thread that is running when a program termination is triggered is orphaned. It continues its normal execution until it is stopped by the host environment. How far this continuation into the program termination goes is unspecified; the only portable assumption is that each orphaned thread has stopped when the host environment regains control.
No specific synchronization is foreseen between orphaned threads or with the terminating thread. If no other precaution is taken:
- Side effects that do not happen before the termination event may or may not be visible to other threads or to the host environment. This concerns for example storing into objects and writing to files.
- Threads that hold critical resources such as locks on mutexes, lockful atomics, IO devices or unique file names may cause serious damage to the program execution (race conditions, deadlocks, crashes) or to the entire system (data loss, DoS attacks, overheating, hardware failure).
- In general, thread-specific storage destructors may or may not be invoked.
- More specifically, thread-specific storage destructors are not invoked for _Exit and quick_exit.
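Under this model, an application can sidestep all of the unspecified points by making sure that no thread is orphaned in the first place. A minimal sketch, assuming a fixed and hypothetical number of workers (`WORKERS`) and a trivial thread function, joins every thread it started before the terminating thread goes on to terminate the program:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

enum { WORKERS = 4 };           /* hypothetical worker count */
static atomic_int completed;    /* number of workers that finished */

static int worker(void *arg) {
    (void)arg;
    atomic_fetch_add(&completed, 1);
    return 0;
}

/* Starts up to WORKERS threads and joins each one that was started.
   Returns the number of threads that were started and joined. */
int run_and_join_all(void) {
    thrd_t ids[WORKERS];
    int started = 0;
    atomic_store(&completed, 0);
    for (int i = 0; i < WORKERS; ++i) {
        if (thrd_create(&ids[started], worker, NULL) != thrd_success) break;
        ++started;
    }
    for (int i = 0; i < started; ++i) {
        thrd_join(ids[i], NULL);
    }
    /* Joining synchronizes with thread termination, so all side effects
       of the workers are visible here. */
    assert(atomic_load(&completed) == started);
    return started;
}
```

If main calls `run_and_join_all()` and only then returns, the terminating thread is the only thread left, and all side effects of the workers happen before the termination event.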

3 Existing practice

In the following we visit the major implementations we found that seem to have implementations of C11 threads or that come close to it.

Not covered are mostly freestanding environments, in particular embedded devices, which in general do not have multiple threads or for which program termination is not defined by the C standard.

3.1 POSIX

For the specification of POSIX see

The Open Group Base Specifications Issue 8

POSIX has an additional _exit function that is assumed to be functionally equivalent to C’s _Exit.

_Exit, _exit — terminate a process

POSIX does not prescribe much about termination of orphaned threads, only that _Exit (resp. _exit) and quick_exit do not invoke their thread-specific storage destructors.

Note also that POSIX in its latest version basically requires that C11 threads are provided.

3.1.1 Linux

Linux is an open-source POSIX system with a large distribution that runs on an abundant number of architectures, on smart devices, phones, network devices, personal computers, mainframes and petascale clusters. On a functioning system, in addition to the OS kernel, one of several C libraries and runtime environments is provided.

The Linux kernel has three different groups of system calls to terminate threads. Note that normal thread termination as described for thrd_exit has to wrap such a system call to ensure for example that side effects are synchronized and that thread-specific storage destructors are invoked.

SYS_tkill and SYS_tgkill with signal SIGABRT terminate the current thread or a group of threads abnormally
SYS_exit terminates the current thread normally
similarly SYS_exit_group terminates all current threads of the program execution normally.

All these run without invoking application cleanup handlers of any sort; these are run by the function interfaces that are provided by the C library that is in use.
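That the raw system calls bypass application cleanup can be observed with a small Linux-only sketch. It forks a child so that the surrounding program survives; the child registers an atexit handler and then terminates through `SYS_exit_group` directly. The function name and the status values 7, 98 and 99 are arbitrary choices for this illustration.

```c
#define _GNU_SOURCE             /* for the declaration of syscall() */
#include <assert.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Would change the exit status to 99 if it were ever run. */
static void handler(void) {
    _exit(99);
}

/* Forks a child that terminates through the raw system call and returns
   the child's exit status, or -1 on failure. */
int status_after_raw_exit_group(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        atexit(handler);
        syscall(SYS_exit_group, 7); /* bypasses exit() and its handlers */
        _exit(98);                  /* not reached on Linux */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    assert(WIFEXITED(status));      /* the child exits, it is not killed */
    return WEXITSTATUS(status);
}
```

The parent is expected to see status 7, not 99, since the handler is only known to the C library and never reaches the kernel.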

For a description of the different levels of program termination under Linux see for example

https://linuxvox.com/blog/what-is-the-difference-between-exit-and-exit-group/.

3.1.1.1 The Musl C library

Musl is an open source C library that is POSIX conforming for platforms with the Linux OS kernel. It is quite light-weight and mostly used for small Linux devices with storage constraints.

Their strategies for terminating threads under program termination are the following:

abort and a SIGABRT signal do not interact with other threads and go as fast as possible into a system call (SYS_tkill) that terminates the current thread and (because the signal is SIGABRT) its group.
_Exit first exits the threads of the current execution (SYS_exit_group) with the same exit code as the function argument. If that still does not end the terminating thread, SYS_exit is called in an infinite loop.
quick_exit first runs the at_quick_exit handlers and only then terminates other threads by calling _Exit
exit proceeds as follows
1. It locks an atomic spinlock to avoid concurrent calls to exit
2. It calls the atexit handlers.
3. It launches cleanup handlers for dynamically linked libraries.
4. It terminates all pending IO.
5. It calls _Exit.

So all these functions have the property that orphaned threads continue running while program termination has been started. These orphaned threads are only terminated shortly before control is handed back to the host environment.
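The five steps listed for exit can be summarized in a simplified model. The sketch below is not Musl's source: `model_atexit` and `model_exit` are hypothetical stand-ins, the handler table has an assumed fixed size, and the cleanup for dynamically linked libraries is only marked by a comment. The helper `child_exit_status` forks a child to run the model, so that the handler invocation and the exit status can be observed from outside.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MAX_HANDLERS = 32 };                   /* assumed table size */
static void (*handlers[MAX_HANDLERS])(void);
static int handler_count;
static atomic_flag exiting = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for atexit. */
int model_atexit(void (*f)(void)) {
    assert(f != NULL);
    if (handler_count == MAX_HANDLERS) return -1;
    handlers[handler_count++] = f;
    return 0;
}

/* Hypothetical stand-in for exit, following the five steps above. */
_Noreturn void model_exit(int code) {
    /* 1. Only one caller proceeds; later callers spin until the end. */
    while (atomic_flag_test_and_set(&exiting)) { }
    /* 2. Run the handlers in reverse order of registration. */
    while (handler_count > 0) handlers[--handler_count]();
    /* 3. Cleanup for dynamically linked libraries is not modelled. */
    /* 4. Flush all open output streams. */
    fflush(NULL);
    /* 5. Only this call stops the other threads of the process. */
    _Exit(code);
}

static int report_fd = -1;

/* Handler that tells the parent process that it has been run. */
static void report(void) {
    if (write(report_fd, "h", 1) != 1) _Exit(99);
}

/* Runs model_exit(code) in a child process. Returns the child's exit
   status if the handler was run, and -1 otherwise. */
int child_exit_status(int code) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        model_atexit(report);
        model_exit(code);
    }
    close(fds[1]);
    char c = 0;
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    if (n != 1 || c != 'h' || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

The point of the model is the ordering: any other thread of the process would keep running through steps 1 to 4 and would only be stopped by the final `_Exit`.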

3.1.1.2 The Gnu C library

The Gnu C library is the traditional open source C library that is used for Linux desktop and server systems. Its scope is larger than that, since it implements a POSIX-conforming C library for general platforms and is also used for emulated POSIX user spaces on top of other OSes. There is a general interface and two specialized ones for the Linux OS kernel and the Hurd kernel. The approach is mostly similar to that of Musl, except that for the Hurd OS, _Exit issues a system call named task_terminate that seems to have similar properties as the Linux call SYS_exit_group. For the general, less specific POSIX interface all termination falls back to the SIGABRT signal instead.

3.1.2 FreeBSD, NetBSD, OpenBSD

These forks of the historical BSD system are integrated platforms that provide an OS kernel and a C library. The system call that terminates a program execution is the _exit interface of POSIX. No particular handling of orphaned threads is documented for that call.

The implementation of the other terminating functions follows the same strategy as Musl, that is, no particular precaution is taken for orphaned threads. These continue into program termination until _exit is called.

3.1.2.1 Bionic (Android)

A modified Linux kernel and the Bionic C library are the components of interest in our context of the Android system for phones and other hand-held devices. Bionic is in large parts a fork of the FreeBSD C library and has a behavior for program termination that follows theirs.

3.1.3 Minix 3

This is a POSIX compatible OS that was originally conceived for educational purposes, but has also found its way into the Intel Management Engine for the low-level control of processor chipsets. It does not provide the C11 thread interfaces, but supports POSIX threads.

The userspace C library is similar to those of the BSD family. _exit and _Exit emit a system call PM_EXIT for termination. If that system call returns, a call to an invalid function pointer (called suicide) is issued, after which (if the program still executes) an infinite loop is entered.

For other terminating functions, the strategies are as described for the other C libraries operating on POSIX; in general orphaned threads are terminated late, no special cleanup for them is performed by any of these functions.

3.2 AIX

AIX is IBM’s proprietary UNIX OS. Its thread termination policies are explained here

Terminating threads

In general, the same model for terminating orphaned threads as described for the POSIX platforms above seems to be followed.

3.3 z/OS

This is IBM’s family of OS for their Z series of mainframes. Among these, only z/OS UNIX (a POSIX implementation) seems to support multi-threading. Since this is not an open-source system, our knowledge about this system is restricted to the documentation that is accessible online, for example

https://www.ibm.com/docs/en/zos/3.2.0?topic=whdt-thread-termination.

This documentation seems to indicate that for program termination similar strategies as above are applied for exit, _exit and, by extrapolation, for _Exit.

In particular, for exit, orphaned threads continue their execution while the terminating thread performs the cleanup, and for _exit no thread-specific storage destructors are called.

This system does not seem to implement quick_exit.

3.4 MS Windows

There are two OS functions ExitProcess and TerminateProcess that unconditionally cause a program execution to terminate. Both stop execution of all threads before returning control to the host environment. Besides MS Windows specific properties, the difference between the two seems to be that ExitProcess indicates normal program termination whereas TerminateProcess indicates abnormal program termination. It seems that _Exit (and a similar _exit function) directly builds on ExitProcess to terminate the program execution. It does not call destructors of thread-local objects.

Documentation for the relationship with the other C interfaces is scarce, but it seems that exit calls ExitProcess after running the atexit handlers. So orphaned threads continue execution while the handlers run. But unlike all other implementations discussed so far, it seems that exit calls thread-local destructors, but for the terminating thread only. (The line between thread-local destructors, which seem to refer to C++ destructors, and C’s thread-specific storage destructor functions, tss_dtor_t, is not clear.)

Note that, on some other implementations the thread-specific storage destructors may be called if main returns regularly. So there the model is that a regular return from main first exits the thread and then does the thread exit cleanup; exit is then thought to kick in after that cleanup. MS Windows does not seem to make that conceptual difference.

The specificity of exit versus quick_exit seems not to be documented.

The behavior for abort is not standard conforming, see Microsoft’s own documentation. In fact, if no signal handler for SIGABRT is registered, abort calls _exit and thus reports back to the host environment via ExitProcess as normal instead of abnormal program termination. To emulate standard conforming behavior, applications have to register a signal handler for SIGABRT that calls TerminateProcess.

4 Proposed wording

4.1 Legend

Deletions in the shown standard text are as shown ~~here~~, additions, as shown here. These may be rendered differently according to the style in which the document is shown by your browser but should always be well distinguishable. In the style provided there are two visual distinctions:

A high contrast color palette (Okabe and Ito) namely using colors black, orange, teal green and light yellow
normal text, ~~strike through~~, underlining and typewriter font.

Close to each other proposed changes resemble ~~like~~ this.

4.2 Changes to 5.2.2.3.4 “Program termination”, hosted environment

5.2.2.3.4 Program termination

1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;⁶⁾ reaching the } that terminates the main function returns a value of ~~0~~ EXIT_SUCCESS. Otherwise, terminating the last thread of execution (either by returning from the thread function or by calling thrd_exit) behaves as if the program called the exit function with the status EXIT_SUCCESS at thread termination time; all side effects of the whole program execution are then visible to this last thread. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.

2 Additionally, normal program termination is the effect of calls to the functions exit, quick_exit, or _Exit; abnormal program termination is the effect of the receipt of the signal SIGABRT, of a call to abort() or raise(SIGABRT).^ABRT) Semantics of these forms of program termination are defined in the respective clauses. If any, other implementation-defined functions and signals that have the effect of normal or abnormal program termination are documented by the implementation. Collectively the functions that have such an effect are called termination functions, signals that have such an effect are called termination signals.

^ABRT) Specific rules apply if SIGABRT is handled by a signal handler.

3 An evaluation E in a thread TT terminates the program execution if it is a call to one of the termination functions^EXIT) or if it is interrupted by a termination signal; TT is the terminating thread. TT continues execution by performing the respective cleanup operations^CLEAN) for the specific function call or signal (if any) and then returns control to the host environment; other than the cleanup functionalities required in this document, other implementation-specific functionalities may be executed by TT under program termination. All side effects that happen before E (including those of threads that are known to have terminated execution before E) are visible to the cleanup operations and to the host environment. If the termination happens because E is interrupted by a signal, it is unspecified whether side effects that are not known to have happened before E and that are not otherwise synchronized are visible; specific synchronization semantics concerning signal handlers are described later in this document.

^EXIT) This includes the events described previously that are equivalent to calling exit.

^CLEAN) Such cleanup operations are for example invocations of signal handlers that are bound to a specific signal, atexit and at_quick_exit handlers, and the completion of input/output operations.

4 Other threads that are not known to have terminated before E, if any, are orphaned threads. An orphaned thread OT continues execution for an unspecified amount of time and is then terminated with an implementation-specific mechanism

after all evaluations F in OT that happen before E

before the terminating thread TT returns control to the host environment.

A side effect of OT that is not the termination of OT and that is not synchronized with an evaluation in TT is aberrant; it is unspecified whether any aberrant side effect is visible to the host environment or not. The implementation-specific mechanism that is used to terminate OT causes no other visible side effect than the termination and, possibly, the invocation of thread-specific storage destructors; whether or not such thread-specific storage destructors are invoked is unspecified in general, but restricted for some of the terminating functions (7.25.5).

Recommended practice

5 It is unspecified how far orphaned threads continue into program termination. To avoid data races and deadlocks the following are recommended.

Applications avoid program termination while there are multiple active threads, and thus avoid the presence of orphaned threads. This can be achieved in different ways, for example:

By having main join all other threads by means of thrd_join. The program execution then terminates normally when returning from main.

By detaching all threads and terminating the thread of main by calling thrd_exit instead of returning. The program execution then terminates normally with the last thread.

Applications use as few shared objects in atexit or at_quick_exit handlers and in thread-specific storage destructors as possible.

Orphaned threads avoid writing and reading from streams.

Orphaned threads avoid using tss_set and tss_get on keys that are deleted in atexit or at_quick_exit handlers.

Applications ensure that all locks on mutexes have been released before normal termination of multi-threaded programs.

If an orphaned thread is

waiting to lock a mutex,

waiting for a condition variable,

joining another thread,

yielding execution,

or is calling an input or output function,

the implementation ensures that either the corresponding function never returns to its caller or that the return value indicates that program termination is in progress, if possible.

Forward references: Program semantics (5.2.2.4), Multi-threaded executions and data races (5.2.2.5), Signal handling (7.14), Common definitions (7.22), The atomic_signal_fence function (7.17.4.3), Lock-free atomics (7.17.5), Input/Output (7.24), Communication with the environment (7.25.5), Threads (7.30)

4.3 Changes to 7.25.5.4 `exit`

Add a new paragraph

5’ NOTE It is unspecified how far orphaned threads continue into the program termination. Thus, if during program termination an orphaned thread and an atexit handler access the same non-atomic object and one of them modifies the object, a data race may occur. Also, if an atexit handler attempts to lock a mutex that is held by an orphaned thread, program termination can be delayed until the lock function returns, if ever.

4.4 Changes to 7.25.5.5 `_Exit`

Add a new paragraph

2’ Threads terminated by a call to _Exit do not invoke their thread-specific storage destructors (7.30.6).

4.5 Changes to 7.25.5.7 `quick_exit`

Add a new paragraph

5’ Threads terminated by a call to quick_exit do not invoke their thread-specific storage destructors (7.30.6).

5” NOTE It is unspecified how far orphaned threads continue into the program termination. Thus, if during program termination an orphaned thread and an at_quick_exit handler access the same non-atomic object and one of them modifies the object, a data race may occur. Also, if an at_quick_exit handler attempts to lock a mutex that is held by an orphaned thread, program termination can be delayed until the lock function returns, if ever.

Acknowledgments

Many thanks to Rajan Bhakta for discussions.
