Extensions for the Programming Language C++ to Support New Character Data Types
ISO/IEC JTC1 SC22 WG21 N1628
Date: July 16 2004
Many users of C++ need to manipulate Unicode character strings.
Unfortunately, there is no C++ standard means to do so.
The ISO C committee has addressed this issue extensively. We should adopt their work, but with those changes necessary for effective use within C++. See ISO C standard TR 19769 "New character types in C" as described in draft report ISO/IEC JTC1 SC22 WG14 N1040 "Extensions for the programming language C to support new character data types" at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1040.pdf.
Summary of WG14 N1040
The document WG14 N1040 provides motivation, macros for reporting ISO 10646 encoding, new typedefs for the 16-bit and 32-bit character types, character and string literals, mixed string concatenation, and four library functions.
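As a rough illustration of the character and string literals summarized above, the following sketch uses the `u` and `U` prefixes in the form later standardized in C++11. It assumes a C++11 compiler; the names `s16`, `s32`, `c16`, `c32`, `length`, and `check_literals` are hypothetical and do not come from N1040 or this paper.

```cpp
#include <cassert>
#include <cstddef>

// u"..." yields an array of char16_t; U"..." yields an array of char32_t.
const char16_t s16[] = u"abc";
const char32_t s32[] = U"abc";

// Character literals use the same prefixes.
const char16_t c16 = u'a';
const char32_t c32 = U'a';

// Hypothetical helper: number of elements before the terminating null.
template <typename T, std::size_t N>
constexpr std::size_t length(const T (&)[N]) { return N - 1; }

// Small self-check of the literal forms above.
inline void check_literals() {
    assert(length(s16) == 3);
    assert(length(s32) == 3);
    assert(c16 == s16[0]);
    assert(c32 == s32[0]);
}
```

Both arrays hold three characters plus a terminating null; only the element type differs.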
Changes for C++
The document WG14 N1040 can be adopted with few changes. The required changes are:
Section 3 "The new typedefs"
Define char16_t to be a typedef to a distinct new type, with the
name _Char16_t, that has the same size and representation as
uint_least16_t. [N1040 defined char16_t as a typedef to
uint_least16_t, which makes it impossible to overload on char16_t separately from that integer type.]
Define char32_t to be a typedef to a distinct new type, with the
name _Char32_t, that has the same size and representation as
uint_least32_t. [N1040 defined char32_t as a typedef to
uint_least32_t, which makes it impossible to overload on char32_t separately from that integer type.]
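The overloading argument can be sketched as follows. This is a minimal illustration, not the proposal's wording: it assumes a C++11 compiler, in which char16_t and char32_t are already distinct built-in types, so it does not use the _Char16_t and _Char32_t names proposed here. The function names `kind` and `check_overloads` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Same size as the integer type, but a different type.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
              "char16_t has the size of uint_least16_t");
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is distinct from uint_least16_t");

// Hypothetical overload set. If char16_t were only a typedef to
// uint_least16_t, the first two declarations would define the same
// function twice and the program would not compile.
std::string kind(std::uint_least16_t) { return "integer"; }
std::string kind(char16_t)            { return "char16_t"; }
std::string kind(char32_t)            { return "char32_t"; }

// Small self-check that each argument type selects its own overload.
inline void check_overloads() {
    assert(kind(std::uint_least16_t{97}) == "integer");
    assert(kind(u'a') == "char16_t");
    assert(kind(U'a') == "char32_t");
}
```

With distinct types, a library can for instance print a char16_t as a character and a uint_least16_t as a number, which a plain typedef would not allow.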
New section 6.5 "The standard template and typedefs"
The standard library will define ...16 and ...32 typedefs, in
analogy to the w... typedefs, for
filebuf, streambuf, streampos, streamoff,
ios, istream, ostream,
fstream, ifstream, ofstream,
stringstream, istringstream, ostringstream,