7.6 Characters

Describes implementation-defined aspects of the Arm® C compiler and C library relating to characters, as required by the ISO C standard.

The number of bits in a byte (3.6).
The number of bits in a byte is 8.
The values of the members of the execution character set (5.2.1).
The values of the members of the execution character set are all the code points defined by ISO/IEC 10646.
The unique value of the member of the execution character set produced for each of the standard alphabetic escape sequences (5.2.2).
Character escape sequences have the following values in the execution character set:
Escape sequence   Char value   Description
\a                7            Attention (bell)
\b                8            Backspace
\t                9            Horizontal tab
\n                10           New line (line feed)
\v                11           Vertical tab
\f                12           Form feed
\r                13           Carriage return
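The table can be checked mechanically. The following is a minimal sketch, in which the function name escape_values_match_table is hypothetical, that compares each alphabetic escape sequence with the value listed above:

```c
/* Returns 1 if every alphabetic escape sequence has the value listed
   in the table, and 0 otherwise. */
int escape_values_match_table(void)
{
    return '\a' == 7  &&
           '\b' == 8  &&
           '\t' == 9  &&
           '\n' == 10 &&
           '\v' == 11 &&
           '\f' == 12 &&
           '\r' == 13;
}
```

A result of 1 indicates that the build uses the values in the table.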
The value of a char object into which has been stored any character other than a member of the basic execution character set (6.2.5).
The value of a char object into which has been stored any character other than a member of the basic execution character set is the least significant 8 bits of that character, interpreted as unsigned.
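As an illustration only, this rule can be modelled with a hypothetical helper, stored_char_value, that keeps the least significant 8 bits of a character code and treats them as an unsigned quantity. It is a sketch of the described result, not the compiler's implementation:

```c
#include <stdint.h>

/* Hypothetical model of the rule: keep the least significant 8 bits
   of the character code and interpret them as an unsigned value. */
unsigned stored_char_value(uint32_t character_code)
{
    return (unsigned)(character_code & 0xFFu);
}
```

Under this model, the code 0x20AC yields 0xAC, and the code 0xE9 is unchanged because it already fits in 8 bits.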
Which of signed char or unsigned char has the same range, representation, and behavior as plain char (6.2.5, 6.3.1.1).
Data items of type char are unsigned by default. The type unsigned char has the same range, representation, and behavior as char.
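Because the signedness of char is a default rather than a guarantee of the language, code that must build with other settings or toolchains can test it instead of assuming it. The following minimal sketch uses CHAR_MIN from <limits.h>. The function names plain_char_is_unsigned and byte_value are hypothetical:

```c
#include <limits.h>

/* Returns 1 when plain char is unsigned in the current build,
   and 0 when it is signed. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Returns the value of c as a number in the range 0 to UCHAR_MAX,
   whatever the signedness of plain char. */
unsigned byte_value(char c)
{
    return (unsigned char)c;
}
```

With the default described above, plain_char_is_unsigned() would be expected to return 1. Converting through unsigned char, as in byte_value, gives the same result whether char is signed or unsigned.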
The mapping of members of the source character set (in character constants and string literals) to members of the execution character set (6.4.4.4, 5.1.1.2).
The execution character set is identical to the source character set.
The value of an integer character constant containing more than one character or containing a character or escape sequence that does not map to a single-byte execution character (6.4.4.4).
In C, all character constants have type int. Up to four characters of the constant are represented in the integer value. The last character in the constant occupies the lowest-order byte of the integer value. Up to three preceding characters occupy the higher-order bytes, in order. Unused bytes are filled with the NUL (\0) character.
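The byte layout described above can be reproduced by hand. The following sketch uses a hypothetical helper, pack_chars, that models the rule for constants of one to four characters. It assumes single-byte characters and is not taken from the compiler:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a multi-character constant made of the first
   n characters of s, where n is 1 to 4. Each new character shifts the
   earlier ones up by one byte, so the last character ends in the
   lowest-order byte and unused high-order bytes stay zero. */
uint32_t pack_chars(const char *s, size_t n)
{
    uint32_t value = 0;

    assert(n >= 1 && n <= 4);
    for (size_t i = 0; i < n; i++) {
        value = (value << 8) | (unsigned char)s[i];
    }
    return value;
}
```

Assuming ASCII values for 'a' and 'b', pack_chars("ab", 2) gives 0x6162, which is the value the rule above assigns to the constant 'ab'.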
The value of a wide-character constant containing more than one multibyte character or a single multibyte character that maps to multiple members of the extended execution character set, or containing a multibyte character or escape sequence not represented in the extended execution character set (6.4.4.4).
If a wide-character constant contains more than one multibyte character, all but the last such character are ignored.
The current locale used to convert a wide-character constant consisting of a single multibyte character that maps to a member of the extended execution character set into a corresponding wide-character code (6.4.4.4).
Mapping of wide-character constants to the corresponding wide-character code is locale independent.
Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the treatment of the resulting multibyte character sequence (6.4.5).
Differently prefixed wide string literal tokens cannot be concatenated.
The current locale used to convert a wide string literal into corresponding wide-character codes (6.4.5).
Mapping of the wide characters in a wide string literal into the corresponding wide-character codes is locale independent.
The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set (6.4.5).
The compiler does not check whether the value of a multibyte character or an escape sequence is a valid ISO/IEC 10646 code point. Such a value is encoded in the same way as the values of valid members of the execution character set, according to the kind of string literal (character or wide-character).
The encoding of any of wchar_t, char16_t, and char32_t where the corresponding standard encoding macro (__STDC_ISO_10646__, __STDC_UTF_16__, or __STDC_UTF_32__) is not defined (6.10.8.2).

The symbol __STDC_ISO_10646__ is not defined. Nevertheless, every character in the Unicode required set, when stored in an object of type wchar_t, has the same value as the short identifier of that character.

The symbols __STDC_UTF_16__ and __STDC_UTF_32__ are defined.
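Whether these macros are defined can be tested at compile time. The following minimal sketch wraps each test in a hypothetical function and adds one check of a wchar_t value against its short identifier, using the euro sign (U+20AC) as an arbitrary example:

```c
#include <stddef.h>

/* Each function returns 1 if the named macro is defined in this
   build, and 0 otherwise. */
int has_stdc_iso_10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

int has_stdc_utf_16(void)
{
#ifdef __STDC_UTF_16__
    return 1;
#else
    return 0;
#endif
}

int has_stdc_utf_32(void)
{
#ifdef __STDC_UTF_32__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the euro sign, written as a wide-character constant,
   has the same value as its short identifier 20AC. */
int wide_euro_matches_short_identifier(void)
{
    return L'\u20AC' == 0x20AC;
}
```

On a toolchain that matches this section, has_stdc_iso_10646() would be expected to return 0, and the other three functions 1.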

Non-Confidential DUI0774J
Copyright © 2014–2017, 2019 Arm Limited or its affiliates. All rights reserved.