Shift-JIS and UTF-8 implementation

The Shift-JIS and UTF-8 locales let you use Japanese and Unicode characters.

The following table lists the functions that select the Shift-JIS (Japanese characters) or UTF-8 (Unicode characters) locale.

Table 1-8 Default Shift-JIS and UTF-8 locales

Function            Description
__use_sjis_ctype    Sets the character set to the Shift-JIS multibyte encoding of Japanese characters.
__use_utf8_ctype    Sets the character set to the UTF-8 multibyte encoding of all Unicode characters.

The following list describes the effects of Shift-JIS and UTF-8 encoding:

  • The ordinary ctype functions behave correctly on any byte value that is a self-contained character in Shift-JIS. For example, half-width katakana characters that Shift-JIS encodes as single bytes between 0xA6 and 0xDF are treated as alphabetic by isalpha().

    In UTF-8, the only self-contained (single-byte) characters are those of the ASCII character set.

  • The multibyte conversion functions, such as mbrtowc(), mbsrtowcs(), and wcrtomb(), convert between wide strings in Unicode and multibyte character strings in Shift-JIS or UTF-8.

  • printf("%ls") converts a Unicode wide string into Shift-JIS or UTF-8 output, and scanf("%ls") converts Shift-JIS or UTF-8 input into a Unicode wide string.
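The multibyte conversion described above can be illustrated with a minimal sketch. It assumes a hosted C library in which a UTF-8 locale is chosen at run time with setlocale(), as a stand-in for selecting __use_utf8_ctype. The helper names select_utf8_ctype() and decode_multibyte(), and the locale names tried, are hypothetical and are not part of the ARM library. It also assumes that wchar_t holds Unicode code points.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few common hosted names for a UTF-8 locale.
 * Returns 1 if one was accepted, 0 otherwise. This is an assumption made
 * for hosted systems; it is not how the ARM library selects its locale. */
static int select_utf8_ctype(void)
{
    static const char *const names[] = { "C.UTF-8", "en_US.UTF-8", "en_US.utf8" };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; ++i) {
        if (setlocale(LC_CTYPE, names[i]) != NULL) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: decode a multibyte string into at most max wide
 * characters using mbrtowc(). Returns the number of wide characters
 * written, or (size_t)-1 if the input is invalid or incomplete. */
static size_t decode_multibyte(const char *s, wchar_t *out, size_t max)
{
    mbstate_t state;
    size_t left;
    size_t count = 0;

    assert(s != NULL && out != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    left = strlen(s);

    while (left > 0 && count < max) {
        size_t used = mbrtowc(&out[count], s, left, &state);
        if (used == (size_t)-1 || used == (size_t)-2) {
            return (size_t)-1;         /* invalid or truncated sequence */
        }
        if (used == 0) {
            break;                     /* decoded a terminating null */
        }
        s += used;
        left -= used;
        ++count;
    }
    return count;
}
```

If select_utf8_ctype() succeeds, calling decode_multibyte() on the bytes 0x41, 0xC3 0xA9, 0xE3 0x81 0x82 should produce three wide characters: U+0041, U+00E9, and U+3042, one for each of the one-byte, two-byte, and three-byte sequences.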

Non-Confidential ARM 100073_0608_00_en
Copyright © 2014–2017 ARM Limited or its affiliates. All rights reserved.