Strings in Objective-C
Last week we looked at Unicode escape sequences in C string and NSString literals. Today we'll take a quick look at wide character strings and talk about where they fit into iOS development.

When the C language was developed in the early 1970s, little thought was given to representing non-English languages. By default, most C compilers assumed that both code files and application output used 7-bit ASCII encoding and that each logical character in a string fit into a single 8-bit byte or char value. By the time C was first standardized by ANSI in 1989 (and by ISO in 1990), the need to handle many more characters than ASCII was obvious, but the Unicode standard was still nascent. So the ANSI C committee included a wide character type and wide character string functions in the C89 standard, but didn't tie wide character support to any specific character encoding scheme.

wchar_t
C89 introduced a new integer type, wchar_t. This is similar to a char, but typically "wider". On many systems, including Windows, a wchar_t is 16 bits. This is typical of systems that implemented their Unicode support using earlier versions of the Unicode standard, which originally fit every character into 16 bits (at most 65,536 code points). Unicode was later expanded to support historical and special purpose character sets, so on some systems, including Mac OS X and iOS, the wchar_t type is 32 bits in size. This is often poorly documented, but you can use a simple test like this to find out:

// how big is wchar_t?
NSLog(@"wchar_t is %zu bits wide", 8 * sizeof(wchar_t));

On a Mac or iPhone, this will print "wchar_t is 32 bits wide".

Additionally, wchar_t is a typedef for another integer type in C. In C++, wchar_t is a built-in integer type. In practice, this means you need to #include <wchar.h> in C when using wide characters.

signed or unsigned?
The char integer type is usually a signed integer with a range from -128 to 127, although the C standard lets each compiler choose. You can use the CHAR_MIN and CHAR_MAX constants defined in <limits.h> to find out the range for a particular compiler:

NSLog(@"CHAR_MIN = %.0f", (double)CHAR_MIN);
NSLog(@"CHAR_MAX = %.0f", (double)CHAR_MAX);
The wchar_t type can be signed or unsigned. The WCHAR_MIN and WCHAR_MAX constants hold the range of a wchar_t and are defined in both <wchar.h> and <stdint.h>.

NSLog(@"WCHAR_MIN = %.0f", (double)WCHAR_MIN);
NSLog(@"WCHAR_MAX = %.0f", (double)WCHAR_MAX);

On Windows, wchar_t is an unsigned 16-bit integer. On Mac and iPhone, wchar_t is a signed 32-bit integer, so the code above will print out "WCHAR_MIN = -2147483648" and "WCHAR_MAX = 2147483647". For the most part you don't need to worry about whether wchar_t is signed or unsigned; it only becomes important if you need to do comparisons and operations that mix wchar_t with other integer types (a rarity).

wide character literals
We looked at C string literals in previous entries. Wide character string literals are very similar, but are prefixed with 'L':

// example of a wide character string literal
wchar_t const *s = L"foobarf!";

Like C string literals, wide strings separated by only whitespace are considered one logical string:

// wide strings written in segments
wchar_t const *s1 = L"foo" "bar";
wchar_t const *s2 = L"Hello, " L"world!";
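To make the literal rules above a little more concrete, here's a minimal sketch in plain C (it compiles as Objective-C too). The three helper function names are made up for this illustration and aren't part of any library; only wcslen() and sizeof come from the language and <wchar.h>. It shows that wcslen() counts wide characters rather than bytes, and that segmented literals really do end up as a single string.

```c
#include <stddef.h>
#include <wchar.h>

/* NOTE: these helper names are hypothetical, chosen for this example only. */

/* Length of a plain wide string literal, in wide characters. */
size_t plain_literal_length(void)
{
    const wchar_t *s = L"foobarf!";
    return wcslen(s);              /* counts wchar_t elements, not bytes */
}

/* Adjacent wide literals are joined into one string by the compiler. */
size_t segmented_literal_length(void)
{
    const wchar_t *s = L"Hello, " L"world!";
    return wcslen(s);
}

/* Storage used by the literal in bytes, including the terminating L'\0'. */
size_t plain_literal_bytes(void)
{
    return sizeof(L"foobarf!");    /* 9 elements times sizeof(wchar_t) */
}
```

The byte count depends on the platform: with a 32-bit wchar_t, as on Mac and iPhone, the nine elements occupy 36 bytes, while a 16-bit wchar_t would need only 18.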
wide character functions
Most string functions in the standard C library are declared in the <string.h> header. A very similar set of functions for wide character strings is declared in <wchar.h>. The functions follow a similar naming convention: where string functions are prefixed with str, the wide character equivalents are prefixed with wcs (for wide character string). So the strlen() function calculates the length of a string and the corresponding wcslen() function calculates the length of a wide character string.

not used much
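As a quick illustration of the str/wcs naming convention, here's a minimal sketch that builds the same greeting twice, once with <string.h> functions and once with their <wchar.h> counterparts. The two helper functions (narrow_greeting and wide_greeting) are hypothetical names invented for this example; the standard calls used are strcpy(), strcat(), strlen(), wcscpy(), wcscat() and wcslen().

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* NOTE: narrow_greeting and wide_greeting are hypothetical helpers
   written for this example; they are not library functions. */

/* Narrow version: builds "Hello, world!" in buf and returns its length.
   capacity is measured in char elements. Returns 0 if buf is too small. */
size_t narrow_greeting(char *buf, size_t capacity)
{
    if (capacity < 14) {           /* 13 characters plus the '\0' */
        return 0;
    }
    strcpy(buf, "Hello, ");
    strcat(buf, "world!");
    return strlen(buf);
}

/* Wide version: the same steps with each str prefix swapped for wcs.
   capacity is measured in wchar_t elements, not bytes. */
size_t wide_greeting(wchar_t *buf, size_t capacity)
{
    if (capacity < 14) {           /* 13 wide characters plus the L'\0' */
        return 0;
    }
    wcscpy(buf, L"Hello, ");
    wcscat(buf, L"world!");
    return wcslen(buf);
}
```

Both functions report a length of 13 for a large enough buffer. The thing to watch for is that buffer sizes for the wcs functions are counted in wchar_t elements, so a buffer declared as wchar_t buf[32] holds 32 wide characters regardless of how many bytes each one takes.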
In practice, you won't use wide character strings very often in Objective-C since the NSString class does just about everything wide character strings are meant to do, but you may occasionally run across them in other C libraries.

Next time, we'll begin looking at common string operations using C strings and NSStrings, starting with string concatenation.