Tuesday, July 26, 2011

Objective-C Tuesdays: Dynamic arrays

Another week, another Objective-C Tuesdays. Last week we began our series on data structures with a look at arrays in C and the NSArray class. Both C arrays and NSArray objects have serious limitations: C arrays are fixed in size and NSArrays are immutable. Today we will look at overcoming those limitations.

Dynamically allocated memory
C arrays are fixed in size when they are declared:
int lotteryNumbers[6];
Here we declare an array that can hold six ints. We can freely change the six int values we store in the array, but we can't make the array larger or smaller once it's declared. But there's nothing really magical about arrays in C: in essence they're blocks of memory managed by the compiler. When we declare an array, we tell the compiler the number of items we need to store and the type of item and it calculates the number of bytes of memory it needs to set aside for the array. We can do the same thing with a dynamically allocated memory block. Let's dynamically allocate the same amount of storage using malloc():
// dynamically allocating an array
#include <stdlib.h>

int *lotteryNumbers = malloc(sizeof(int) * 6);
if (lotteryNumbers) {
  lotteryNumbers[0] = 7;
  lotteryNumbers[1] = 11;
  // ...
}
The malloc() function is part of the standard C library, and it allocates a memory block on the heap. As we saw last time, you can use the same square bracket index notation with an array variable or a pointer to a block of memory.

Unlike global variables (which persist for the whole life span of your program) and local variables (which live only as long as the current function call), you control the life span of memory you allocate on the heap. Just as you need to match -retain with -release in Objective-C, you need to match calls to malloc() with corresponding calls to free():
// always free dynamically allocated arrays
#include <stdlib.h>
// ...

int *lotteryNumbers = malloc(sizeof(int) * 6);

// use lotteryNumbers for a while...

free(lotteryNumbers);
The malloc() function takes one argument: the number of bytes to allocate. Since most useful items require more than one byte each, you need to use the sizeof() operator to get the size of the item type and multiply it by the number of items required.
// calculate the number of bytes required
// using the sizeof() operator
int *lotteryNumbers = malloc(sizeof(int) * 6);
if (lotteryNumbers) {
  // ...
The malloc() function returns a pointer to the newly allocated memory block on success. If malloc() fails, it returns NULL. You should always check the result of memory allocation and take appropriate action. The typical C idiom is to use the returned pointer as a boolean value, since NULL pointers in C evaluate to false while non-NULL pointers are true, similar to nil values in Objective-C.
// always test the returned pointer
int *lotteryNumbers = malloc(sizeof(int) * 6);
if (lotteryNumbers) {
  // okay to use...
} else {
  // we're out of memory...
}
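One detail the size calculation glosses over: if the item count comes from user input or a file, the multiplication itself can overflow before malloc() ever sees it. Here's a minimal sketch of a guarded allocation. The helper name allocate_int_array() is made up for this example, and treating a count of zero as a failure is my own assumption, not something malloc() requires.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): allocate room for
   `count` ints, or return NULL if the request can't be satisfied. */
int *allocate_int_array(size_t count) {
  /* Reject empty requests and counts whose byte size would overflow size_t. */
  if (count == 0 || count > SIZE_MAX / sizeof(int)) {
    return NULL;
  }
  return malloc(count * sizeof(int));
}
```

The caller still checks the returned pointer and still calls free() when done, exactly as with a direct call to malloc().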

Handling malloc() failures
If a call to malloc() fails and returns NULL, it's almost always because you've run out of available memory. There are two broad strategies for coping with a memory allocation failure: fail fast or abort the operation. In general, I recommend that you fail fast by doing something like this:
// fail fast when out of memory
#include <stdio.h>
#include <stdlib.h>

int *lotteryNumbers = malloc(sizeof(int) * 6);
if ( ! lotteryNumbers) {
  fprintf(stderr, "%s:%i: Out of memory\n", __FILE__, __LINE__);
  exit(EXIT_FAILURE);
}
// okay to use memory
lotteryNumbers[0] = 7;
// ...
In an Objective-C program, you would use NSLog() instead of fprintf(). When small to medium size memory allocations fail, the system is seriously constrained and there's not much else your program can do to cope. In fact, iOS will likely terminate your app before you ever reach this condition.

Sometimes your program is trying to do something particularly memory intensive, like editing a large image or sound file. In cases like this, you should be prepared for large memory allocations to fail and try to abort the operation gracefully. The strategy in this case is to free all resources allocated for the operation so far and alert the user.
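A rough sketch of the "abort the operation" strategy might look like the following. The function process_large_data() and its two working buffers are hypothetical stand-ins for a real memory-intensive task; the point is only the shape of the cleanup, where every exit path releases whatever was allocated so far.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical memory-hungry operation that needs two working buffers.
   Returns true on success, false if the operation had to be aborted. */
bool process_large_data(size_t byteCount) {
  bool succeeded = false;
  unsigned char *input = NULL;
  unsigned char *output = NULL;

  input = malloc(byteCount);
  if ( ! input) goto cleanup;

  output = malloc(byteCount);
  if ( ! output) goto cleanup;

  /* stand-in for the real work */
  memset(input, 0x7f, byteCount);
  memcpy(output, input, byteCount);
  succeeded = true;

cleanup:
  /* free(NULL) is a harmless no-op, so one cleanup path covers every case */
  free(output);
  free(input);
  return succeeded;
}
```

When the function returns false, the caller can alert the user and carry on, since nothing allocated for the operation has been leaked.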

Using calloc() instead of malloc()
When you dynamically allocate a memory block, you frequently want to set all the items to zero. malloc() doesn't do any initialization to the memory block, so the initial contents are effectively random garbage. The calloc() function is similar to malloc(), but also clears the bytes in the memory block to zeros before returning.
// using calloc()
size_t itemCount = 6;
size_t itemSize = sizeof(int);
int *lotteryNumbers = calloc(itemCount, itemSize);
if (lotteryNumbers) {
  // okay to use...
Unlike malloc(), which takes the size of the memory block in bytes, calloc() takes the number of items and the size of each item and does the math for you. Under the covers, calloc() allocates memory from the same heap that malloc() uses, so you need to call free() on the memory block when you're done.
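To make the relationship concrete, here is a small sketch that spells out roughly what calloc() does using malloc() and memset(). Both helper names are invented for this example, and the by-hand version does not check the multiplication for overflow, so treat it as an illustration rather than a replacement.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zeroed array of ints using calloc(). */
int *allocate_zeroed_ints(size_t itemCount) {
  return calloc(itemCount, sizeof(int));
}

/* Hypothetical helper: roughly the same result, done by hand.
   Note: itemCount * sizeof(int) is not checked for overflow here. */
int *allocate_zeroed_ints_by_hand(size_t itemCount) {
  int *items = malloc(itemCount * sizeof(int));
  if (items) {
    memset(items, 0, itemCount * sizeof(int));
  }
  return items;
}
```

Either way, the block comes from the heap and must be released with free().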

There's a lot more to managing dynamically allocated memory blocks. We'll look at resizing a memory block using realloc() next time but for now let's move on to a more pleasant topic: NSMutableArray.

NSMutableArray
Like its immutable superclass NSArray, the NSMutableArray class takes mutable array management to a higher level. You can create an NSMutableArray the same way you create an NSArray:
NSMutableArray *colors = [NSMutableArray arrayWithObjects:@"red", 
                                                          @"green", 
                                                          @"blue", 
                                                          nil];
Another common creation technique is to duplicate an existing immutable NSArray using the +arrayWithArray: or -initWithArray: methods.
NSArray *rgbColors = [NSArray arrayWithObjects:@"red", 
                                               @"green", 
                                               @"blue", 
                                               nil];
NSMutableArray *colors = [NSMutableArray arrayWithArray:rgbColors];
Often, you simply want an empty array to start with. The +array or -init methods from NSArray will do the trick here. (You can create empty immutable NSArray objects this way too; they're just usually not very useful.)

Adding items to the end of the array is easily done with -addObject: and -addObjectsFromArray:
NSMutableArray *colors = [NSMutableArray array];
// colors is empty

[colors addObject:@"yellow"];
[colors addObject:@"purple"];
// colors holds yellow, purple

NSArray *designerColors = [NSArray arrayWithObjects:@"mauve", 
                                                    @"chartreuse", 
                                                    @"seafoam", 
                                                    nil];
[colors addObjectsFromArray:designerColors];
// colors now holds yellow, purple, mauve, chartreuse and seafoam

NSMutableArray has many ways to remove objects. The -removeLastObject method is the inverse of -addObject:. The -removeObjectAtIndex: method removes an item at a particular index. Continuing with our array of colors:
// colors holds yellow, purple, mauve, chartreuse and seafoam
[colors removeLastObject];
// colors holds yellow, purple, mauve and chartreuse
[colors removeObjectAtIndex:0];
// colors holds purple, mauve and chartreuse

That covers the basics of adding and removing objects from an NSMutableArray. Next time, we'll cover more ways to manipulate the mutable array contents.

Wednesday, July 20, 2011

OS X Lion Internet Recovery

I just came across this page about Lion Recovery. I had read about the recovery partition earlier this morning in the excellent Ars Technica Lion review by John Siracusa, but new Macs have a feature called "Internet Recovery" that lets you automatically download a Lion recovery disk image from Apple's servers, even if you've wiped your hard drive:
If your Mac problem is a little less common — your hard drive has failed or you’ve installed a hard drive without OS X, for example — Internet Recovery takes over automatically. It downloads and starts Lion Recovery directly from Apple servers over a broadband Internet connection. And your Mac has access to the same Lion Recovery features online. Internet Recovery is built into every newly-released Mac starting with the Mac mini and MacBook Air.
This is one of those things that makes using a Mac so friction-free. A big thumbs up to Apple for this feature.

Burning A Lion Boot Disc

If you're planning to install OS X Lion, but you want the safety and security of a physical install disk, or you prefer to wipe the hard drive and do a clean install, you can create your own bootable DVD installer using the Lion installer app from the Mac App Store and the Disk Utility application on your current Mac. Thomas Brand shows you how.

Forgotten C: The comma operator

Some good advice on the comma operator and when to use it from Jerry Ryle at MindTribe:
Comma operator?! Isn’t that thing just a separator? Nope. It’s occasionally an operator.
Read more about the rarely used comma operator in C. And MindTribe is looking to hire some great embedded developers.

Tuesday, July 19, 2011

Objective-C Tuesdays: arrays

Welcome back to Objective-C Tuesdays. Last time we wrapped up our series on strings by looking at regular expressions in Objective-C. Today we begin a new series: data structures. The first data structure that we will examine is the array.

Most languages have some concept of an array, though it is sometimes called a list. In general, an array is an ordered sequence: a collection of items that has a distinct order. The term "array" implies that each item in the collection is individually accessible in constant time; in other words, it takes the same amount of time to access items at the beginning, middle or end of the sequence.

Arrays in C
The C language includes the ability to define and create strongly typed arrays. You must always declare the type of the items that the array contains.
// declare an array of ints
int lotteryNumbers[6];
This declares an array that holds six integers. When the array variable is declared, the number between the square brackets indicates the number of items that the array can hold, usually referred to as the size, length or count.

C99 (which is the default for iOS projects) allows you to use a function parameter or other variable to determine the length of an array. This feature is naturally called "variable length arrays".
// use a variable as the length of an array
int length = 6;
int lotteryNumbers[length];
Unless you're an old time C programmer, you're probably thinking "yeah, so what?" In earlier versions of C, you could only declare arrays with constant length. In old C code (or code written by old C hands), it's common to see code like this:
// constant array length
#define LOTTERY_NUMBERS_LENGTH 6

int lotteryNumbers[LOTTERY_NUMBERS_LENGTH];

C array initialization
You can optionally provide an array with an initializer. An array initializer uses curly braces and looks like this:
// array initializer
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };
You can specify fewer items than the array can hold; the remaining items will be initialized to zero.
// array with partial initializer
int lotteryNumbers[6] = { 7, 11 };
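You can even specify an empty initializer and all the items in the array will be set to zero. (Empty braces are a compiler extension that GCC and Clang accept; strictly standard C before C23 requires at least one value, as in { 0 }.)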
// array with empty initializer
int lotteryNumbers[6] = {};
This is redundant for arrays declared at global scope since global variables are initialized to zero by default, but can be useful for local variables.

If you use an initializer list when you declare your array, you can leave the array length out of the declaration:
// array initializer without length
int lotteryNumbers[] = { 7, 11, 19, 23, 29, 31 };
The compiler will count the items in the initializer list and size your array to fit.

If you are initializing an array of chars, you can use a string literal as the initializer.
char favoriteColor[4] = "red";
char favoriteFlavor[] = "vanilla";
Remember that C strings contain an extra char, the null terminator, so when you initialize an array of chars with the string "red", it actually stores four items. You can make the equivalent initializers using character literals:
char favoriteColor[4] = { 'r', 'e', 'd', 0 };
char favoriteFlavor[] = { 'v', 'a', 'n', 'i', 'l', 'l', 'a', '\0' };
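If you want to see the extra char for yourself, compare what sizeof and strlen() report for the same array. This is just a small sketch; the two accessor functions are hypothetical names I've added so the values are easy to check.

```c
#include <stddef.h>
#include <string.h>

static char const favoriteFlavor[] = "vanilla";

/* Storage used by the array: seven letters plus the null terminator. */
size_t flavor_storage_size(void) {
  return sizeof favoriteFlavor;
}

/* Length of the text: strlen() stops at the null terminator and doesn't count it. */
size_t flavor_text_length(void) {
  return strlen(favoriteFlavor);
}
```

Here flavor_storage_size() gives 8 while flavor_text_length() gives 7.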

Accessing items in an array
You get items out of an array by using an array index number in square brackets:
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };

NSLog(@"%i is at index 1", lotteryNumbers[1]);
NSLog(@"%i is at index 2", lotteryNumbers[2]);
The code snippet above produces this output:
11 is at index 1
19 is at index 2
If this is surprising, it's because the first item in a C array is always at index 0.

Assigning items to an array is naturally very similar:
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };

lotteryNumbers[1] = 13;
lotteryNumbers[2] = 17;
NSLog(@"%i is at index 1", lotteryNumbers[1]);
NSLog(@"%i is at index 2", lotteryNumbers[2]);
which will print out:
13 is at index 1
17 is at index 2

Arrays automatically convert to pointers
Like all things in C, arrays are very low level constructs. Under the hood, an array is simply a block of memory managed by the compiler that's large enough to hold all its items. Because an array corresponds directly to a memory block, an array variable will automatically convert into a pointer to the first item in the array.
// automatic array to pointer conversion
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };
int *luckyNumber = lotteryNumbers;

NSLog(@"My lucky number is %i", *luckyNumber);
This will produce the output:
My lucky number is 7
You can set the pointer to items after the first by using pointer arithmetic:
// pointer arithmetic
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };
int *luckyNumber = lotteryNumbers;

NSLog(@"My NEW lucky number is %i", *(luckyNumber + 1));
which prints out:
My NEW lucky number is 11
While pointer arithmetic is a perfectly cromulent way to access items in an array, you can use an index in square brackets on a pointer just as you can on an array:
// array index using a pointer variable
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };
int *luckyNumber = lotteryNumbers;

NSLog(@"My NEW lucky number is %i", luckyNumber[1]);
Under the hood, the compiler automatically converts uses of an array variable into a pointer to the first item in the array, then converts array index expressions into the equivalent pointer arithmetic. An expression like myArray[2] is converted to *(myArray + 2): a pointer to the third item in the array, dereferenced to get the item itself. (Remember that the first item is myArray[0].)

While this is a very convenient way to work with low level memory in a structured way, it can also be very dangerous. The compiler won't stop you from accessing items past the end of your array. You can easily and efficiently read memory that may contain garbage values and overwrite memory belonging to other parts of your program. As with many things in C, with great power comes great responsibility.
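Since the compiler won't check indexes for you, one common defensive approach is to pass the length along with the pointer and check it yourself. Here's a minimal sketch; get_item_checked() is a hypothetical helper written for this post, not a standard function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copy items[index] into *outValue only when the
   index is within bounds. Returns false and leaves *outValue untouched
   when the request is out of range or a pointer is NULL. */
bool get_item_checked(int const *items, size_t length, size_t index, int *outValue) {
  if ( ! items || ! outValue || index >= length) {
    return false;
  }
  *outValue = items[index];
  return true;
}
```

The cost is an extra comparison per access, which is roughly the trade that higher level containers make on your behalf.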

Calculating the length of an array
The sizeof operator can be applied to an array variable to find out how many bytes of memory the array occupies.
// size of an array in bytes
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };

NSLog(@"The array uses %lu bytes", (unsigned long)sizeof lotteryNumbers);
This produces:
The array uses 24 bytes
ints in iOS use four bytes each, so an array of six ints uses 24 bytes. (The sizeof operator returns a value of type size_t, an unsigned integer type whose exact width depends on the platform, so the example casts it to unsigned long to match the %lu format.) To get the number of items the array contains, you can divide the size of the array by the size of its first item:
// number of items in an array
int lotteryNumbers[6] = { 7, 11, 19, 23, 29, 31 };

size_t length = sizeof lotteryNumbers / sizeof lotteryNumbers[0];
NSLog(@"The array contains %lu items", (unsigned long)length);

Be careful to only use this on actual array variables—you won't get the answer you expect if you try this on a pointer.
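The usual way around the pointer problem is to compute the length where the real array is visible and pass it along explicitly. A small sketch, assuming a made-up ARRAY_LENGTH macro and a made-up sum_items() function:

```c
#include <stddef.h>

/* Only valid on true array variables, never on pointers. */
#define ARRAY_LENGTH(array) (sizeof (array) / sizeof (array)[0])

/* Hypothetical function: `items` is just a pointer here, so
   sizeof items would be the size of a pointer, not of the array.
   The caller has to supply the length. */
long sum_items(int const *items, size_t length) {
  long total = 0;
  for (size_t i = 0; i < length; ++i) {
    total += items[i];
  }
  return total;
}
```

With the lottery numbers from earlier, you would call sum_items(lotteryNumbers, ARRAY_LENGTH(lotteryNumbers)) from the scope where lotteryNumbers is declared.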

C arrays have one big limitation: you can't resize them. We'll talk about using dynamically allocated memory blocks as arrays next time.

NSArray
If you're ready for a safer, higher level way to manage an ordered sequence of items, it's time to get to know NSArray.

You can create an NSArray containing one item using +arrayWithObject: or -initWithObject:
// creating an NSArray with one item
NSArray *colors = [NSArray arrayWithObject:@"red"];
NSArray *flavors = [[NSArray alloc] initWithObject:@"vanilla"];
NSArray creation methods follow the common Cocoa and Cocoa Touch conventions. Class methods like +arrayWithObject: return a new autoreleased object. If you need to hold on to the object beyond the current method, remember to call -retain on it. Instance methods like -initWithObject: produce new objects that you own—don't forget to call -release or -autorelease on the object when you're done with it.

Sometimes an array containing one item is handy, but usually you want to hold onto multiple items. To do that, use +arrayWithObjects: or -initWithObjects: like this:
// creating an NSArray with multiple items
NSArray *colors = [NSArray arrayWithObjects:@"red", 
                                            @"green", 
                                            @"blue", 
                                            nil];
NSArray *flavors = [[NSArray alloc] initWithObjects:@"vanilla", 
                                                    @"chocolate", 
                                                    @"strawberry", 
                                                    nil];
The +arrayWithObjects: or -initWithObjects: methods take a variable number of arguments, but have a special caveat: you must mark the end of the list with nil. If you forget the nil, your program will probably crash with an EXC_BAD_ACCESS error as it tries to add random memory locations to the NSArray. Fortunately, the LLVM 2.0 compiler in Xcode 4.0 will warn you if you forget the nil. Always pay attention to compiler warnings!

If you have a plain old C array of object pointers, you can use the +arrayWithObjects:count: or -initWithObjects:count: methods to create an NSArray from the C array.
// creating an NSArray from a C array
NSString *colors1[] = { @"red", @"green", @"blue" };
NSArray *colors2 = [NSArray arrayWithObjects:colors1 count:3];

NSString *flavors1[] = { @"vanilla", @"chocolate", @"strawberry" };
NSArray *flavors2 = [[NSArray alloc] initWithObjects:flavors1 count:3];

Accessing an item in an array is done with the -objectAtIndex: method.
// my favorite color is green
NSArray *colors = [NSArray arrayWithObjects:@"red", 
                                            @"green", 
                                            @"blue", 
                                            nil];

NSLog(@"My favorite color is %@", [colors objectAtIndex:1]);
You can ask an NSArray object how many items it contains using the -count method.
NSArray *favoriteColors = [NSArray arrayWithObject:@"green"];
NSArray *favoriteFlavors = [NSArray arrayWithObjects:@"vanilla", @"chocolate", nil];

NSLog(@"I have %lu favorite color(s) and %lu favorite flavor(s)", 
      (unsigned long)[favoriteColors count], (unsigned long)[favoriteFlavors count]);

Only for objects, but heterogeneous
NSArray has one big limitation: it can only contain object types. If you try to create an NSArray of ints or a similar primitive C type, you'll get a warning like "Incompatible integer to pointer conversion sending 'int' to parameter of type 'id'". If you need to store numbers in an NSArray, you can store NSNumber objects instead.
// wrap primitive types in objects to store them in an NSArray
NSNumber *one = [NSNumber numberWithInt:1];
NSNumber *pi = [NSNumber numberWithDouble:3.14];
NSDate *today = [NSDate date];
NSString *foo = [NSString stringWithCString:"foo" encoding:NSUTF8StringEncoding];

NSArray *myStuff = [NSArray arrayWithObjects:one, pi, today, foo, nil];
This example shows an interesting characteristic of NSArrays: they can hold items of many different object types in a single container. While occasionally this feature is useful, more often than not you'll only store items of one type in a given NSArray, and errors related to finding an unexpected type in your NSArrays are pretty rare.

NSArrays are immutable
Like plain old C arrays, once you create an NSArray, you can't change its size. But NSArrays are even more restrictive; you can't change the contents either. NSArrays are immutable. This is something of a pain in the rump, but it means that you can safely share an NSArray between threads, as long as the items in it are also immutable.

Next time, we'll look at using memory blocks as resizable C arrays, and NSArray's more flexible cousin, NSMutableArray.

Objective-C Tuesdays: strings in Objective-C

Last week we wrapped up our survey of strings by examining regular expressions in Objective-C. Here's a quick overview of the posts that cover strings:

String basics:

String operations:

Data structures is the next topic, which we start today.

Tuesday, July 12, 2011

Objective-C Tuesdays: regular expressions

Welcome back to Objective-C Tuesdays after a long hiatus. In the last couple of entries, we looked at searching and replacing in C strings and NSStrings. Today we'll look at a more powerful way to search and replace in strings: regular expressions.

A mini language
The regular expression language is a small, specialized programming language for matching text. In addition to being specialized in scope, regular expressions are also very compact, which can make them very hard to read. I won't cover regular expression grammar in depth here; there are plenty of regular expression tutorials, references and cheat sheets out there. If you're looking for a good reference, I recommend O'Reilly's Mastering Regular Expressions and the Regular Expressions Cookbook.

Some programming languages have first class support for regular expressions, including Perl, Ruby and JavaScript. Modern languages like Python and Java support regular expressions as part of the language's standard library. Unfortunately C doesn't have regular expression support as part of its standard library. For Objective-C, regular expression support first appeared with iOS 3.2 on iPad and 4.0 on iPhone. Before that, developers needed to use a third party library to use regular expressions in their apps.

Some simple examples
Most characters in regular expressions match themselves, but some characters have special meaning. Here's a simple regular expression that matches the word "foobar":
foobar
To match "foo" optionally followed by one 'b' character, you could use:
foob?
Here, the '?' character is a special character that modifies the expression before it (in this case the character 'b') and matches it zero or one times. Thus foob? will match "foo" or "foob" but not "foe".

To match "foo" followed by zero or more 'b' characters, you could use:
foob*
The '*' special character matches the preceding expression zero or more times. foob* will match "foo", "foob" and "foobbbbbb" but not "fod".

To match "foo" followed by one or more 'b' characters, you could use:
foob+
The '+' special character matches the preceding expression one or more times. foob+ will match "foob" and "foobbbbbb" but not "foo" or "food".

Regular expression languages have many more capabilities. Though different regular expression implementations have different capabilities, most share a large set of common operations.
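If you'd like to try the three quantifiers yourself, here's a minimal sketch using the POSIX regcomp() and regexec() functions from <regex.h>. The helper matches_whole() is hypothetical; it wraps the pattern in ^( )$ so the whole string has to match, which is the sense in which the examples above say a pattern does or doesn't match. The 256 byte pattern buffer is an arbitrary limit chosen for the example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if `text` as a whole matches `pattern`
   (POSIX extended syntax). Returns false on any error. */
bool matches_whole(char const *pattern, char const *text) {
  char anchored[256];
  int written = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
  if (written < 0 || (size_t)written >= sizeof anchored) {
    return false; /* pattern too long for our buffer */
  }

  regex_t regex;
  if (regcomp(&regex, anchored, REG_EXTENDED | REG_NOSUB) != 0) {
    return false; /* pattern failed to compile */
  }

  bool matched = (regexec(&regex, text, 0, NULL, 0) == 0);
  regfree(&regex);
  return matched;
}
```

For instance, matches_whole("foob+", "foob") is true while matches_whole("foob+", "foo") is false.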

C libraries for regular expressions
The PCRE or Perl Compatible Regular Expressions library is a widely used C library that implements the same regular expression language that's used in Perl 5. The PCRE library is open source and distributed under a BSD license. It's a substantial library, and due to the lack of support on iOS for OS X style frameworks, it can be a bit challenging to include in an iOS project. In addition, because it's a C library, working with PCRE requires more low-level code than many Cocoa Touch programmers are comfortable with.

Here's a PCRE code snippet that compiles the regular expression foo(bar|fy) and tests it against the string "foofy".
#include <pcre.h>
#include <stdlib.h>
#include <string.h>

#define OUTPUT_VECTOR_COUNT 30    /* should be a multiple of 3 */

/* ... */

char const *pattern = "foo(bar|fy)";
int compileOptions = 0;
char const *error;
int errorOffset;
unsigned char const *characterTable = NULL;
pcre *regularExpression = pcre_compile(pattern, compileOptions, 
                                       &error, &errorOffset, characterTable);

if ( ! regularExpression) {
  NSLog(@"ERROR: regular expression <%s> failed to compile\n", pattern);
  NSLog(@"  Error at offset %i: %s\n", errorOffset, error);
  exit(EXIT_FAILURE);
}

pcre_extra *extraData = NULL;
char const *subject = "foofy";
int subjectLength = strlen(subject);
int subjectOffset = 0;
int execOptions = 0;
int outputVector[OUTPUT_VECTOR_COUNT];
int execResultCount = pcre_exec(regularExpression, extraData, 
                                subject, subjectLength, subjectOffset, 
                                execOptions, outputVector, OUTPUT_VECTOR_COUNT);

if (execResultCount == PCRE_ERROR_NOMATCH) {
  NSLog(@"The subject <%s> did not match the pattern <%s>\n", subject, pattern);
} else if (execResultCount < 0) {
  NSLog(@"Unexpected pcre_exec() result %i\n", execResultCount);
} else if (execResultCount == 0) {
  NSLog(@"Output vector only has room for %i captured substrings\n", 
        OUTPUT_VECTOR_COUNT / 3 - 1);
} else {
  int resultIndex;
  
  NSLog(@"Found match for pattern <%s> in subject <%s> at offset %i\n", 
        pattern, subject, outputVector[0]);
  for (resultIndex = 0; resultIndex < execResultCount; ++resultIndex) {
    int substringIndex = 2 * resultIndex;
    int substringStartingOffset = outputVector[substringIndex];
    int substringEndingOffset = outputVector[substringIndex + 1];
    char const* substringStart = subject + substringStartingOffset;
    int substringLength = substringEndingOffset - substringStartingOffset;
    NSLog(@"match %i: <%.*s>\n", resultIndex, substringLength, substringStart);
  }
}

pcre_free(regularExpression);

/* ... */
As you can see, it takes a lot of boilerplate code to do even a simple regular expression search using PCRE, but it's a very powerful library used by many well-known open source and commercial projects, including Apple's Safari browser. The PCRE project also includes a set of C++ wrappers that make the library easier to use from that language. If you only plan to target OS X, there is also the excellent RegexKit open source library that provides a nice Objective-C wrapper around PCRE. The PCRE library can be built with Unicode support, but is limited to scanning UTF-8 encoded Unicode strings, since it's a byte-oriented library.

The POSIX Regular Expressions specification describes a regular expression dialect and C interface that is supported on many variants of Unix and Linux, including OS X and iOS. Like many specifications, POSIX regular expressions are a "lowest common denominator" solution and lack some of the advanced features found in regular expression variants like Perl and PCRE. The POSIX regular expression interface is much smaller than what PCRE provides: four functions, two structs and a bunch of constants. Here's a code snippet that compiles the regular expression foo(bar|fy) and tests it against the string "foofy" using the POSIX interface.
#import <regex.h>

#define MATCH_COUNT 10

/* ... */

regex_t regularExpression;
char const *pattern = "foo(fy|bar)";
int compileFlags = REG_EXTENDED;

int compileResult = regcomp(&regularExpression, pattern, compileFlags);

if (compileResult) {
  size_t bufferSize = regerror(compileResult, &regularExpression, NULL, 0);
  char *buffer = malloc(bufferSize);
  if ( ! buffer) {
    NSLog(@"Memory allocation failed");
    return;
  }
  regerror(compileResult, &regularExpression, buffer, bufferSize);
  NSLog(@"%s", buffer);
  free(buffer);
  return;
}

char const *string = "foofy";
regmatch_t matches[MATCH_COUNT];
int executeFlags = 0;

int executeResult = regexec(&regularExpression, string, 
                            MATCH_COUNT, matches, executeFlags);

if (executeResult == REG_NOMATCH) {
  NSLog(@"Pattern <%s> doesn't match string <%s>\n", pattern, string);
} else {  
  NSLog(@"Pattern <%s> matches string <%s>\n", pattern, string);
  NSLog(@"Found %lu submatches\n", (unsigned long)regularExpression.re_nsub);
  for (size_t i = 0; i < regularExpression.re_nsub + 1; ++i) {
    int substringLength = matches[i].rm_eo - matches[i].rm_so;
    char const *substringStart = string + matches[i].rm_so;
    NSLog(@"submatch %lu: <%.*s>\n", (unsigned long)i, substringLength, substringStart);
  }
}

regfree(&regularExpression);

/* ... */
This is a little more concise than the PCRE version, mainly because the POSIX functions have fewer options, but is still very low level. POSIX regular expression implementations are generally regarded as slower than PCRE, and don't provide any explicit Unicode support (though you may be able to do regex matching against UTF-8 encoded strings if you're careful to keep your UTF-8 strings normalized).
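One wrinkle the snippet above doesn't run into: when a capture group is optional and doesn't take part in the match, regexec() sets its rm_so to -1, so the offsets can't be used blindly. Here's a hedged sketch of a helper that copies one capture group into a caller-supplied buffer and handles that case. The name first_capture() and the limit of ten groups are assumptions made for this example.

```c
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copy capture group `group` (0 is the whole match)
   of the first match of `pattern` in `text` into `buffer`.
   Returns false if there is no match, the group didn't participate,
   the pattern is invalid or the buffer is too small. */
bool first_capture(char const *pattern, char const *text, size_t group,
                   char *buffer, size_t bufferSize) {
  enum { MAX_GROUPS = 10 };
  regex_t regex;
  regmatch_t matches[MAX_GROUPS];
  bool found = false;

  if (group >= MAX_GROUPS || bufferSize == 0) {
    return false;
  }
  if (regcomp(&regex, pattern, REG_EXTENDED) != 0) {
    return false;
  }

  if (regexec(&regex, text, MAX_GROUPS, matches, 0) == 0
      && group <= regex.re_nsub
      && matches[group].rm_so != -1) {
    size_t length = (size_t)(matches[group].rm_eo - matches[group].rm_so);
    if (length < bufferSize) {
      memcpy(buffer, text + matches[group].rm_so, length);
      buffer[length] = '\0';
      found = true;
    }
  }

  regfree(&regex);
  return found;
}
```

Compiling the pattern on every call keeps the sketch short; real code that matches the same pattern repeatedly would compile it once and reuse the regex_t.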

The ICU or International Components for Unicode library is included as part of iOS. (ICU is open source and carries a non-restrictive license.) The ICU is a general purpose library for working with Unicode text which includes Unicode-aware regular expression support. It has two versions: one for Java and another for C and C++. The C/C++ version features a low level C function interface and a set of higher level C++ classes. I'll only look at the C interface to ICU here.

Because the ICU library is a general purpose Unicode library, most operations require that C strings be converted into ICU's Unicode text type, UChar *, which makes the ICU example the longest yet:
// compile regular expression
char const *pattern = "foo(bar|fy)";
uint32_t compileFlags = 0;
UParseError parseError;
UErrorCode errorCode = U_ZERO_ERROR;

URegularExpression* regularExpression = uregex_openC(pattern, compileFlags, 
                                                     &parseError, &errorCode);
if (errorCode) {
  NSLog(@"uregex_openC() failed: %li: %s", 
        (long)errorCode, u_errorName(errorCode));
  NSLog(@"  parse error at line %li, offset %li" ,
        (long)parseError.line, (long)parseError.offset);
  return;
}

// determine size of search text as ICU Unicode string
UChar *unicodeText = NULL;
int32_t unicodeTextCapacity = 0;
int32_t unicodeTextLength;
char const *utf8Text = "foofy";
int32_t utf8TextLength = -1; /* null terminated */

errorCode = U_ZERO_ERROR;
u_strFromUTF8(unicodeText, unicodeTextCapacity, &unicodeTextLength, 
              utf8Text, utf8TextLength, &errorCode);

if (errorCode != U_BUFFER_OVERFLOW_ERROR) {
  NSLog(@"Conversion to Unicode string failed: %li: %s", 
        (long)errorCode, u_errorName(errorCode));
  uregex_close(regularExpression);
  return;
}

// allocate buffer for search text ICU Unicode string
unicodeTextCapacity = unicodeTextLength + 1;
unicodeText = calloc(unicodeTextCapacity, sizeof(UChar));
if ( ! unicodeText) {
  NSLog(@"Memory allocation failed");
  uregex_close(regularExpression);
  return;
}

// convert search text to ICU Unicode string
errorCode = U_ZERO_ERROR;
u_strFromUTF8(unicodeText, unicodeTextCapacity, &unicodeTextLength, 
              utf8Text, utf8TextLength, &errorCode);

uregex_setText(regularExpression, unicodeText, unicodeTextLength, &errorCode);
if (errorCode) {
  NSLog(@"uregex_setText() failed: %li: %s", 
        (long)errorCode, u_errorName(errorCode));
  free(unicodeText);
  uregex_close(regularExpression);
  return;
}

// search for regular expression
int32_t startIndex = 0;
errorCode = U_ZERO_ERROR;
BOOL matchFound = uregex_find(regularExpression, startIndex, &errorCode);
if (errorCode) {
  NSLog(@"uregex_find() failed: %li: %s", 
        (long)errorCode, u_errorName(errorCode));
  free(unicodeText);
  uregex_close(regularExpression);
  return;
}

if (matchFound) {
  // get number of subgroup matches
  NSLog(@"Pattern <%s> matched string <%s>", pattern, utf8Text);
  errorCode = U_ZERO_ERROR;
  int32_t subgroupCount = uregex_groupCount(regularExpression, &errorCode);
  if (errorCode) {
    NSLog(@"uregex_groupCount() failed: %li: %s", 
          (long)errorCode, u_errorName(errorCode));
    free(unicodeText);
    uregex_close(regularExpression);
    return;
  }
  
  // enumerate subgroup matches
  NSLog(@"Matched %li subgroups", (long)subgroupCount);
  for (int32_t i = 0; i <= subgroupCount; ++i) {
    // get size of the subgroup
    UChar *subgroup = NULL;
    int32_t subgroupCapacity = 0;
    errorCode = U_ZERO_ERROR;
    int32_t subgroupLength = uregex_group(regularExpression, i, 
                                          subgroup, subgroupCapacity, 
                                          &errorCode);
    if (errorCode != U_BUFFER_OVERFLOW_ERROR) {
      NSLog(@"uregex_group() failed: %li: %s", 
            (long)errorCode, u_errorName(errorCode));
      break;
    }
    
    // allocate buffer to hold the subgroup
    subgroupCapacity = subgroupLength + 1;
    subgroup = calloc(sizeof(UChar), subgroupCapacity);
    if ( ! subgroup) {
      NSLog(@"Memory allocation failed");
      break; // fall through to the cleanup code after the loop
    }
    
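    // copy subgroup into buffer
    errorCode = U_ZERO_ERROR;
    uregex_group(regularExpression, i, subgroup, subgroupCapacity, &errorCode);
    if (errorCode) {
      NSLog(@"uregex_group() failed: %li: %s", 
            (long)errorCode, u_errorName(errorCode));
      free(subgroup);
      break;
    }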
    
    // determine size of buffer to hold subgroup as UTF8 string
    char *utf8Subgroup = NULL;
    int32_t utf8SubgroupCapacity = 0;
    int32_t utf8SubgroupLength;
    errorCode = U_ZERO_ERROR;
    u_strToUTF8(utf8Subgroup, utf8SubgroupCapacity, &utf8SubgroupLength, 
                subgroup, subgroupLength, &errorCode);
    if (errorCode != U_BUFFER_OVERFLOW_ERROR) {
      NSLog(@"u_strToUTF8() failed: %li: %s", 
            (long)errorCode, u_errorName(errorCode));
      free(subgroup);
      break;
    }
    
    // allocate buffer to hold subgroup as UTF8 string
    utf8SubgroupCapacity = utf8SubgroupLength + 1;
    utf8Subgroup = calloc(sizeof(char), utf8SubgroupCapacity);
    if ( ! utf8Subgroup) {
      NSLog(@"Memory allocation failed");
      free(subgroup);
      break;
    }
    
    // convert subgroup to UTF8 string
    errorCode = U_ZERO_ERROR;
    u_strToUTF8(utf8Subgroup, utf8SubgroupCapacity, &utf8SubgroupLength, 
                subgroup, subgroupLength, &errorCode);
    if (errorCode) {
      NSLog(@"u_strToUTF8() failed: %li: %s", 
            (long)errorCode, u_errorName(errorCode));
      free(subgroup);
      free(utf8Subgroup);
      break;
    }
    
    // print subgroup UTF8 string
    NSLog(@"submatch %li: <%.*s>", 
          (long)i, (int)utf8SubgroupLength, utf8Subgroup);
    free(subgroup);
    free(utf8Subgroup);
  }
} else {
  NSLog(@"Pattern <%s> did not match string <%s>", pattern, utf8Text);
}

free(unicodeText);
uregex_close(regularExpression);
I wouldn't be surprised if there's a memory leak in there somewhere. I'm also no expert on this library, and I may be doing some things the hard way. If you're using C++, I'm sure a lot of this boilerplate code goes away. If you don't need or want to mix C++ into your iOS app, there are some Objective-C alternatives that are much more satisfying.
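One way to make leaks like that harder to write in plain C is to send every error path through a single cleanup block instead of repeating the free() calls before each return. The sketch below shows only the idiom and makes no ICU calls; copyViaTwoBuffers() and its two buffers are hypothetical stand-ins for the allocate-then-convert steps above.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for the ICU steps: two heap buffers, one exit.
// Returns 0 on success, -1 on any failure.
int copyViaTwoBuffers(char const *text, char *out, size_t outCapacity) {
  int result = -1;      // assume failure until the last step succeeds
  char *first = NULL;   // free(NULL) does nothing, so NULL is a safe default
  char *second = NULL;
  size_t size = 0;

  if ( ! text || ! out) goto cleanup;
  size = strlen(text) + 1;  // include the terminating NUL

  first = malloc(size);
  if ( ! first) goto cleanup;
  memcpy(first, text, size);

  second = malloc(size);
  if ( ! second) goto cleanup;
  memcpy(second, first, size);

  if (size > outCapacity) goto cleanup;  // caller's buffer is too small
  memcpy(out, second, size);
  result = 0;

cleanup:
  // every path, success or failure, releases both buffers here
  free(second);
  free(first);
  return result;
}
```

Because the pointers start out NULL and free(NULL) is harmless, the cleanup block doesn't need to know how far the function got before it failed.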

Regular expressions in Objective-C
In iOS 3.2, Apple added a new NSStringCompareOptions value, NSRegularExpressionSearch, for use in the various -rangeOfString:options: methods of NSString. This option uses the regular expression support of the ICU library to do simple regular expression matching on an NSString object. Unfortunately, Apple implemented only a very minimal regular expression interface on NSString: you can search, but not replace, within an NSString, and powerful regular expression features like subgroup matching are not exposed.
NSString *string = @"foofy";
NSString *pattern = @"foo(bar|fy)";

NSRange match = [string rangeOfString:pattern 
                              options:NSRegularExpressionSearch];

if (match.location == NSNotFound) {
  NSLog(@"Pattern <%@> doesn't match string <%@>", pattern, string);
} else {
  NSLog(@"Pattern <%@> matches string <%@> starting at location %lu",
        pattern, string, (unsigned long)match.location);
}

In iOS 4.0, Cocoa Touch gained the NSRegularExpression class, a full-featured regular expression processor built on top of the ICU regular expression library.
NSError *error = nil;
NSString *pattern = @"foo(bar|fy)";

NSRegularExpression *regularExpression = [NSRegularExpression regularExpressionWithPattern:pattern 
                                                                                   options:0 
                                                                                     error:&error];
if ( ! regularExpression) {
  NSLog(@"Error in pattern <%@>: %@", pattern, error);
  return;
}

NSString *string = @"foofy";
NSRange range = NSMakeRange(0, [string length]);
NSArray *matches = [regularExpression matchesInString:string 
                                              options:0 
                                                range:range];
if ([matches count]) {
  NSTextCheckingResult *firstMatch = [matches objectAtIndex:0];
  NSLog(@"Found %lu submatches", (unsigned long)[firstMatch numberOfRanges]);
  for (NSUInteger i = 0; i < [firstMatch numberOfRanges]; ++i) {
    // use a distinct name so we don't shadow the outer range variable
    NSRange submatchRange = [firstMatch rangeAtIndex:i];
    NSString *submatch = [string substringWithRange:submatchRange];
    NSLog(@"submatch %lu: <%@>", (unsigned long)i, submatch);      
  }
} else {
  NSLog(@"Pattern <%@> doesn't match string <%@>", pattern, string);
}
This is much easier to use than the underlying ICU C library and is a good choice for iOS apps that target 4.0 and later.

The RegexKitLite library provides an alternate Objective-C wrapper around the low level ICU regular expression library that supports both Mac OS X and iOS, including iOS 3.0. RegexKitLite provides a full-featured set of regular expression methods in a category that extends NSString; it can do matching, searching and replacing, and it supports subgroups. RegexKitLite is open source and available under a BSD license.
NSString *string = @"foofy";
NSString *pattern = @"foo(bar|fy)";

NSArray *submatches = [string captureComponentsMatchedByRegex:pattern];
if ([submatches count]) {
  NSLog(@"Found %lu submatches", (unsigned long)[submatches count]);
  for (NSUInteger i = 0; i < [submatches count]; ++i) {
    NSLog(@"submatch %lu: <%@>", (unsigned long)i, [submatches objectAtIndex:i]);      
  }
} else {
  NSLog(@"Pattern <%@> doesn't match string <%@>", pattern, string);
}
Code written with RegexKitLite is a little more concise than the equivalent NSRegularExpression code, but the two are comparable in power, since they're both built on top of ICU.

Which one to use?
Unless you're writing cross-platform code in C or need to target very old versions of iOS, I recommend you avoid using the C regular expression libraries. If you must use one, the POSIX regex functions are the easiest to get started with, but watch out for implementation differences if you're targeting non-Apple platforms. Both PCRE and ICU are good cross-platform choices: choose ICU if you also need robust Unicode support, or PCRE if you're mainly working with eight-bit encodings.
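If you do end up on the POSIX functions, here is a minimal sketch of the same foo(bar|fy) subgroup match used in the Objective-C examples. firstSubgroup() is a hypothetical helper name, not part of any library. Note the REG_EXTENDED flag: in POSIX basic syntax, unescaped parentheses and the vertical bar are ordinary characters, so the pattern would not group or alternate without it.

```c
#include <regex.h>
#include <string.h>

// Hypothetical helper: copies subgroup 1 of the first match into out.
// Returns 0 on a match, 1 when there is no match, -1 on any error.
int firstSubgroup(char const *pattern, char const *text,
                  char *out, size_t outCapacity) {
  regex_t regex;
  regmatch_t matches[2];  // [0] is the whole match, [1] is subgroup 1
  int status;
  size_t length;

  if (regcomp(&regex, pattern, REG_EXTENDED) != 0) return -1;
  status = regexec(&regex, text, 2, matches, 0);
  regfree(&regex);  // matches holds plain offsets, still usable afterwards

  if (status == REG_NOMATCH) return 1;
  if (status != 0) return -1;
  if (matches[1].rm_so == -1) return -1;  // subgroup 1 took no part in the match

  length = (size_t)(matches[1].rm_eo - matches[1].rm_so);
  if (length + 1 > outCapacity) return -1;  // caller's buffer is too small
  memcpy(out, text + matches[1].rm_so, length);
  out[length] = '\0';
  return 0;
}
```

Called with the pattern "foo(bar|fy)" and the text "foofy", the helper copies "fy" into the output buffer.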

On the Objective-C side, choosing between NSRegularExpression and RegexKitLite is largely a matter of personal preference. NSRegularExpression is a no-brainer if you're targeting iOS 4.0 and later, though it's currently not supported on OS X. Adding RegexKitLite to your project is as easy as adding two files to Xcode, and it currently works on both Apple operating systems. And the -rangeOfString:options: method on NSString is handy for simple searches.

<eot /> (end of topic)
That concludes our look at strings in Objective-C. Data structures is our next topic.