Welcome back to Objective-C Tuesdays! Today we follow closely on last week's topic of
searching in strings with its sibling, replacing in strings.
It's a nightmare in C
In our series on strings in Objective-C, we've usually started by looking at C strings then moved on to
NSString
s. Today is no different. In most cases, using
NSString
is easier than doing the equivalent operation on C strings. When it comes to replacing characters in a string, using
NSString
is
significantly easier and safer. The standard C library doesn't provide much support for common string replacement operations, so you have to implement them yourself. Because of all the manual memory management required when working with C strings, this code is very error-prone: writing past the end of a buffer and forgetting to add the null terminator are two very common errors you have to watch out for.
Replacing a character
The only replacement operation that's fairly straightforward on C strings is replacing a single character with another character. Since C strings are just pointers to arrays of
char
s, you simply calculate the pointer to the
char
you want to change, dereference the pointer and assign the new
char
value.
There are two variations of this. The first one uses array notation and the second pointer operations. In both examples below, we use the
strdup()
function to make a copy of our original C string. The
strdup()
function isn't part of the C standard library, but most systems have one available (possibly named
_strdup()
) and it's easy to write one if it's missing on your system (it's available on iOS). You own the string returned by
strdup()
and are responsible for calling
free()
when you're done with it.
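If your system really is missing it, a replacement might look something like the sketch below. The name my_strdup is made up for this example so it can't clash with a system-provided strdup(), and the assert on a NULL argument is just one possible way to handle bad input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// my_strdup is a hypothetical stand-in for strdup() on systems that lack it.
// Returns a newly allocated copy of s, or NULL if allocation fails.
// The caller owns the result and must free() it.
char *my_strdup(char const *s) {
  assert(s != NULL);
  size_t size = strlen(s) + 1;  // +1 for the null terminator
  char *copy = malloc(size);
  if (copy == NULL) {
    return NULL;                // allocation failed
  }
  memcpy(copy, s, size);        // copies the null terminator too
  return copy;
}
```

Calling my_strdup("foobar") hands back a writable copy of "foobar" that you release with free(), exactly as you would with the real strdup().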
Here's how you change a character in a C string by treating it as an array of
char
s:
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
copy[3] = 'B'; // change char at index 3
NSLog(@"copy = %s", copy); // %s, not %@, because copy is a C string
// prints "copy = fooBar"
free(copy); // free copy when done
The alternative way uses pointer arithmetic:
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
char *c3 = copy + 3; // get pointer to char at index 3
*c3 = 'B'; // change char at address of c3
NSLog(@"copy = %s", copy); // %s, not %@, because copy is a C string
// prints "copy = fooBar"
free(copy); // free copy when done
As far as the compiler is concerned, these are basically the same code, so use whichever method makes the most sense. If you know the index of the
char
you want to change, use array notation. If you already have a pointer to the
char
, perhaps from calling
strchr()
, use the pointer directly.
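As a rough illustration of the pointer route, here is one way to wrap the strchr() lookup and the assignment together. The helper name replace_first_char and its 1/0 return value are inventions for this sketch, not anything from the standard library.

```c
#include <assert.h>
#include <string.h>

// replace_first_char is a hypothetical helper, not a library function.
// Replaces the first occurrence of from in the writable string s with to.
// Returns 1 if a replacement was made, 0 if from was not found.
int replace_first_char(char *s, char from, char to) {
  assert(s != NULL);
  if (from == '\0') {
    return 0;                     // never overwrite the null terminator
  }
  char *found = strchr(s, from);  // pointer to first match, or NULL
  if (found == NULL) {
    return 0;
  }
  *found = to;                    // assign through the pointer directly
  return 1;
}
```

Given a writable buffer holding "foobar", replace_first_char(buffer, 'b', 'B') changes it to "fooBar" and returns 1; searching for a char that isn't there leaves the buffer alone and returns 0.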
Replacing a substring
Replacing a substring of a C string is harder. In the case where the original and the replacement have the same number of
char
s, you can call
strncpy()
to copy over the characters.
// replacing a substring of equal length
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
char *c2 = copy + 2; // get pointer to char at index 2
strncpy(c2, "OBA", 3); // copy 3 chars
NSLog(@"copy = %s", copy);
// prints "copy = foOBAr"
free(copy); // free copy when done
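The same-length replacement only stays safe if the new characters fit inside the existing string. A minimal sketch of a guarded version follows; overwrite_at is a hypothetical name, and it uses memcpy() rather than strncpy() simply because the length is already known.

```c
#include <assert.h>
#include <string.h>

// overwrite_at is a hypothetical helper, not a library function.
// Overwrites the chars of s starting at index with the chars of replacement,
// but only if the replacement fits entirely inside the existing string.
// Returns 1 on success, 0 if it would run past the end of s.
int overwrite_at(char *s, size_t index, char const *replacement) {
  assert(s != NULL && replacement != NULL);
  size_t length = strlen(s);
  size_t replacementLength = strlen(replacement);
  if (index > length || replacementLength > length - index) {
    return 0;  // would write past the end of s
  }
  // copy only the replacement chars; the null terminator of s is untouched
  memcpy(s + index, replacement, replacementLength);
  return 1;
}
```

With a buffer holding "foobar", overwrite_at(buffer, 2, "OBA") produces "foOBAr", while overwrite_at(buffer, 4, "XYZ") refuses and returns 0 because only two chars remain after index 4.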
Replacing a substring with one of a different size is even more complex. There are three special cases that need to be handled: the substring to replace is at the start of the original, in the middle, or at the end. When the replacement substring is smaller than the original, there are some shortcuts you can take to make the code a little simpler, but we'll only show the general case.
We'll look at the second case, replacing a substring in the middle of the original. With a little extra logic, this code can be adapted to handle all three of our cases.
char const *source = "The rain in Spain";
char const *original = "rain"; // substring to find
char const *replacement = "plane"; // substring to put in its place
// calculate the required buffer size
// including space for the null terminator
size_t size = strlen(source) - strlen(original)
+ strlen(replacement) + sizeof(char);
// allocate buffer
char *buffer = calloc(size, sizeof(char));
if ( ! buffer) {
// handle allocation failure
}
// find original substring in source and
// calculate the length of the unchanged prefix
char const *originalInSource = strstr(source, original);
if ( ! originalInSource) {
// handle substring not found
}
size_t prefixLength = originalInSource - source;
// copy prefix "The " into buffer
strncpy(buffer, source, prefixLength);
// calculate where the replacement substring goes in the buffer
char *replacementInBuffer = buffer + prefixLength;
// copy replacement "plane" into buffer
strcpy(replacementInBuffer, replacement);
// find position of unchanged suffix in source and
// calculate where it goes in the buffer
char const *suffixInSource = originalInSource + strlen(original);
char *suffixInBuffer = replacementInBuffer + strlen(replacement);
// copy suffix " in Spain" into buffer
strcpy(suffixInBuffer, suffixInSource);
NSLog(@"buffer = %s", buffer);
// prints "buffer = The plane in Spain"
free(buffer); // free buffer when done
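For the curious, here is roughly what the "little extra logic" for covering the start, middle and end cases might look like when packaged as a function. This is only a sketch: replace_first is a made-up name, it replaces just the first match, and returning a plain copy when nothing matches is one design choice among several.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// replace_first is a hypothetical helper, not a library function.
// Returns a newly allocated copy of source with the first occurrence of
// original replaced by replacement. If original is empty or not found,
// returns a plain copy of source. Returns NULL if allocation fails.
// The caller owns the result and must free() it.
char *replace_first(char const *source, char const *original,
                    char const *replacement) {
  assert(source != NULL && original != NULL && replacement != NULL);
  size_t sourceLength = strlen(source);
  size_t originalLength = strlen(original);
  size_t replacementLength = strlen(replacement);

  char const *found = originalLength > 0 ? strstr(source, original) : NULL;
  if (found == NULL) {
    // nothing to replace, so return an unchanged copy
    char *copy = malloc(sourceLength + 1);
    if (copy != NULL) {
      memcpy(copy, source, sourceLength + 1);
    }
    return copy;
  }

  // prefix and suffix may each be empty (match at start or end)
  size_t prefixLength = (size_t)(found - source);
  size_t suffixLength = sourceLength - prefixLength - originalLength;

  char *buffer = malloc(prefixLength + replacementLength + suffixLength + 1);
  if (buffer == NULL) {
    return NULL;
  }
  memcpy(buffer, source, prefixLength);
  memcpy(buffer + prefixLength, replacement, replacementLength);
  // suffixLength + 1 copies the null terminator along with the suffix
  memcpy(buffer + prefixLength + replacementLength,
         found + originalLength, suffixLength + 1);
  return buffer;
}
```

Calling replace_first("The rain in Spain", "rain", "plane") yields "The plane in Spain". Because an empty prefix or suffix is just a zero-length copy, matches at the very start or very end need no separate code path.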
I won't even waste your time explaining this in detail.
No one programming in a modern computer language should have to write this code! It's extremely error prone and is one of the main causes of security vulnerabilities. If you find yourself doing this,
stop immediately and seek out one of the
many managed string libraries for C that are available. If you're writing code for iOS, you should be using
NSString
to do this.
Replacing using NSString
The
NSString
class has a number of useful methods for replacing characters and substrings in an
NSString
. Because
NSString
is immutable, these methods all return a new
NSString
instance containing the replacements, leaving the source
NSString
unchanged.
When you know the exact area of the string you want to replace, you can use the
-stringByReplacingCharactersInRange:withString:
method with an
NSRange
structure, which has fields for
location
(the zero-based index to start at) and
length
(the number of characters in the source string to replace). Because
NSString
does all the memory management for you and returns a new autoreleased
NSString
, it's child's play compared to doing this with C strings.
// replace a range in an NSString
NSString *source = @"The rain in Spain";
NSRange range;
range.location = 4; // starting index in source
range.length = 3; // number of characters to replace in source
NSString *copy = [source stringByReplacingCharactersInRange:range
withString:@"trai"];
NSLog(@"copy = %@", copy);
// prints "copy = The train in Spain"
// no need to release anything
// copy is autoreleased
This is a definite improvement over working with C strings. You might actually do this in real code without tearing your hair out or causing a
buffer overrun bug. We can make this code even more compact by using the
NSMakeRange()
function to create the
NSRange
structure.
// replace a range in an NSString
NSString *source = @"The rain in Spain";
// create range inline
NSString *copy = [source stringByReplacingCharactersInRange:NSMakeRange(4, 3)
withString:@"trai"];
NSLog(@"copy = %@", copy);
// prints "copy = The train in Spain"
// no need to release anything
// copy is autoreleased
If you don't know ahead of time what part of the string you want to replace, you can do a find and replace in one method. The
-stringByReplacingOccurrencesOfString:withString:
method will find
all occurrences of one
NSString
in another and replace them, returning a new autoreleased
NSString
.
// find and replace one substring with another
NSString *source = @"The rain in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"ain"
withString:@"oof"];
NSLog(@"copy = %@", copy);
// prints "copy = The roof in Spoof"
There is another variation of this method that gives you more control over how substrings are found and replaced. The
-stringByReplacingOccurrencesOfString:withString:options:range:
method allows you to specify a mask containing one or more options and an
NSRange
structure allowing you to restrict the operation to a section of the string. The most common option is
NSCaseInsensitiveSearch
, which matches the substring without regard to case.
// case insensitive replace
NSString *source = @"<BR>The rain<BR>in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"<br>"
withString:@"<p>"
options:NSCaseInsensitiveSearch
range:NSMakeRange(0, [source length])];
NSLog(@"copy = %@", copy);
// prints "copy = <p>The rain<p>in Spain"
Another handy search option is
NSAnchoredSearch
, which searches only at the start of the source string. Notice that you use the bitwise OR (
|
) operator to combine multiple options together.
// anchored, case insensitive replace
NSString *source = @"<BR>The rain<BR>in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"<br>"
withString:@"<p>"
options:NSAnchoredSearch | NSCaseInsensitiveSearch
range:NSMakeRange(0, [source length])];
NSLog(@"copy = %@", copy);
// prints "copy = <p>The rain<BR>in Spain"
You can combine
NSBackwardsSearch
with
NSAnchoredSearch
to only replace the substring if it occurs at the end of the source instead of at the beginning.
Replacing in NSMutableString
If you're working with an
NSMutableString
, you can still call any of the
-stringByReplacing...
methods to produce a new
NSString
, but you have the option of making the replacements in the
NSMutableString
directly. The method
-replaceCharactersInRange:withString:
is very similar to the
-stringByReplacingCharactersInRange:withString:
method:
// replace a range in an NSMutableString
NSMutableString *source = [NSMutableString stringWithString:@"The rain in Spain"];
[source replaceCharactersInRange:NSMakeRange(4, 3)
withString:@"trai"];
NSLog(@"source = %@", source);
// prints "source = The train in Spain"
The method
-replaceOccurrencesOfString:withString:options:range:
works similarly.
In most cases, there's not much of an advantage to replacing in place in an
NSMutableString
versus creating a new
NSString
containing the replacement. Use whichever operation is most convenient. If you need to make many replacements on a very long string, there
may be an advantage to replacing in place rather than creating many large temporary
NSString
instances that live in the autorelease pool.
So far, the searching and replacing methods we've seen have done only simple string matching. Next week, we'll look at
more powerful string matching using regular expressions.