Finding the words

Recently I was challenged to build some code that could break up an NSString into an array of NSStrings, each of which would fit into a given width in an iOS app. Seems fairly straightforward, given that Apple has it in their text GUI elements. No dice, my friends. While Apple does give you the ability to discover the dimensions of an NSString given a UIFont and, optionally, some size constraints, it won’t tell you how to break up the NSString into individual lines.

So, what is one to do? Time to replicate some existing Apple capability. The basic idea is to split up a string into words, then build a line by adding a word at a time until it’s too long, then repeat until you’ve run out of tokens.
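To make the idea concrete before getting into fonts and regular expressions, here is a minimal sketch of the greedy fill. It is not the code I ended up with: the `GreedyLines` function and its `maxChars` parameter are made up for illustration, words are split on plain whitespace, and a character count stands in for the measured pixel width.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: greedy line filling with a character budget
// standing in for a measured pixel width.
static NSArray *GreedyLines(NSString *text, NSUInteger maxChars) {
    NSMutableArray *lines = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];

    // Simplification: split on whitespace only.
    NSArray *words = [text componentsSeparatedByCharactersInSet:
                      [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    for (NSString *word in words) {
        if ([word length] == 0) continue; // produced by runs of whitespace

        // Length of the line if this word were appended, including a joining space.
        NSUInteger candidate = [current length] + ([current length] > 0 ? 1 : 0) + [word length];
        if (candidate > maxChars && [current length] > 0) {
            // Too long: close the current line and start a new one with this word.
            [lines addObject:[NSString stringWithString:current]];
            [current setString:word];
        } else {
            if ([current length] > 0) [current appendString:@" "];
            [current appendString:word];
        }
    }
    // Whatever is left over is the last line.
    if ([current length] > 0) [lines addObject:[NSString stringWithString:current]];
    return lines;
}
```

A word longer than the budget still gets a line to itself in this sketch; the real version has to make the same call when a single word is wider than the available width.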

Splitting a string into words isn’t easy. My first stop was ParseKit. It’s a nice, fast parser for Objective-C, but it seems to have lost its support and hasn’t been updated in some time, which made it a bit harder to integrate. After a bit of pushing and pulling I managed to stuff it into the project.

Overall, ParseKit worked well enough and its compiled library wasn’t particularly big, but it was a pain to install and produced quite a few warnings. I ended up fixing a bit of code and adding a bunch of warning suppression so that I could work in peace.

I got the code to work, but I wasn’t happy with all the extra code that was hanging around to make ParseKit work. So it was back to the drawing board. I dug around the net but didn’t find anything that I was happy with, so it was time to reinvent the wheel.

What I ended up doing was using Foundation’s regular expression class, NSRegularExpression, with a hand-built regular expression.

    NSMutableString *regexPattern = [NSMutableString string];
    // A number or dollar amount
    [regexPattern appendString:@"(\\$?\\d+(\\.\\d+)?)"];
    // or
    [regexPattern appendString:@"|"];
    // possibly starting with a quote, a word, possibly followed by punctuation
    [regexPattern appendString:@"\\\"?([\\w\\']*)(\\.\\.\\.|\\.|,|-|!|\\?|:|;|\\\")?"];
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regexPattern options:NSRegularExpressionCaseInsensitive error:&error];

Lines 3, 5, and 7 are the different parts of the regular expression I built to parse words. Line 3 matches numbers as well as dollar values, line 5 is a simple ‘or’, and line 7 matches a word, consuming the leading quote if there is one and any trailing punctuation. The rest of the code in the method simply builds an array of all the tokens. Anything that sits between two matches, such as whitespace, is kept as a token of its own so that no characters are lost.
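If you want to poke at the pattern on its own, something like the following sketch works. The `MatchedTokens` function is a hypothetical helper that is not part of my category; it uses the same pattern written as one literal and returns only the non-empty matches, ignoring the text between them.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the non-empty substrings the pattern matches.
static NSArray *MatchedTokens(NSString *text) {
    // Same pattern as above, split across adjacent literals for readability.
    NSString *pattern =
        @"(\\$?\\d+(\\.\\d+)?)"   // a number or dollar amount
        @"|"                       // or
        @"\\\"?([\\w\\']*)(\\.\\.\\.|\\.|,|-|!|\\?|:|;|\\\")?"; // quote, word, punctuation

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }

    NSMutableArray *tokens = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, [text length])];
    for (NSTextCheckingResult *match in matches) {
        // The word branch can match an empty string, so skip zero-length results.
        if ([match range].length == 0) continue;
        [tokens addObject:[text substringWithRange:[match range]]];
    }
    return tokens;
}
```

Calling `MatchedTokens(@"Pay $4.50 now, please!")` should give back the dollar amount as one token and each word together with its trailing punctuation, which is the behavior the line-building step relies on.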

Here is the full code for turning a string into an array of words. It is written as a method in an NSString category, so self is the string being split:

- (NSArray *)stringToWords{
    NSMutableString *regexPattern = [NSMutableString string];
    // A number or dollar amount
    [regexPattern appendString:@"(\\$?\\d+(\\.\\d+)?)"];
    // or
    [regexPattern appendString:@"|"];
    // possibly starting with a quote, a word, possibly followed by punctuation
    [regexPattern appendString:@"\\\"?([\\w\\']*)(\\.\\.\\.|\\.|,|-|!|\\?|:|;|\\\")?"];
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regexPattern options:NSRegularExpressionCaseInsensitive error:&error];

    NSArray *matches = [regex matchesInString:self
                                      options:0
                                        range:NSMakeRange(0, [self length])];
    NSMutableArray *tokens = [NSMutableArray array];
    NSUInteger endOfLastToken = 0;
    for (NSTextCheckingResult *match in matches) {
        NSRange matchRange = [match range];
        // Text between two matches (whitespace, for example) becomes its own token.
        if (matchRange.location != endOfLastToken){
            NSRange tempRange = NSMakeRange(endOfLastToken, matchRange.location - endOfLastToken);
            [tokens addObject:[self substringWithRange:tempRange]];
        }
        [tokens addObject:[self substringWithRange:matchRange]];
        endOfLastToken = matchRange.location + matchRange.length;
    }

    return tokens;
}

And here is the full code for turning that array of tokens into an array of lines:

- (NSArray *)linesConstrainedToWidth:(CGFloat)maxWidth withFirstLineIndent:(CGFloat)indent andFont:(UIFont *)font{
    CGSize maxSize = CGSizeMake(maxWidth - indent, FLT_MAX);

    NSMutableArray *strings = [NSMutableArray array];
    NSMutableString *newString = [NSMutableString string];
    NSMutableString *oldString = [NSMutableString string];

    NSArray *tokens = [self stringToWords];
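    for (NSString *token in tokens){
        // Tentatively add the token, then measure the result.
        [newString appendString:token];
        // Note: newer SDKs deprecate this call in favor of
        // boundingRectWithSize:options:attributes:context:.
        CGSize size = [newString sizeWithFont:font constrainedToSize:maxSize lineBreakMode:UILineBreakModeWordWrap];
        if (size.height > font.lineHeight){
            // The text wrapped onto a second line: keep the last string that fit
            // and start a new line with the current token.
            [strings addObject:oldString];
            newString = [NSMutableString stringWithString:token];
            // Keep oldString in step so that two overflows in a row
            // cannot add the previous line twice.
            oldString = [NSMutableString stringWithString:token];
            // Only the first line is indented; later lines get the full width.
            maxSize = CGSizeMake(maxWidth, FLT_MAX);
        } else {
            oldString = [NSMutableString stringWithString:newString];
        }
    }
    // Whatever is left over is the last line.
    [strings addObject:newString];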

    return strings;
}

As you can see, this method lets you constrain the lines to a specific width as well as set a first-line indent, just in case.
