Saturday 17 March 2018

Regular Expressions in Swift 4

I have recently been converting some of my Xojo code to Swift, more as an exercise to see how easy it is than for anything productive.
However, I found that using regular expressions in Swift was not as straightforward as in some other languages, such as Xojo, JavaScript and C#.

The following example will show what you need to do to find a pattern in a string and print each match to the console:

import Foundation

// Assumes `pattern` and `text` are existing String values.
if let regPattern = try? NSRegularExpression(pattern: pattern, options: []) {
    // NSRange counts UTF-16 code units, so build it from the string itself
    // rather than from text.count, which counts Characters.
    let matches = regPattern.matches(in: text, options: [],
                                     range: NSRange(text.startIndex..., in: text))
    for match in matches {
        for n in 0..<match.numberOfRanges {
            // Range(_:in:) returns nil for a capture group that did not
            // take part in the match, so those are skipped here.
            if let range = Range(match.range(at: n), in: text) {
                print(String(text[range]))
            }
        }
    }
}

As you can see, this is a lot to do for a simple regex search.
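
Part of the fiddliness is the range itself. NSRange is measured in UTF-16 code units, whereas a Swift String's count is measured in Characters, and the two disagree as soon as the text contains emoji or other characters outside the Basic Multilingual Plane. A small sketch of the difference, using a sample string made up for this illustration:

```swift
import Foundation

// Sample text invented for this example; the emoji needs two UTF-16 code units.
let sample = "Hi 👋 there"

let characterCount = sample.count        // counts Characters
let utf16Count = sample.utf16.count      // counts UTF-16 code units
let fullRange = NSRange(sample.startIndex..., in: sample)

// The NSRange length follows the UTF-16 count, not the Character count.
print(characterCount, utf16Count, fullRange.length)   // prints "10 11 11"
```

Building the range from the Character count would therefore stop one unit short of the end of this string.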

To simplify this, I have created some extension methods for the String type that let you accomplish the same thing with far less effort. The methods overload each other, so the simplest call needs only a pattern, whilst the full range of options is still available should it be required:

import Foundation

extension String {

    func regex(_ pattern:String, options: NSRegularExpression.Options,
               matchingOptions: NSRegularExpression.MatchingOptions,
               range: NSRange) -> [[String]] {
        var result = [[String]]()
        // An invalid pattern makes try? return nil, so the result stays empty.
        if let regPattern = try? NSRegularExpression(pattern: pattern,
                                                     options: options) {
            let matches = regPattern.matches(in: self, options:
                matchingOptions, range: range)
            for match in matches {
                var matchArray = [String]()
                for n in 0..<match.numberOfRanges {
                    // NSRange is measured in UTF-16 code units, so convert it
                    // with Range(_:in:) rather than offsetting by Characters.
                    // A group that did not take part in the match gives nil;
                    // store "" so the group indices stay aligned.
                    if let matchRange = Range(match.range(at: n), in: self) {
                        matchArray.append(String(self[matchRange]))
                    } else {
                        matchArray.append("")
                    }
                }
                result.append(matchArray)
            }
        }
        return result
    }

    func regex(_ pattern:String, options: NSRegularExpression.Options,
               matchingOptions: NSRegularExpression.MatchingOptions)
        -> [[String]] {
        return self.regex(pattern, options: options,
                          matchingOptions: matchingOptions,
                          range:NSRange(location:0, length:self.utf16.count))
    }

    func regex(_ pattern:String, options: NSRegularExpression.Options)
        -> [[String]] {
        return self.regex(pattern, options: options, matchingOptions: [])
    }

    func regex(_ pattern:String) -> [[String]] {
        return self.regex(pattern, options: [])
    }

}

These methods return an array of match arrays. In each match array, the first element is the entire match, whilst each subsequent element holds the text of the corresponding capture group from the regular expression.
Working from the top to bottom, the methods reduce the complexity exposed to the user, allowing them to only use the options required.
If we start with the first method, you can see that it takes the full range of options for a regular expression search. This can be called by:

let matches = text.regex(pattern, options: [],
                         matchingOptions: [],
                         range: NSRange(location:0, length:text.utf16.count))

The next method assumes the range is the full size of the string:

let matches = text.regex(pattern, options: [],
                         matchingOptions: [])

The next method assumes that the matching options are the default:

let matches = text.regex(pattern, options: [])

The final method assumes the regular expression needs only the default options:

let matches = text.regex(pattern)

As you can see, we can reduce the code down to a very manageable call.
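
Swift also supports default parameter values, which could give the same call sites from a single method instead of four overloads. The following is only a sketch of that alternative, not the approach used above: the name `regexMatches` is hypothetical (chosen so it does not clash with the overloads), and treating a nil range as "the whole string" is an assumption of this example.

```swift
import Foundation

extension String {
    // Hypothetical single-method variant; the defaults stand in for the overloads.
    func regexMatches(_ pattern: String,
                      options: NSRegularExpression.Options = [],
                      matchingOptions: NSRegularExpression.MatchingOptions = [],
                      range: NSRange? = nil) -> [[String]] {
        guard let regPattern = try? NSRegularExpression(pattern: pattern,
                                                        options: options) else {
            return []   // invalid pattern
        }
        // Assumption of this sketch: nil means "search the whole string".
        let searchRange = range ?? NSRange(startIndex..., in: self)
        var result = [[String]]()
        for match in regPattern.matches(in: self, options: matchingOptions,
                                        range: searchRange) {
            var matchArray = [String]()
            for n in 0..<match.numberOfRanges {
                if let matchRange = Range(match.range(at: n), in: self) {
                    matchArray.append(String(self[matchRange]))
                } else {
                    matchArray.append("")   // group did not take part in the match
                }
            }
            result.append(matchArray)
        }
        return result
    }
}

// Only the arguments that differ from the defaults need to be supplied.
let simple = "One two THREE".regexMatches("t\\w+")
let relaxed = "One two THREE".regexMatches("t\\w+", options: [.caseInsensitive])
print(simple)    // [["two"]]
print(relaxed)   // [["two"], ["THREE"]]
```

The overloads keep each signature explicit, whereas the default-value version keeps the logic in one place; which reads better is largely a matter of taste.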

To see the results of the match, we can simply iterate over the returned array:

for match in matches {
    for item in match {
        print(item)
    }
}

Now we no longer need to mess around with the ranges and manually extract the strings from the text.
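
To make the shape of the returned array concrete, here is a small worked sketch. The input text and pattern are invented for the example, and `captures` is a condensed, hypothetical stand-in for the extension methods so that the snippet can run on its own:

```swift
import Foundation

// Condensed stand-in for the String extension, so this snippet is self-contained.
extension String {
    func captures(_ pattern: String) -> [[String]] {
        guard let regPattern = try? NSRegularExpression(pattern: pattern,
                                                        options: []) else {
            return []
        }
        let fullRange = NSRange(startIndex..., in: self)
        return regPattern.matches(in: self, options: [], range: fullRange).map { match -> [String] in
            (0..<match.numberOfRanges).map { n -> String in
                // Empty string for a group that did not take part in the match.
                Range(match.range(at: n), in: self).map { String(self[$0]) } ?? ""
            }
        }
    }
}

// Sample input and pattern, both made up for this illustration.
let text = "width=100, height=250"
let pattern = "(\\w+)=(\\d+)"

let matches = text.captures(pattern)
for match in matches {
    // match[0] is the whole match; match[1] and match[2] are the capture groups.
    print("\(match[1]) is \(match[2]) (from \"\(match[0])\")")
}
// prints:
// width is 100 (from "width=100")
// height is 250 (from "height=250")
```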
