Swift Regex Tutorial: Getting Started
Master the pattern-matching superpowers of Swift Regex. Learn to write regular expressions that are easy to understand, work with captures and try out RegexBuilder, all while making a Marvel Movies list app! By Ehab Amer.
Sign up/Sign in
With a free Kodeco account you can download source code, track your progress, bookmark, personalise your learner profile and more!
Create accountAlready a member of Kodeco? Sign in
Sign up/Sign in
With a free Kodeco account you can download source code, track your progress, bookmark, personalise your learner profile and more!
Create accountAlready a member of Kodeco? Sign in
Contents
Swift Regex Tutorial: Getting Started
30 mins
- Getting Started
- Understanding Regular Expressions
- Swiftifying Regular Expressions
- Loading the Marvel Movies List
- Reading the Text File
- Defining the Separator
- Defining the Fields
- Matching a Row
- Looking Ahead
- Capturing Matches
- Naming Captures
- Transforming Data
- Creating a Custom Type
- Conditional Transformation
- Where to Go From Here?
Creating a Custom Type
Notice that Production Year displays Not Produced and Premiered On shows Not Announced, even for old movies and shows. This is because you haven't implemented the parsing of their data yet so .unknown is returned for their values.
The production year won't always be a single year:
- If it's a movie, the year will be just one value, for example: (2010).
- If it's a TV show, it can start in one year and finish in another: (2010-2012).
- It could be an ongoing TV show: (2010- ).
- Marvel Studios may not have announced a date yet, making it truly unknown: (I).
The value for PremieredOnInfo is similar:
- An exact date may have been set, such as: Oct 10, 2010.
- An exact date may not yet be set for a future movie or show, in which case only the year is defined: 2023.
- Dates may not yet be announced: -.
This means the data for these fields can have different forms or patterns. This is why you captured them as text and didn't specify what exactly to expect in the expression.
You'll create an expression for each possibility and compare it with the value provided. The option you'll set is the expression that matches the string as a whole.
For example, if the movie is in the future and only a year is mentioned in the Premiered On field, then the expression that's expecting a word and two numbers with a comma between them will not succeed. Only the expression that is expecting a single number will.
Conditional Transformation
Start breaking down what you'll do with the year field. The value in the three cases will be within parentheses:
- If it's a single year, the expression is:
\(\d+\)
. - If it's a range between two years, it's two numbers separated by a dash:
\(\d+-\d+\)
. - If it's an open range starting from a year, it's a digit, a dash then a space:
\(\d+-\s\)
.
Open ProductionYearInfo.swift and change the implementation of fromString(_:)
to:
public static func fromString(_ value: String) -> Self {
if let match = value.wholeMatch(of: /\((?<startYear>\d+)\)/) { // 1
return .produced(year: Int(match.startYear) ?? 0) // 2
} else if let match = value.wholeMatch(
of: /\((?<startYear>\d+)-(?<endYear>\d+\))/) { // 3
return .finished(
startYear: Int(match.startYear) ?? 0,
endYear: Int(match.endYear) ?? 0) // 4
} else if let match = value.wholeMatch(of: /\((?<startYear>\d+)–\s\)/) { // 5
return .onGoing(startYear: Int(match.startYear) ?? 0) // 6
}
return .unknown
}
Earlier, you read that there is a different way to name captured values using the regex literal. Here is how.
To capture a value in a regex, wrap the expression in parentheses. Name it using the syntax (?<name>regex)
where name is how you want to refer to the capture and regex is the regular expression to be matched.
Time to break down the code a little:
- You compare the value against an expression that expects one or more digits between parentheses. This is why you escaped one set of parentheses.
- If the expression is a full match, you capture just the digits without the parentheses and return
.produced
using the captured value and casting it to anInt
. Notice the convenience of using the captured value. - If it doesn't match the first expression, you test it against another that consists of two numbers with a dash between them.
- If that matches, you return
.finished
and use the two captured values as integers. - If the second expression didn't match, you check for the third possibility, a number followed by a dash, then a space to represent a show that's still running.
- If this matches, you return
.onGoing
using the captured value.
Each time, you use wholeMatch
to ensure the entire input string, not just a substring inside it, matches the expression.
Build and run. See the new field properly reflected on the UI.
Next, open PremieredOnInfo.swift and change the implementation of fromString(_:)
there to:
public static func fromString(_ value: String) -> Self {
let yearOnlyRegexString = #"\d{4}"# // 1
let datesRegexString = #"\S{3,4}\s.{1,2},\s\d{4}"#
guard let yearOnlyRegex = try? Regex(yearOnlyRegexString),
let datesRegex = try? Regex(datesRegexString) else { // 2
return .unknown
}
if let match = value.wholeMatch(of: yearOnlyRegex) { // 3
let result = match.first?.value as? Substring ?? "0"
return .estimatedYear(Int(result) ?? 0)
} else if let match = value.wholeMatch(of: datesRegex) { // 4
let result = match.first?.value as? Substring ?? ""
let dateStr = String(result)
let date = Date.fromString(dateStr)
return .definedDate(date)
}
return .unknown
}
This time, instead of regular expression literals, you store each regular expression in a String and then create Regex objects from those strings.
- You create two expressions to represent the two possible
value
cases you expect, either a four-digit year, or a full date consisting of a three-character full or shortened month name or a four-character month name, followed by one or two characters for the date, followed by a comma, a whitespace and a four-digit year. - Create the two Regex objects that you'll use to compare against.
- Try to match the whole string against the
yearOnlyRegexString`
. If it matches, you return.estimatedYear
and use the provided value as the year. - Otherwise, you try to match the whole string against the other Regex object,
datesRegexString
. If it matches, you return.definedDate
and convert the provided string to a date using a normal formatter.
\n
where n is the number of the capture. Take care to access the captures safely in the correct order.
Build and run. Behold the Premiered On date correctly displayed. Nice work!
Where to Go From Here?
Understanding regular expressions can turn you into a string manipulation superhero! You'll be surprised at what you can achieve in only a few lines of code. Regular expressions may challenge you initially, but they're worth it. There are many resources to help, including An Introduction to Regular Expressions, which links to even more resources!
To learn more about Swift Regex, check out this video from WWDC 2022.
You can download the completed project files by clicking the Download Materials button at the top or bottom of this tutorial.
Swift's Regex Builders are a kind of result builder. Learn more about this fascinating topic in Swift Apprentice Chapter 20: Result Builders.
We hope you enjoyed this tutorial. If you have any questions or comments, please join the forum discussion below!