Chapters

Hide chapters

Data Structures & Algorithms in Swift

Third Edition · iOS 13 · Swift 5.1 · Xcode 11

Before You Begin

Section 0: 3 chapters
Show chapters Hide chapters

18. Tries
Written by Kelvin Lau

Heads up... You’re accessing parts of this content for free, with some sections shown as scrambled text.

Heads up... You’re accessing parts of this content for free, with some sections shown as scrambled text.

Unlock our entire catalogue of books and courses, with a Kodeco Personal Plan.

Unlock now

The trie (pronounced as try) is a tree that specializes in storing data that can be represented as a collection, such as English words:

A trie containing the words CAT, CUT, CUTE, TO, and B
A trie containing the words CAT, CUT, CUTE, TO, and B

Each character in a string is mapped to a node. The last node in each string is marked as a terminating node (a dot in the image above). The benefits of a trie are best illustrated by looking at it in the context of prefix matching.

In this chapter, you’ll first compare the performance of the trie to the array. Then you’ll implement the trie from scratch!

Example

You are given a collection of strings. How would you build a component that handles prefix matching? Here’s one way:

class EnglishDictionary {

  private var words: [String]
  
  func words(matching prefix: String) -> [String] {
    words.filter { $0.hasPrefix(prefix) }
  }
}

words(matching:) will go through the collection of strings and return the strings that match the prefix.

If the number of elements in the words array is small, this is a reasonable strategy. But if you’re dealing with more than a few thousand words, the time it takes to go through the words array will be unacceptable. The time complexity of words(matching:) is O(k*n), where k is the longest string in the collection, and n is the number of words you need to check.

Imagine the number of words Google needs to parse
Imagine the number of words Google needs to parse

The trie data structure has excellent performance characteristics for this type of problem; as a tree with nodes that support multiple children, each node can represent a single character.

You form a word by tracing the collection of characters from the root to a node with a special indicator — a terminator — represented by a black dot. An interesting characteristic of the trie is that multiple words can share the same characters.

To illustrate the performance benefits of the trie, consider the following example in which you need to find the words with the prefix CU.

First, you travel to the node containing C. That quickly excludes other branches of the trie from the search operation:

Next, you need to find the words that have the next letter U. You traverse to the U node:

Since that’s the end of your prefix, the trie would return all collections formed by the chain of nodes from the U node. In this case, the words CUT and CUTE would be returned. Imagine if this trie contained hundreds of thousands of words.

The number of comparisons you can avoid by employing a trie is substantial.

Implementation

As always, open up the starter playground for this chapter.

TrieNode

You’ll begin by creating the node for the trie. In the Sources directory, create a new file named TrieNode.swift. Add the following to the file:

public class TrieNode<Key: Hashable> {

  // 1
  public var key: Key?
  
  // 2
  public weak var parent: TrieNode?
  
  // 3
  public var children: [Key: TrieNode] = [:]

  // 4
  public var isTerminating = false

  public init(key: Key?, parent: TrieNode?) {
    self.key = key
    self.parent = parent
  }
}

Trie

Next, you’ll create the trie itself, which will manage the nodes. In the Sources folder, create a new file named Trie.swift. Add the following to the file:

public class Trie<CollectionType: Collection>
    where CollectionType.Element: Hashable {
  
  public typealias Node = TrieNode<CollectionType.Element>
  
  private let root = Node(key: nil, parent: nil)
  
  public init() {}
}

Insert

Tries work with any type that conforms to Collection. The trie will take the collection and represent it as a series of nodes in which each node maps to an element in the collection.

public func insert(_ collection: CollectionType) {
  // 1
  var current = root
  
  // 2
  for element in collection {
    if current.children[element] == nil {
      current.children[element] = Node(key: element, parent: current)
    }
    current = current.children[element]!
  }
  
  // 3
  current.isTerminating = true
}

Contains

contains is very similar to insert. Add the following method to Trie:

public func contains(_ collection: CollectionType) -> Bool {
  var current = root
  for element in collection {
    guard let child = current.children[element] else {
      return false
    }
    current = child
  }
  return current.isTerminating
}
example(of: "insert and contains") {
  let trie = Trie<String>()
  trie.insert("cute")
  if trie.contains("cute") {
    print("cute is in the trie")
  }
}
---Example of: insert and contains---
cute is in the trie

Remove

Removing a node in the trie is a bit more tricky. You need to be particularly careful when removing each node, since nodes can be shared between two different collections.

public func remove(_ collection: CollectionType) {
  // 1
  var current = root
  for element in collection {
    guard let child = current.children[element] else {
      return
    }
    current = child
  }
  guard current.isTerminating else {
    return
  }
  // 2
  current.isTerminating = false
  // 3
  while let parent = current.parent,
        current.children.isEmpty && !current.isTerminating {
      parent.children[current.key!] = nil
      current = parent
  }
}
example(of: "remove") {
  let trie = Trie<String>()
  trie.insert("cut")
  trie.insert("cute")
  
  print("\n*** Before removing ***")
  assert(trie.contains("cut"))
  print("\"cut\" is in the trie")
  assert(trie.contains("cute"))
  print("\"cute\" is in the trie")
  
  print("\n*** After removing cut ***")
  trie.remove("cut")
  assert(!trie.contains("cut"))
  assert(trie.contains("cute"))
  print("\"cute\" is still in the trie")
}
---Example of: remove---

*** Before removing ***
"cut" is in the trie
"cute" is in the trie

*** After removing cut ***
"cute" is still in the trie

Prefix matching

The most iconic algorithm for the trie is the prefix-matching algorithm. Write the following at the bottom of Trie.swift:

public extension Trie where CollectionType: RangeReplaceableCollection {
  
}
func collections(startingWith prefix: CollectionType) -> [CollectionType] {
  // 1
  var current = root
  for element in prefix {
    guard let child = current.children[element] else {
      return []
    }
    current = child
  }
  
  // 2
  return collections(startingWith: prefix, after: current)
}
private func collections(startingWith prefix: CollectionType,
                         after node: Node) -> [CollectionType] {
  // 1
  var results: [CollectionType] = []
  
  if node.isTerminating {
    results.append(prefix)
  }
  
  // 2
  for child in node.children.values {
    var prefix = prefix
    prefix.append(child.key!)
    results.append(contentsOf: collections(startingWith: prefix,
                                           after: child))
  }
  
  return results
}
example(of: "prefix matching") {
  let trie = Trie<String>()
  trie.insert("car")
  trie.insert("card")
  trie.insert("care")
  trie.insert("cared")
  trie.insert("cars")
  trie.insert("carbs")
  trie.insert("carapace")
  trie.insert("cargo")

  print("\nCollections starting with \"car\"")
  let prefixedWithCar = trie.collections(startingWith: "car")
  print(prefixedWithCar)

  print("\nCollections starting with \"care\"")
  let prefixedWithCare = trie.collections(startingWith: "care")
  print(prefixedWithCare)
}
---Example of: prefix matching---

Collections starting with "car"
["car", "carbs", "care", "cared", "cars", "carapace", "cargo", "card"]

Collections starting with "care"
["care", "cared"]

Key points

  • Tries provide great performance metrics in regards to prefix matching.
  • Tries are relatively memory efficient since individual nodes can be shared between many different values. For example, “car,” “carbs,” and “care” can share the first three letters of the word.
Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum here.
© 2024 Kodeco Inc.

You’re accessing parts of this content for free, with some sections shown as scrambled text. Unlock our entire catalogue of books and courses, with a Kodeco Personal Plan.

Unlock now