Instruction 01


A Vision request is an object that asks the Vision Framework to perform a particular kind of analysis on an image. Many types in Swift and SwiftUI are structures, but when working with Vision, you’ll encounter far more classes. As you learned in the last lesson, the workflow for every request is the same: set up the request object and a matching handler, then process the results.

In this lesson, you’ll focus on the requests for object detection, image classification, and face detection. Recognizing text has some quirks, so it’s covered in the next lesson.

Choosing a Request Type

Apple provides a number of request types, all of which inherit from the VNRequest class. VNRequest is an abstract class, meaning you never use it directly, but it’s where the base initializer and the properties common to all request types are declared. Each subclass lets your code ask the Vision Framework to process an image in a different way. Once you’ve decided what kind of question you want to ask about an image, check whether Apple provides a matching request type; if not, you’ll need to find or make a Core ML model. The lists below pair each request type with the observation type it produces, grouped by the iOS version that introduced it, and a short sketch after the lists shows two of the pairings in action.

iOS 11

  • VNDetectBarcodesRequest and VNBarcodeObservation: Detects barcodes in an image.
  • VNDetectRectanglesRequest and VNRectangleObservation: Detects rectangular shapes such as business cards or book covers.
  • VNDetectFaceRectanglesRequest and VNFaceObservation: Detects the bounding boxes of faces in an image.
  • VNDetectFaceLandmarksRequest and VNFaceObservation: Detects facial features such as the eyes, nose, and mouth.
  • VNCoreMLRequest and VNCoreMLFeatureValueObservation: Uses a Core ML model to perform image analysis tasks.
  • VNDetectHorizonRequest and VNHorizonObservation: Detects the horizon in an image.
  • VNTrackObjectRequest and VNDetectedObjectObservation: Tracks a specified object across multiple frames.
  • VNTrackRectangleRequest and VNRectangleObservation: Tracks a specified rectangle across multiple frames.

iOS 13

  • VNGenerateImageFeaturePrintRequest and VNFeaturePrintObservation: Generates a compact representation of an image’s visual content.
  • VNClassifyImageRequest and VNClassificationObservation: Classifies the overall content of an image.
  • VNGenerateAttentionBasedSaliencyImageRequest and VNSaliencyImageObservation: Generates a saliency map indicating areas of an image likely to draw human attention.
  • VNGenerateObjectnessBasedSaliencyImageRequest and VNSaliencyImageObservation: Generates a saliency map highlighting the objects in an image.
  • VNRecognizeAnimalsRequest and VNRecognizedObjectObservation: Detects and recognizes animals in an image.

iOS 14

  • VNDetectContoursRequest and VNContoursObservation: Detects contours in an image.
  • VNDetectTrajectoriesRequest and VNTrajectoryObservation: Detects and tracks the movement trajectories of objects across video frames.
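
Every request type follows the same creation pattern, so here’s a minimal sketch showing two of the pairings above side by side, assuming you just want to print what each request finds:

import Vision

let barcodeRequest = VNDetectBarcodesRequest { request, error in
  guard let barcodes = request.results as? [VNBarcodeObservation] else { return }
  for barcode in barcodes {
    print(barcode.payloadStringValue ?? "Barcode with unreadable payload")
  }
}

let animalRequest = VNRecognizeAnimalsRequest { request, error in
  guard let animals = request.results as? [VNRecognizedObjectObservation] else { return }
  for animal in animals {
    print(animal.labels.first?.identifier ?? "Unlabeled animal")
  }
}

Only the observation type in the cast and the properties you read change from one request type to the next.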

Creating a Request

Once you’ve decided which request type you want to use, the next step is to create the request. If you’ve worked with URLSession or CLLocationManager, the pattern should look familiar: you create the object and hand it a completion handler that runs once there are results to process.

import Vision

let request = VNDetectFaceRectanglesRequest { request, error in
  if let error {
    print("Face detection failed: \(error)")
    return
  }
  guard let observations = request.results as? [VNFaceObservation] else {
    return
  }
  // Process the observations, such as reading each face's boundingBox
}

Creating the Handler

The request handler is different from the request’s completion handler. The completion handler processes the result data, while the request handler is where you tell the Vision Framework which image to analyze and which requests to perform on it.

let handler = VNImageRequestHandler(cgImage: image, options: [:])

do {
  try handler.perform([request])
} catch {
  print("Failed to perform request: \(error)")
}
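
One thing to keep in mind: perform(_:) runs synchronously and can take a noticeable amount of time, so in a real app you’d usually call it from a background queue. A minimal sketch, assuming image is a CGImage you’ve already loaded and request is the face request from the previous section:

import Vision

DispatchQueue.global(qos: .userInitiated).async {
  // The handler runs synchronously, so keep it off the main thread.
  let handler = VNImageRequestHandler(cgImage: image, options: [:])
  do {
    // A single handler can perform several requests on the same image in one pass.
    try handler.perform([request])
  } catch {
    print("Failed to perform request: \(error)")
  }
}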

Interpreting the Results

Once the handler has processed the request, the completion closure of the request executes. The request.results property always contains an array of the observation type that matches the request type. Your code needs to iterate through the array of observations and extract whatever information you’re after. Remember that the completion closure probably executes on a background thread, so if you need to update something that affects the UI, like a @Published property in a ViewModel, use a dispatch queue to get that work onto the main thread.
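
Here’s a minimal sketch of that pattern, assuming a hypothetical FacesViewModel that publishes a face count for SwiftUI to display:

import SwiftUI
import Vision

class FacesViewModel: ObservableObject {
  @Published var faceCount = 0

  func makeFaceRequest() -> VNDetectFaceRectanglesRequest {
    VNDetectFaceRectanglesRequest { [weak self] request, _ in
      let faces = request.results as? [VNFaceObservation] ?? []
      // The completion closure may run on a background thread,
      // so hop to the main queue before touching published state.
      DispatchQueue.main.async {
        self?.faceCount = faces.count
      }
    }
  }
}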

Using CoreML

As you’ve seen mentioned a few times, if Apple doesn’t provide a request type that fits your needs, you can always use a Core ML model. Working with a Core ML model requires only a few changes to your code, so it’s often a good idea to start development with one of Apple’s built-in requests if the final model you’ll use isn’t ready yet.

import CoreML
import Vision

// Resnet50 is the class Xcode generates when you add the model file to your project.
guard let model = try? VNCoreMLModel(for: Resnet50(configuration: MLModelConfiguration()).model) else {
  fatalError("Failed to load ResNet50 model.")
}
let request = VNCoreMLRequest(model: model) { request, error in
  DispatchQueue.main.async {
    if let results = request.results as? [VNClassificationObservation] {
      // Sort and filter results by confidence
    }
  }
}
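
The sorting and filtering that comment refers to might look something like this sketch, assuming results is the [VNClassificationObservation] array from the completion handler and that 0.1 is an arbitrary confidence cutoff:

// Keep only reasonably confident classifications, best first.
let topResults = results
  .filter { $0.confidence > 0.1 }
  .sorted { $0.confidence > $1.confidence }
  .prefix(3)
for result in topResults {
  print("\(result.identifier): \(result.confidence)")
}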