Vision Framework Tutorial for iOS: Scanning Barcodes
In this Vision Framework tutorial, you’ll learn how to use your iPhone’s camera to scan QR codes and automatically open encoded URLs in Safari. By Emad Ghorbaninia.
Contents
Vision Framework Tutorial for iOS: Scanning Barcodes
20 mins
- Getting Started
- Getting Permission to Use Camera
- Starting an AVCaptureSession
- Setting Capture Session Quality
- Defining a Camera for Input
- Making an Output
- Running the Capture Session
- Vision Framework
- Vision and the Camera
- Using the Vision Framework
- Creating a Vision Request
- Vision Handler
- Vision Observation
- Adding a Confidence Score
- Using Barcode Symbology
- Opening Encoded URLs in Safari
- Setting Up Safari
- Opening Safari
- Where to Go From Here?
Vision and the Camera
The Vision Framework operates on still images. Of course, when you use the camera on your iPhone, the image moves smoothly, as you would expect from video. However, video is made up of a collection of still images played one after the other, almost like a flip book.
When using your camera with the Vision Framework, Vision splits the moving video into its component images and processes one of those images at some frequency called the sample rate.
In this tutorial, you’ll use the Vision Framework to find barcodes in images. The Vision Framework can read 17 different barcode formats, including UPC and QR codes.
In the coming sections, you’ll instruct your app to find QR codes and read their contents. Time to get started!
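If you're curious which formats your SDK supports, you can ask a barcode request itself. Here's a minimal sketch, assuming iOS 15/macOS 12 or later where supportedSymbologies() is available (earlier systems expose a deprecated class property instead):

```swift
import Vision

// Ask a barcode request which symbologies it can detect.
// supportedSymbologies() requires iOS 15+/macOS 12+ (assumption).
let request = VNDetectBarcodesRequest()
let symbologies = try request.supportedSymbologies()
for symbology in symbologies {
  print(symbology.rawValue)  // e.g. VNBarcodeSymbologyQR
}
```

The list includes UPC variants, EAN, Code 128, Aztec, QR and more, which is how you can verify the supported formats on your own SDK version.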
Using the Vision Framework
To implement the Vision Framework in your app, you’ll follow three basic steps:
- Request: When you want to detect something using the framework, you use a subclass of VNRequest to define the request.
- Handler: You process that request and perform image analysis for any detection using a VNImageRequestHandler.
- Observation: You analyze the results of your handled request with a subclass of VNObservation.
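The three steps above can be sketched end to end on a single still image. The following is a standalone example, not the tutorial's project code: it generates a QR code with Core Image's CIQRCodeGenerator filter (so it runs without a camera) and then feeds that image through a request, handler and observation:

```swift
import CoreImage
import Vision

// Generate a QR code image to analyze (stand-in for a camera frame).
let message = "https://www.kodeco.com"
let filter = CIFilter(name: "CIQRCodeGenerator")!
filter.setValue(Data(message.utf8), forKey: "inputMessage")
let ciImage = filter.outputImage!
  .transformed(by: CGAffineTransform(scaleX: 10, y: 10))
let cgImage = CIContext().createCGImage(ciImage, from: ciImage.extent)!

// 1. Request: describe what to detect.
let request = VNDetectBarcodesRequest()

// 2. Handler: perform the analysis on one still image.
let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
try handler.perform([request])

// 3. Observation: read the results.
if let observation = request.results?.first as? VNBarcodeObservation {
  print(observation.symbology.rawValue)       // e.g. VNBarcodeSymbologyQR
  print(observation.payloadStringValue ?? "") // "https://www.kodeco.com"
}
```

The tutorial follows this same request/handler/observation shape, except the image comes from the camera's sample buffer instead of Core Image.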
Time to create your first Vision request.
Creating a Vision Request
Vision provides VNDetectBarcodesRequest to detect a barcode in an image. You'll implement it now.
In ViewController.swift, find // TODO: Make VNDetectBarcodesRequest variable at the top of the file and add the following code right after it:
lazy var detectBarcodeRequest = VNDetectBarcodesRequest { request, error in
  guard error == nil else {
    self.showAlert(
      withTitle: "Barcode error",
      message: error?.localizedDescription ?? "error")
    return
  }
  self.processClassification(request)
}
In this code, you set up a VNDetectBarcodesRequest that will detect barcodes when called. When the request thinks it found a barcode, it'll pass the barcode on to processClassification(_:). You'll define processClassification(_:) in a moment.
But first, you need to revisit video and sample rates.
Vision Handler
Remember that video is a collection of images, and the Vision Framework processes one of those images at some frequency. To set up your video feed accordingly, find setupCameraLiveView() and locate the TODO you left earlier: // TODO: Set video sample rate. Then, add this code right after the comment, and before the call to addOutput(_:):
captureOutput.videoSettings =
  [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
captureOutput.setSampleBufferDelegate(
  self,
  queue: DispatchQueue.global(qos: DispatchQoS.QoSClass.default))
In this code, you set your video stream's pixel format to 32-bit BGRA. Then, you set self as the delegate for the sample buffer. When new images are available in the buffer, AVFoundation calls the appropriate delegate method from AVCaptureVideoDataOutputSampleBufferDelegate.
Because you've passed self as the delegate, you must conform ViewController to AVCaptureVideoDataOutputSampleBufferDelegate. Your class already does this and has a single callback method defined: captureOutput(_:didOutput:from:). Find this method and insert the following after // TODO: Live Vision:
// 1
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
  return
}
// 2
let imageRequestHandler = VNImageRequestHandler(
  cvPixelBuffer: pixelBuffer,
  orientation: .right)
// 3
do {
  try imageRequestHandler.perform([detectBarcodeRequest])
} catch {
  print(error)
}
Here you:
- Get an image out of the sample buffer, like a page out of a flip book.
- Make a new VNImageRequestHandler using that image.
- Perform the detectBarcodeRequest using the handler.
Vision Observation
Think back to the section on the Vision Request. There, you built the detectBarcodeRequest, which calls processClassification(_:) if it thinks it found a barcode. For your last step, you'll fill out processClassification(_:) to analyze the result of the handled request.
In processClassification(_:), locate // TODO: Main logic and add the following code right below it:
// 1
guard let barcodes = request.results else { return }
DispatchQueue.main.async { [self] in
  if captureSession.isRunning {
    view.layer.sublayers?.removeSubrange(1...)

    // 2
    for barcode in barcodes {
      guard
        // TODO: Check for QR Code symbology and confidence score
        let potentialQRCode = barcode as? VNBarcodeObservation
        else { return }
      // 3
      showAlert(
        withTitle: potentialQRCode.symbology.rawValue,
        // TODO: Check the confidence score
        message: potentialQRCode.payloadStringValue ?? "")
    }
  }
}
In this code, you:
- Get a list of potential barcodes from the request.
- Loop through the potential barcodes to analyze each one individually.
- If one of the results happens to be a barcode, you show an alert with the barcode type and the string encoded in the barcode.
Build and run again. This time, point your camera at a barcode.
Boom! You scanned it!
So far, so good. But what if there were a way to know how certain Vision is that the object you're pointing at is actually a barcode? More on this next…
Adding a Confidence Score
So far, you've worked extensively with AVCaptureSession and the Vision Framework. But there are more things you can do to tighten your implementation. Specifically, you can limit your Vision observation to recognize only QR code barcodes, and you can make sure the Vision Framework is certain it's found a QR code in an image.
Whenever your barcode observer analyzes the result of a handled request, it sets a property called confidence. This property tells you the result's confidence level, normalized to [0, 1], where 1 is the most confident.
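As a standalone illustration, checking confidence is just a comparison on VNConfidence, which is a typealias for Float. The sample scores below are made up; the 0.9 cutoff matches the one you'll apply in the next step:

```swift
import Vision

// VNConfidence is a typealias for Float, normalized to [0, 1].
let threshold: VNConfidence = 0.9  // cutoff used in this tutorial
let scores: [VNConfidence] = [0.35, 0.92, 0.99]

// Keep only results Vision is at least 90% sure about.
let confident = scores.filter { $0 > threshold }
print(confident)  // [0.92, 0.99]
```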
Inside processClassification(_:), find // TODO: Check for QR Code symbology and confidence score and replace the guard:
guard
  let potentialQRCode = barcode as? VNBarcodeObservation
  else { return }
with:
guard
  // TODO: Check for QR Code symbology and confidence score
  let potentialQRCode = barcode as? VNBarcodeObservation,
  potentialQRCode.confidence > 0.9
  else { return }
Here you ensure Vision is at least 90% confident it’s found a barcode.
Now, in the same method, locate // TODO: Check the confidence score. The message key's value below is currently potentialQRCode.payloadStringValue ?? "". Change it to:
String(potentialQRCode.confidence)
Now, instead of showing the barcode's payload in the alert, you'll show the confidence score. Because the score is a number, you convert it to a string so it can display in the alert.
Build and run. When you scan the sample QR code, you’ll see the confidence score in the alert that pops up.
Nicely done! As you can see, the confidence score of the sample QR code is quite high, meaning Vision is certain this is actually a QR code.