Vision Tutorial for iOS: What’s New With Face Detection?
Learn what’s new with Face Detection and how the latest additions to Vision framework can help you achieve better results in image segmentation and analysis. By Tom Elliott.
Contents
- Getting Started
- A Tour of the App
- Reviewing the Vision Framework
- Looking Forward
- Processing Faces
- Debug those Faces
- Selecting a Size
- Detecting Differences
- Masking Mayhem
- Assuring Quality
- Handling Quality Result
- Detecting Quality
- Offering Helpful Hints
- Segmenting Sapiens
- Using Metal
- Building Better Backgrounds
- Handling the Segmentation Request Result
- Removing the Background
- Saving the Picture
- Saving to Camera Roll
- Where to Go From Here?
Saving to Camera Roll
Next, in captureOutput(_:didOutput:from:), immediately before initializing detectFaceRectanglesRequest, add the following:
if isCapturingPhoto {
  isCapturingPhoto = false
  savePassportPhoto(from: imageBuffer)
}
Here, you reset the isCapturingPhoto flag if needed and call a method to save the passport photo with the data from the image buffer.
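The flag itself gets set elsewhere, presumably when the user taps the shutter button. Purely as an illustration of that flow (this isn't the project's actual button or view model wiring), the shutter action could be as simple as flipping the flag:

// Hypothetical illustration only — the sample project wires its shutter button
// through its own view model, so this is not the project's actual code.
func shutterButtonTapped() {
  // The next frame that arrives in captureOutput(_:didOutput:from:) gets saved
  // as the passport photo; captureOutput then resets the flag.
  isCapturingPhoto = true
}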
Finally, write the implementation for savePassportPhoto(from:):
// 1
guard let model = model else {
  return
}

// 2
imageProcessingQueue.async { [self] in
  // 3
  let originalImage = CIImage(cvPixelBuffer: pixelBuffer)
  var outputImage = originalImage

  // 4
  if model.hideBackgroundModeEnabled {
    // 5
    let detectSegmentationRequest = VNGeneratePersonSegmentationRequest()
    detectSegmentationRequest.qualityLevel = .accurate

    // 6
    try? sequenceHandler.perform(
      [detectSegmentationRequest],
      on: pixelBuffer,
      orientation: .leftMirrored
    )

    // 7
    if let maskPixelBuffer = detectSegmentationRequest.results?.first?.pixelBuffer {
      outputImage = removeBackgroundFrom(image: originalImage, using: maskPixelBuffer)
    }
  }

  // 8
  let coreImageWidth = outputImage.extent.width
  let coreImageHeight = outputImage.extent.height
  let desiredImageHeight = coreImageWidth * 4 / 3

  // 9
  let yOrigin = (coreImageHeight - desiredImageHeight) / 2
  let photoRect = CGRect(x: 0, y: yOrigin, width: coreImageWidth, height: desiredImageHeight)

  // 10
  let context = CIContext()
  if let cgImage = context.createCGImage(outputImage, from: photoRect) {
    // 11
    let passportPhoto = UIImage(cgImage: cgImage, scale: 1, orientation: .upMirrored)

    // 12
    DispatchQueue.main.async {
      model.perform(action: .savePhoto(passportPhoto))
    }
  }
}
It looks like a lot of code! Here's what's happening:
1. First, return early if the model hasn't been set up.
2. Next, dispatch to a background queue to keep the UI snappy.
3. Create a Core Image representation of the input image and a variable to store the output image.
4. Then, if the user has requested the background to be removed…
5. Create a new person segmentation request, this time without a completion handler. You want the best possible quality for the passport photo, so set the quality to accurate. This works here because you're only processing a single image, and you're performing it on a background thread.
6. Perform the segmentation request.
7. Read the results synchronously. If a mask pixel buffer exists, remove the background from the original image by calling removeBackgroundFrom(image:using:), passing it the more accurate mask. A simplified sketch of this helper follows the list.
8. At this point, outputImage contains the passport photo with the desired background. The next step is to set the width and height for the passport photo. Remember, the passport photo may not have the same aspect ratio as the camera.
9. Calculate the frame of the photo, using the full width and the vertical center of the image.
10. Convert the output image (a Core Image object) to a Core Graphics image.
11. Then, create a UIImage from the Core Graphics image.
12. Dispatch back to the main thread and ask the model to perform the save photo action.
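As a refresher, removeBackgroundFrom(image:using:) is the helper you wrote back in the Removing the Background section. A simplified sketch of the idea, not the project's exact code, scales the mask to the image and blends the person over a plain white background:

import CoreImage
import CoreImage.CIFilterBuiltins

// Simplified sketch only — your implementation from earlier in the tutorial may differ.
func removeBackgroundFrom(image: CIImage, using maskPixelBuffer: CVPixelBuffer) -> CIImage {
  // Wrap the segmentation mask in a CIImage and scale it to match the input image.
  var maskImage = CIImage(cvPixelBuffer: maskPixelBuffer)
  let scaleX = image.extent.width / maskImage.extent.width
  let scaleY = image.extent.height / maskImage.extent.height
  maskImage = maskImage.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))

  // Blend the person (foreground) over a solid white background using the mask.
  let blendFilter = CIFilter.blendWithMask()
  blendFilter.inputImage = image
  blendFilter.backgroundImage = CIImage(color: .white).cropped(to: image.extent)
  blendFilter.maskImage = maskImage
  return blendFilter.outputImage ?? image
}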
Done!
Build and run. Align your face properly and take a photo with and without background hiding enabled. After taking a photo, a thumbnail will appear on the right-hand side of the footer. Tapping the thumbnail loads a detail view of the image. If you open the Photos app, you'll also find your photos saved to the camera roll.
Note how the quality of the background replacement is better in the still image than it was in the video feed.
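The view model takes care of the actual save when it handles the .savePhoto action. If you're curious what that typically boils down to, one common approach writes the UIImage straight to the photo library. This is only a sketch, not the project's code, and it assumes the Info.plist contains the NSPhotoLibraryAddUsageDescription key:

import UIKit

// Sketch only — the sample project performs its own save in response to .savePhoto.
// Requires the NSPhotoLibraryAddUsageDescription key in Info.plist.
func writeToCameraRoll(_ photo: UIImage) {
  UIImageWriteToSavedPhotosAlbum(photo, nil, nil, nil)
}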
Where to Go From Here?
In this tutorial, you learned how to use the updated Vision framework in iOS 15 to query roll, pitch and yaw in real time. You also learned about the new person segmentation APIs.
There are still ways to improve the app. For example, you could use Core Image's smile detector to reject photos in which the subject is smiling. Or you could invert the mask to check that the real background is white when background hiding is disabled.
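For the smile check, Core Image's CIDetector can report whether a detected face is smiling. Here's a minimal sketch; the helper name is made up for illustration:

import CoreImage

// Sketch: returns true if Core Image's face detector finds a smiling face in the image.
func containsSmile(in image: CIImage) -> Bool {
  let detector = CIDetector(
    ofType: CIDetectorTypeFace,
    context: nil,
    options: [CIDetectorAccuracy: CIDetectorAccuracyHigh]
  )
  let features = detector?.features(in: image, options: [CIDetectorSmile: true]) ?? []
  return features
    .compactMap { $0 as? CIFaceFeature }
    .contains { $0.hasSmile }
}

You could run something like this on the still image before saving and ask the user to try again if it returns true.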
You could also look at publishing hasDetectedValidFace through a Combine stream. By throttling the stream, you could stop the UI from flickering rapidly when a face is on the edge of being acceptable.
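Here's a rough sketch of that idea. It assumes the view model exposes hasDetectedValidFace as a @Published property; the type and method names below are illustrative, not the project's exact API:

import Combine
import Foundation

// Sketch only — assumes the view model publishes hasDetectedValidFace with @Published.
final class FaceValidityObserver {
  private var cancellable: AnyCancellable?

  // Call with model.$hasDetectedValidFace (or whatever publisher your model exposes).
  func observe(_ hasDetectedValidFace: Published<Bool>.Publisher, onChange: @escaping (Bool) -> Void) {
    cancellable = hasDetectedValidFace
      .removeDuplicates()
      // Emit at most one value every 300 ms so a borderline face doesn't
      // make the banner and shutter button flicker.
      .throttle(for: .milliseconds(300), scheduler: RunLoop.main, latest: true)
      .sink { onChange($0) }
  }
}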
The Apple documentation is a great resource for learning more about the Vision framework. If you want to learn more about Metal, try this excellent tutorial to get you started.
We hope you enjoyed this tutorial. If you have any questions or comments, please join the forum discussion below!