Vision Tutorial for iOS: What’s New With Face Detection?
Learn what’s new with Face Detection and how the latest additions to Vision framework can help you achieve better results in image segmentation and analysis. By Tom Elliott.
Contents
- Getting Started
- A Tour of the App
- Reviewing the Vision Framework
- Looking Forward
- Processing Faces
- Debug those Faces
- Selecting a Size
- Detecting Differences
- Masking Mayhem
- Assuring Quality
- Handling Quality Result
- Detecting Quality
- Offering Helpful Hints
- Segmenting Sapiens
- Using Metal
- Building Better Backgrounds
- Handling the Segmentation Request Result
- Removing the Background
- Saving the Picture
- Saving to Camera Roll
- Where to Go From Here?
Processing Faces
Next, update processUpdatedFaceGeometry() by replacing the faceFound() case with the following:
case .faceFound(let faceGeometryModel):
  let roll = faceGeometryModel.roll.doubleValue
  let pitch = faceGeometryModel.pitch.doubleValue
  let yaw = faceGeometryModel.yaw.doubleValue
  updateAcceptableRollPitchYaw(using: roll, pitch: pitch, yaw: yaw)
Here, you pull the roll, pitch and yaw values for the detected face as doubles from the faceGeometryModel. You then pass these values to updateAcceptableRollPitchYaw(using:pitch:yaw:).
Now add the following to the implementation stub of updateAcceptableRollPitchYaw(using:pitch:yaw:):
// Vision reports roll, pitch and yaw in radians.
isAcceptableRoll = (roll > 1.2 && roll < 1.6)
// Pitch and yaw should be near zero when looking straight at the camera.
isAcceptablePitch = abs(CGFloat(pitch)) < 0.2
isAcceptableYaw = abs(CGFloat(yaw)) < 0.15
Here, you set the state for acceptable roll, pitch and yaw based on the values pulled from the face.
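If it helps to reason about these thresholds in more familiar units, remember that Vision reports the angles in radians. A tiny hypothetical helper, not part of the sample project, converts them for logging:
// Hypothetical debugging helper: convert radians to degrees.
func degrees(_ radians: Double) -> Double {
  radians * 180 / .pi
}
// degrees(1.2) ≈ 69°, degrees(1.6) ≈ 92°
// degrees(0.2) ≈ 11.5°, degrees(0.15) ≈ 8.6°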
Finally, replace calculateDetectedFaceValidity() to use the roll, pitch and yaw values to determine whether the face is valid:
hasDetectedValidFace =
  isAcceptableRoll &&
  isAcceptablePitch &&
  isAcceptableYaw
Now, open FaceDetector.swift. In detectedFaceRectangles(request:error:), replace the definition of faceObservationModel with the following:
let faceObservationModel = FaceGeometryModel(
boundingBox: convertedBoundingBox,
roll: result.roll ?? 0,
pitch: result.pitch ?? 0,
yaw: result.yaw ?? 0
)
This simply adds the now-required roll, pitch and yaw parameters to the initialization of the FaceGeometryModel object.
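For orientation, the usage above implies a model shaped roughly like the sketch below. The starter project already defines the real type; note that the angles are stored as NSNumber, which is why processUpdatedFaceGeometry() needs .doubleValue:
import Foundation
import CoreGraphics

// A sketch of FaceGeometryModel inferred from how the tutorial uses it;
// the starter project defines the actual type.
struct FaceGeometryModel {
  let boundingBox: CGRect
  let roll: NSNumber   // radians, from VNFaceObservation.roll
  let pitch: NSNumber  // radians, from VNFaceObservation.pitch
  let yaw: NSNumber    // radians, from VNFaceObservation.yaw
}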
Debug those Faces
It would be nice to add some information about the roll, pitch and yaw to the debug view so that you can see the values as you're using the app.
Open DebugView.swift and replace the DebugSection in the body declaration with:
DebugSection(observation: model.faceGeometryState) { geometryModel in
DebugText("R: \(geometryModel.roll)")
.debugTextStatus(status: model.isAcceptableRoll ? .passing : .failing)
DebugText("P: \(geometryModel.pitch)")
.debugTextStatus(status: model.isAcceptablePitch ? .passing : .failing)
DebugText("Y: \(geometryModel.yaw)")
.debugTextStatus(status: model.isAcceptableYaw ? .passing : .failing)
}
This updates the debug text to print the current values to the screen and sets the text color based on whether each value is acceptable.
Build and run.
Look straight at the camera and note how the oval is green. Now rotate your head from side to side and note how the oval turns red when you aren't looking directly at the camera. If you have debug mode turned on, notice how the yaw number changes both value and color as well.
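In case you're wondering how the oval tracks this state, it simply follows hasDetectedValidFace. Here's a minimal SwiftUI sketch of the idea; the sample app's actual overlay is more involved:
import SwiftUI

// A simplified sketch: the stroke color follows the view model's
// published hasDetectedValidFace flag.
struct FaceOvalSketch: View {
  @ObservedObject var model: CameraViewModel

  var body: some View {
    Ellipse()
      .stroke(
        model.hasDetectedValidFace ? Color.green : Color.red,
        lineWidth: 4
      )
  }
}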
Selecting a Size
Next, you want the app to detect how big or small a face is within the frame of the photo. Open CameraViewModel.swift and add the following property under the isAcceptableYaw declaration:
@Published private(set) var isAcceptableBounds: FaceBoundsState {
didSet {
calculateDetectedFaceValidity()
}
}
Then, set the initial value for this property at the bottom of init():
isAcceptableBounds = .unknown
As before, add the following to the end of invalidateFaceGeometryState():
isAcceptableBounds = .unknown
Next, in processUpdatedFaceGeometry(), add the following to the end of the faceFound case:
let boundingBox = faceGeometryModel.boundingBox
updateAcceptableBounds(using: boundingBox)
Then fill in the stub of updateAcceptableBounds(using:) with the following code:
// 1
if boundingBox.width > 1.2 * faceLayoutGuideFrame.width {
  isAcceptableBounds = .detectedFaceTooLarge
} else if boundingBox.width * 1.2 < faceLayoutGuideFrame.width {
  isAcceptableBounds = .detectedFaceTooSmall
} else {
  // 2
  if abs(boundingBox.midX - faceLayoutGuideFrame.midX) > 50 {
    isAcceptableBounds = .detectedFaceOffCentre
  } else if abs(boundingBox.midY - faceLayoutGuideFrame.midY) > 50 {
    isAcceptableBounds = .detectedFaceOffCentre
  } else {
    isAcceptableBounds = .detectedFaceAppropriateSizeAndPosition
  }
}
With this code, you:
- First check to see if the bounding box of the face is roughly the same width as the layout guide.
- Then, check if the bounding box of the face is roughly centered in the frame.
If both these checks pass, isAcceptableBounds is set to FaceBoundsState.detectedFaceAppropriateSizeAndPosition. Otherwise, it's set to the corresponding error case.
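Based on the cases used above, FaceBoundsState is an enum along these lines. This is just a sketch for reference; the starter project already defines the real one:
// A sketch of FaceBoundsState inferred from the cases used above;
// the starter project defines the actual type.
enum FaceBoundsState {
  case unknown
  case detectedFaceTooSmall
  case detectedFaceTooLarge
  case detectedFaceOffCentre
  case detectedFaceAppropriateSizeAndPosition
}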
Finally, update calculateDetectedFaceValidity() to look like this:
hasDetectedValidFace =
  isAcceptableBounds == .detectedFaceAppropriateSizeAndPosition &&
  isAcceptableRoll &&
  isAcceptablePitch &&
  isAcceptableYaw
This adds a check that the bounds are acceptable.
Build and run. Move the phone toward and away from your face and note how the oval changes color.
Detecting Differences
Currently, the FaceDetector is detecting face rectangles using VNDetectFaceRectanglesRequestRevision2. iOS 15 introduced a new revision, VNDetectFaceRectanglesRequestRevision3. So what's the difference?
Version 3 provides many useful updates for detecting face rectangles, including:
- The pitch of the detected face is now reported. You may not have noticed, but the value for the pitch so far was always 0 because it wasn't present in the face observation.
- Roll, pitch and yaw values are reported in continuous space. With VNDetectFaceRectanglesRequestRevision2, the roll and yaw were provided in discrete bins only. You can observe this yourself by using the app and turning your head from side to side: the yaw always jumps between 0 and ±0.785 radians.
- When detecting face landmarks, the location of the pupils is accurately detected. Previously, the pupils were set to the center of the eyes even when you were looking off to one side.
Time to update the app to use VNDetectFaceRectanglesRequestRevision3. You'll make use of detected pitch and observe the continuous space updates.
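If you need to support older iOS versions, you can check which revisions are available at runtime before opting in. A quick sketch using Vision's revision APIs:
import Vision

// Every Vision request class exposes its supported and default revisions.
let supported = VNDetectFaceRectanglesRequest.supportedRevisions
print("Supported: \(Array(supported)), default: \(VNDetectFaceRectanglesRequest.defaultRevision)")

if supported.contains(VNDetectFaceRectanglesRequestRevision3) {
  // Safe to opt in to revision 3 on this OS version.
}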
Open FaceDetector.swift. In captureOutput(_:didOutput:from:), update the revision property of detectFaceRectanglesRequest to revision 3:
detectFaceRectanglesRequest.revision = VNDetectFaceRectanglesRequestRevision3
Build and run.
Hold your phone up to your face. Note how the values printed in the debug output update on every frame. Pitch your head (look up to the ceiling, and down with your chin on your chest). Note how the pitch numbers also update.
Masking Mayhem
Unless you've been living under a rock, you must have noticed that more and more people are wearing masks. This is great for fighting COVID, but terrible for face recognition!
Luckily, Apple has your back. With VNDetectFaceRectanglesRequestRevision3, the Vision framework can now detect faces covered by masks. While that's nice for general-purpose face detection, it's a disaster for your passport photos app. Wearing a mask is absolutely not allowed in your passport photo! So how do you prevent people wearing masks from taking photos?
Luckily for you, Apple has also improved face capture quality. Face capture quality provides a score for a detected face. It takes into account attributes like lighting, occlusion (like masks!), blur, etc.
Please note that quality detection compares different captures of the same subject; it doesn't compare one person against another. Capture quality ranges from 0 to 1. The latest revision in iOS 15 is VNDetectFaceCaptureQualityRequestRevision2.
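As a preview of what the next sections build, here's a minimal sketch of running a capture quality request against a single frame. The function name and the orientation are assumptions for illustration, not part of the sample project:
import Vision

// A minimal sketch of a face capture quality request.
func faceCaptureQuality(in pixelBuffer: CVPixelBuffer) -> Float? {
  let request = VNDetectFaceCaptureQualityRequest()
  request.revision = VNDetectFaceCaptureQualityRequestRevision2

  // Orientation depends on your camera configuration; .leftMirrored
  // is common for the front camera in portrait.
  let handler = VNImageRequestHandler(
    cvPixelBuffer: pixelBuffer,
    orientation: .leftMirrored
  )
  try? handler.perform([request])

  // Ranges from 0 (worst) to 1 (best), and is only meaningful when
  // comparing captures of the same person.
  return request.results?.first?.faceCaptureQuality?.floatValue
}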