Image Depth Maps Tutorial for iOS: Getting Started
Learn how to use the powerful image manipulation frameworks on iOS to work with image depth maps in only a few lines of code. By Owen L Brown.
Contents
- Getting Started
- Reading Depth Data
- Implementing the Depth Data
- How Does the iPhone Do This?
- Depth vs Disparity
- Creating a Mask
- Setting up the Left Side of the Mask
- Setting up the Right Side of the Mask
- Combining the Two Masks
- Your First Depth-Inspired Filter
- Color Highlight Filter
- Change the Focal Length
- More About AVDepthData
- Where to Go From Here?
Let’s be honest. We, the human race, will eventually create robots that will take over the world, right? One thing that will be super important to our future robot masters will be good depth perception. Without it, how will they know if it’s really a human they’ve just imprisoned or just a cardboard cutout of a human? One way they can do this is by using depth maps.
Before robots can use depth maps, however, they need to be programmed to understand them. That’s where you come in! In this tutorial, you’ll learn about the APIs Apple provides for image depth maps. Throughout the tutorial, you’ll:
- Learn how the iPhone generates depth information.
- Read depth data from images.
- Combine this depth data with filters to create neat effects.
So what are you waiting for? Your iPhone wants to start seeing in 3D!
Getting Started
Download the starter project by clicking the Download Materials button at the top or bottom of the tutorial.
Before you begin, you need Xcode 11 or later. Running this tutorial directly on a device is highly recommended. To do so, you need an iPhone running iOS 13 or later.
Once that’s done, you can explore the materials for this tutorial. The bundled images include depth information to use with the tutorial.
If you prefer and you have a dual-camera iPhone, you can take your own images to use with this tutorial. To take pictures that include depth data, the iPhone needs to be running iOS 11 or later. Don’t forget to use Portrait mode in the Camera app.
Build and run. You’ll see this:
Tapping on one image cycles to the next. If you add your own pictures, make sure they have the .jpg file extension.
In this tutorial, you’ll fill in the functionality of the Depth, Mask and Filtered segments that you can see right at the bottom of the screen. Feel free to tap on them. They don’t do much right now. They will soon!
Reading Depth Data
The most important class in the iOS SDK for depth data is AVDepthData.
Different image formats store depth data slightly differently. HEIC images store it as metadata, while JPGs store it as a second image inside the same file.
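If you're curious whether a particular photo contains any depth information at all, a quick check like the sketch below will tell you. This helper isn't part of the starter project; auxiliaryDepthType(at:) and imageURL are illustrative names only.
import Foundation
import ImageIO

// Illustrative helper (not in the starter project): reports which kind of
// depth-related auxiliary data, if any, the file at `imageURL` contains.
func auxiliaryDepthType(at imageURL: URL) -> String? {
  guard let source = CGImageSourceCreateWithURL(imageURL as CFURL, nil) else {
    return nil
  }
  if CGImageSourceCopyAuxiliaryDataInfoAtIndex(
    source, 0, kCGImageAuxiliaryDataTypeDisparity) != nil {
    return "disparity"
  }
  if CGImageSourceCopyAuxiliaryDataInfoAtIndex(
    source, 0, kCGImageAuxiliaryDataTypeDepth) != nil {
    return "depth"
  }
  return nil // no embedded depth data
}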
You generally use AVDepthData to extract this auxiliary data from an image, so that's the first step you'll take in this tutorial. Open SampleImage.swift and add the following method at the bottom of SampleImage:
static func depthDataMap(forItemAt url: URL) -> CVPixelBuffer? {
  // 1
  guard let source = CGImageSourceCreateWithURL(url as CFURL, nil) else {
    return nil
  }
  // 2
  let cfAuxDataInfo = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
    source,
    0,
    kCGImageAuxiliaryDataTypeDisparity
  )
  guard let auxDataInfo = cfAuxDataInfo as? [AnyHashable: Any] else {
    return nil
  }
  // 3
  let cfProperties = CGImageSourceCopyPropertiesAtIndex(source, 0, nil)
  guard
    let properties = cfProperties as? [CFString: Any],
    let orientationValue = properties[kCGImagePropertyOrientation] as? UInt32,
    let orientation = CGImagePropertyOrientation(rawValue: orientationValue)
  else {
    return nil
  }
  // 4
  guard var depthData = try? AVDepthData(
    fromDictionaryRepresentation: auxDataInfo
  ) else {
    return nil
  }
  // 5
  if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
    depthData = depthData.converting(
      toDepthDataType: kCVPixelFormatType_DisparityFloat32
    )
  }
  // 6
  return depthData.applyingExifOrientation(orientation).depthDataMap
}
OK, that was quite a bit of code, but here's what you did:
1. First, you create a CGImageSource that represents the input file.
2. From the image source at index 0, you copy the disparity data from its auxiliary data. You'll learn more about what that means later, but you can think of it as depth data for now. The index is 0 because there's only one image in the image source.
3. The image's orientation is stored as separate metadata. To correctly align the depth data, you extract this orientation using CGImageSourceCopyPropertiesAtIndex(_:_:_:) so you can apply it later.
4. You create an AVDepthData from the auxiliary data you read in.
5. You ensure the depth data is in the format you need, 32-bit floating-point disparity information, and convert it if it isn't.
6. Finally, you apply the correct orientation and return the depth data map.
Now that you’ve set up the depth data, it’s time to put it to good use!
Implementing the Depth Data
Now, before you can run this, you need to update depthData(forItemAt:). Replace its implementation with the following:
// 1
guard let depthDataMap = depthDataMap(forItemAt: url) else { return nil }
// 2
depthDataMap.normalize()
// 3
let ciImage = CIImage(cvPixelBuffer: depthDataMap)
return UIImage(ciImage: ciImage)
With this code:
1. Using your new depthDataMap(forItemAt:), you read the depth data into a CVPixelBuffer.
2. You then normalize the depth data using a provided extension to CVPixelBuffer. This makes sure all the pixels are between 0.0 and 1.0, where 0.0 is the farthest pixel and 1.0 is the nearest.
3. Finally, you convert the depth data to a CIImage, then to a UIImage, and return it.
To see how normalize() works, look in CVPixelBufferExtension.swift. It loops through every value in the 2D array and keeps track of the minimum and maximum values seen. It then loops through all the values again and uses the min and max to scale each value into the 0.0 to 1.0 range.
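The starter project ships with its own implementation, but if you want a feel for that two-pass approach, here's a rough sketch of what such an extension might look like. This is not the project's code; the method is named normalizeSketch() to make that clear.
import CoreVideo

extension CVPixelBuffer {
  // Rough sketch of two-pass normalization for a 32-bit float disparity buffer.
  // Not the starter project's normalize(); shown here for illustration only.
  func normalizeSketch() {
    let width = CVPixelBufferGetWidth(self)
    let height = CVPixelBufferGetHeight(self)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(self)

    CVPixelBufferLockBaseAddress(self, [])
    defer { CVPixelBufferUnlockBaseAddress(self, []) }

    guard let baseAddress = CVPixelBufferGetBaseAddress(self) else { return }

    // Pass 1: find the minimum and maximum values in the buffer.
    var minValue = Float.greatestFiniteMagnitude
    var maxValue = -Float.greatestFiniteMagnitude
    for y in 0..<height {
      let row = (baseAddress + y * bytesPerRow).assumingMemoryBound(to: Float.self)
      for x in 0..<width {
        minValue = min(minValue, row[x])
        maxValue = max(maxValue, row[x])
      }
    }

    // Pass 2: rescale every value into the 0.0...1.0 range.
    let range = maxValue - minValue
    guard range > 0 else { return }
    for y in 0..<height {
      let row = (baseAddress + y * bytesPerRow).assumingMemoryBound(to: Float.self)
      for x in 0..<width {
        row[x] = (row[x] - minValue) / range
      }
    }
  }
}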
Build and run, then tap the Depth segment of the segmented control at the bottom.
Awesome! This is essentially a visual representation of the depth data. The whiter the pixel, the closer it is; the darker the pixel, the farther away it is. The normalization you did ensures that the farthest pixel is solid black and the nearest pixel is solid white. Everything else falls somewhere in that range of gray.
Great job!
How Does the iPhone Do This?
In a nutshell, the iPhone’s dual cameras are imitating stereoscopic vision.
Try this. Hold your index finger close in front of your nose, pointing upward. Close your left eye. Without moving your finger or head, simultaneously open your left eye and close your right eye.
Now quickly switch back and forth closing one eye and opening the other. Pay attention to the relative location of your finger to objects in the background. See how your finger seems to make large jumps left and right compared to objects further away?
The closer an object is to your eyes, the larger the change in its relative position compared to the background. Does this sound familiar? It’s a parallax effect!
The iPhone's dual cameras are like its eyes, looking at two images taken at a slight offset from one another. It finds corresponding features in the two images and calculates how many pixels they have moved between the two views. This change in pixels is called disparity.
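To make the relationship concrete, here's a back-of-the-envelope sketch based on a simple pinhole stereo model. The baseline and focal length numbers are invented purely for illustration; they aren't the iPhone's real calibration values.
// Simplified pinhole stereo model: disparity is inversely proportional to depth.
// The default baseline and focal length are made-up illustrative values.
func disparityInPixels(depthInMeters: Float,
                       baselineInMeters: Float = 0.014,
                       focalLengthInPixels: Float = 2_800) -> Float {
  // Closer objects (smaller depth) produce larger disparities.
  return baselineInMeters * focalLengthInPixels / depthInMeters
}

let finger = disparityInPixels(depthInMeters: 0.3) // near object: roughly 130 pixels
let wall = disparityInPixels(depthInMeters: 5.0)   // far object: roughly 8 pixels
That large difference in disparity is exactly why your finger seemed to jump so much more than the background did.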