Capturing Text From Camera Using SwiftUI
Learn how to capture text from the iPhone camera into your SwiftUI app so your users can enter data more quickly and easily. By Audrey Tam.
Contents
- iOS 15 Live Text
- Live Text in Photos
- Getting Started
- Build and Run on Your Device
- It’s Magic
- Filtering Text Content Types
- Filtering Date Text
- Display a Camera Button
- Magic Method
- UIViewRepresentable
- Coordinator
- Setting the Button’s Action
- Adding ScanButton to AddPersonView
- Scan Text Into Title
- Button Menu
- Where To Go From Here?
- Apple Resources
It’s Magic
Now look at the code in AddPersonView.swift. There is absolutely nothing in the code about scanning text from the camera. This feature is part of iOS 15, and you get it free in any editable view.
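For context, here's roughly what the relevant part of AddPersonView might look like. This is only a sketch, assuming the view binds two @State strings to plain text fields; the starter project's exact code may differ:

import SwiftUI

struct AddPersonView: View {
  @State private var name = ""
  @State private var birthday = ""

  var body: some View {
    Form {
      // Nothing camera-related here: on iOS 15, Live Text input
      // comes free with any editable TextField.
      TextField("Name", text: $name)
      TextField("Birthday", text: $birthday)
    }
  }
}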
So what’s in the rest of this article? A couple of features to improve the user experience:
- Filtering camera input for specific text content types like dates, phone numbers and email addresses.
- Displaying a Scan Text button to make the camera input feature visible to your users.
You can also use a Scan Text button to capture text into an editable view that isn't a text field or text view, like the image-with-label example in the WWDC presentation.
Filtering Text Content Types
You’re probably a bit underwhelmed by this scanning, tapping and swiping procedure. If your app is looking for a specific format of information — URL, date, email or phone number — you want the camera to “see” only the relevant text and ignore the rest.
Your apps might already specify keyboard type to make it more convenient for users to enter numbers or email addresses. Maybe you also specify text content type to guide the keyboard’s suggestions and autofill.
Good news: You can use text content type to filter camera text input!
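For example, a phone-number field (a hypothetical field, not part of this project, bound to an assumed $phone string) might already carry both hints:

TextField("Phone", text: $phone)
  .keyboardType(.phonePad)              // convenient keyboard for the user
  .textContentType(.telephoneNumber)    // hint for autofill — and now for the camera

The same textContentType(_:) hint that guides the keyboard now also tells the camera what kind of text to look for.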
Filtering Date Text
Start by adding this modifier to the second (Birthday) TextField in AddPersonView.swift:
.textContentType(.dateTime)
This tells the system you expect the input text to be some form of date, time or duration. The Neural Engine’s Vision model will use this hint to filter the camera’s input for date or time text.
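In context — assuming the Birthday field is a plain TextField bound to a birthday string, as sketched earlier — the field now reads:

TextField("Birthday", text: $birthday)
  .textContentType(.dateTime)  // hint: expect a date, time or duration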
There are several text content types related to a person’s name, so why don’t you modify the Name text field? Well, for now, camera input only works with a few text content types.
Of all the text content types in the Attributes inspector Text Content Type menu, the camera currently filters for only fullStreetAddress, telephoneNumber, emailAddress, URL, shipmentTrackingNumber, flightNumber and dateTime.
OK, time to see if your modifier helps.
Build and run on your device and tap the + button. In the Add Person view, tap the Birthday text field then tap it again:
Now the camera only highlights text that relates to dates or times:
As before, any detected date or time text appears immediately in the text field. You must still tap Insert to accept the text.
What a great way to speed up text input from the camera!
Similarly, if you change dateTime to emailAddress, the camera will focus only on email addresses.
Display a Camera Button
Everything so far is built into iOS 15. But you can add the relevant code to your apps, too.
For example, to make the camera input feature more visible to your users, you can add a button that sets the whole magical process in motion. Once you know how the magic happens, you can use it to scan text from the camera into views that aren’t text fields or text views.
Magic Method
The new method captureTextFromCamera(responder:identifier:) is the key to the magic, which starts when your app calls this method to launch the camera. The responder must conform to UIResponder and UIKeyInput. A responder uses UIKeyInput methods to implement simple text entry.

Uh oh, UI prefixes … Yes indeed, captureTextFromCamera(responder:identifier:) is a UIAction, so you need a UIView to call it. You'll create a UIButton that AddPersonView can display. You'll set the action of this button to captureTextFromCamera(responder:identifier:). And the action's responder will pass any text captured from the camera to a TextField in AddPersonView.
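Stripped of everything else, the call looks roughly like this. It's only a sketch: someResponder stands in for an object conforming to both UIResponder and UIKeyInput, which you'll build over the next few steps:

import UIKit

// Create the system-provided "capture text from camera" action...
let textFromCamera = UIAction.captureTextFromCamera(
  responder: someResponder,  // a UIResponder that adopts UIKeyInput
  identifier: nil)
// ...and hand it to a UIButton as its primary action.
let button = UIButton(primaryAction: textFromCamera)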
UIViewRepresentable
To create a UIView you can use in a SwiftUI app, you need to build a structure that conforms to UIViewRepresentable.

First, create a new Swift file and name it ScanButton. In this new file, replace import Foundation with the following code:
import SwiftUI

struct ScanButton: UIViewRepresentable {
  func makeUIView(context: Context) -> UIButton {
    let button = UIButton()
    return button
  }

  func updateUIView(_ uiView: UIButton, context: Context) { }
}
To conform to UIViewRepresentable, ScanButton must implement makeUIView(context:) and updateUIView(_:context:).

In this minimal form, makeUIView(context:) simply creates a UIButton. AddPersonView won't update the button, so updateUIView(_:context:) is empty.
Coordinator
Tapping the button can capture text that ScanButton must pass to AddPersonView. To transfer data from a UIView to a SwiftUI View, ScanButton needs a coordinator.

Add this code inside ScanButton:
func makeCoordinator() -> Coordinator {
  Coordinator(self)
}

class Coordinator: UIResponder, UIKeyInput {
  let parent: ScanButton
  init(_ parent: ScanButton) { self.parent = parent }

  var hasText = false
  func insertText(_ text: String) { }
  func deleteBackward() { }
}
ScanButton calls makeCoordinator() before makeUIView(context:) and stores the Coordinator object in context.coordinator.

The action captureTextFromCamera(responder:identifier:) needs a UIResponder argument that conforms to UIKeyInput, so you make Coordinator a subclass of UIResponder and add the UIKeyInput protocol. Implementing this protocol enables the coordinator to handle text input.

UIKeyInput requires you to provide hasText, insertText(_:) and deleteBackward(). You want camera input, not keyboard input, so you only have to implement insertText(_:) to handle the camera input. The value of hasText doesn't matter, so set it to false. And deleteBackward() doesn't need to do anything.
The purpose of Coordinator is to pass text from the camera back to the SwiftUI view that calls ScanButton, so ScanButton needs a binding to a String property in the SwiftUI view.

Add this property at the top of ScanButton:
@Binding var text: String
AddPersonView will pass either $name or $birthday to ScanButton.

Now you can finish setting up Coordinator. Add this line to insertText(_:):
parent.text = text
Yes, this really is all Coordinator needs to do!
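To preview where this is heading: once ScanButton is finished, AddPersonView can place it next to a text field and share the same binding. A sketch, assuming the $name state property shown earlier and a hypothetical frame size — the upcoming section covers the real integration:

HStack {
  TextField("Name", text: $name)
  ScanButton(text: $name)        // captured text lands in name via the binding
    .frame(width: 56, height: 36)  // hypothetical sizing
}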