Tesseract OCR Tutorial for iOS
In this tutorial, you’ll learn how to read and manipulate text extracted from images using OCR by Tesseract. By Lyndsey Scott.
Contents
Tesseract OCR Tutorial for iOS
25 mins
- Getting Started
- Tesseract’s Limitations
- Adding the Tesseract Framework
- How Tesseract OCR Works
- Adding Trained Data
- Loading the Image
- Implementing Tesseract OCR
- Processing Your First Image
- Scaling Images While Preserving Aspect Ratio
- Improving OCR Accuracy
- Improving Image Quality
- Where to Go From Here?
Loading the Image
First, you’ll create a way to access images from the device’s camera or photo library.
Open ViewController.swift and insert the following into takePhoto(_:):
// 1
let imagePickerActionSheet =
  UIAlertController(title: "Snap/Upload Image",
                    message: nil,
                    preferredStyle: .actionSheet)
// 2
if UIImagePickerController.isSourceTypeAvailable(.camera) {
  let cameraButton = UIAlertAction(
    title: "Take Photo",
    style: .default) { (alert) -> Void in
      // TODO: Add more code here...
  }
  imagePickerActionSheet.addAction(cameraButton)
}
// 3
let libraryButton = UIAlertAction(
  title: "Choose Existing",
  style: .default) { (alert) -> Void in
    // TODO: Add more code here...
}
imagePickerActionSheet.addAction(libraryButton)
// 4
let cancelButton = UIAlertAction(title: "Cancel", style: .cancel)
imagePickerActionSheet.addAction(cancelButton)
// 5
present(imagePickerActionSheet, animated: true)
Here, you:
- Create an action sheet alert that will appear at the bottom of the screen.
- If the device has a camera, add a Take Photo button to the action sheet.
- Add a Choose Existing button to the action sheet.
- Add a Cancel button to the action sheet. Selecting this button removes the alert without performing an action, since it's of type .cancel.
- Finally, present the alert.
Immediately below import UIKit, add:
import MobileCoreServices
This gives ViewController access to the kUTTypeImage abstract image identifier, which you'll use to limit the image picker's media type.
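Note: if you're targeting iOS 14 or later, Apple has since deprecated MobileCoreServices' kUTTypeImage in favor of the UniformTypeIdentifiers framework. The equivalent media-type setup (an alternative, not what this tutorial's code uses) would be:

import UniformTypeIdentifiers

// iOS 14+ replacement for kUTTypeImage:
imagePicker.mediaTypes = [UTType.image.identifier]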
Now, within the cameraButton UIAlertAction's closure, replace the // TODO comment with:
// 1
self.activityIndicator.startAnimating()
// 2
let imagePicker = UIImagePickerController()
// 3
imagePicker.delegate = self
// 4
imagePicker.sourceType = .camera
// 5
imagePicker.mediaTypes = [kUTTypeImage as String]
// 6
self.present(imagePicker, animated: true, completion: {
  // 7
  self.activityIndicator.stopAnimating()
})
So when the user taps cameraButton, this code:
- Reveals the view controller’s activity indicator.
- Creates an image picker.
- Assigns the current view controller as that image picker’s delegate.
- Tells the image picker to present as a camera interface to the user.
- Limits the image picker’s media type so the user can only capture still images.
- Displays the image picker.
- Hides the activity indicator once the image picker finishes animating into view.
Similarly, within libraryButton's closure, add:
self.activityIndicator.startAnimating()
let imagePicker = UIImagePickerController()
imagePicker.delegate = self
imagePicker.sourceType = .photoLibrary
imagePicker.mediaTypes = [kUTTypeImage as String]
self.present(imagePicker, animated: true, completion: {
  self.activityIndicator.stopAnimating()
})
This is identical to the code you just added to cameraButton's closure, aside from imagePicker.sourceType = .photoLibrary. Here, you set the image picker to present the device's photo library as opposed to the camera.
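Since the two closures differ only in their sourceType, you could optionally factor the shared setup into a helper method. Here's a minimal sketch (presentImagePicker(for:) is a hypothetical helper, not part of the starter project):

private func presentImagePicker(for sourceType: UIImagePickerController.SourceType) {
  // Show the spinner while the picker loads.
  activityIndicator.startAnimating()
  let imagePicker = UIImagePickerController()
  imagePicker.delegate = self
  imagePicker.sourceType = sourceType
  // Limit the picker to still images.
  imagePicker.mediaTypes = [kUTTypeImage as String]
  present(imagePicker, animated: true) {
    self.activityIndicator.stopAnimating()
  }
}

Each action's closure would then reduce to a single call, e.g. self.presentImagePicker(for: .camera) or self.presentImagePicker(for: .photoLibrary).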
Next, to process the captured or selected image, insert the following into imagePickerController(_:didFinishPickingMediaWithInfo:):
// 1
guard let selectedPhoto =
  info[.originalImage] as? UIImage else {
    dismiss(animated: true)
    return
}
// 2
activityIndicator.startAnimating()
// 3
dismiss(animated: true) {
  self.performImageRecognition(selectedPhoto)
}
Here, you:
- Check to see whether info's .originalImage key contains an image value. If it doesn't, the image picker removes itself from view and the rest of the method doesn't execute.
- If info's .originalImage does in fact contain an image, display an activity indicator while Tesseract does its work.
- After the image picker animates out of view, pass the image into performImageRecognition.
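For the picker's delegate callbacks, including imagePickerController(_:didFinishPickingMediaWithInfo:) above, to reach your view controller, ViewController must conform to both UIImagePickerControllerDelegate and UINavigationControllerDelegate. The starter project likely declares this already; if yours doesn't, a minimal sketch looks like this:

extension ViewController: UIImagePickerControllerDelegate, UINavigationControllerDelegate {
  // Dismiss the picker if the user cancels without choosing an image.
  func imagePickerControllerDidCancel(_ picker: UIImagePickerController) {
    dismiss(animated: true)
  }
}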
You'll code performImageRecognition in the next section of the tutorial, but, for now, just open Info.plist. Hover your cursor over the top cell, Information Property List, then click the + button twice when it appears.
In the Key fields of those two new entries, add Privacy – Camera Usage Description to one and Privacy – Photo Library Usage Description to the other. Select type String for each. Then in the Value column, enter whatever text you’d like to display to the user when requesting permission to access their camera and photo library, respectively.
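If you prefer editing the plist as raw XML instead (right-click Info.plist and choose Open As ▸ Source Code), those two entries correspond to the keys below. The description strings here are placeholders; substitute your own wording:

<key>NSCameraUsageDescription</key>
<string>This app uses your camera to capture text for recognition.</string>
<key>NSPhotoLibraryUsageDescription</key>
<string>This app reads images from your photo library to recognize their text.</string>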
Build and run your project. Tap the Snap/Upload Image button and you should see the UIAlertController you just created.
Test out the action sheet options and grant the app access to your camera and/or library when prompted. Confirm the photo library and camera display as expected.
All good? If so, it’s finally time to use Tesseract!
Implementing Tesseract OCR
First, add the following below import MobileCoreServices to make the Tesseract framework available to ViewController:
import TesseractOCR
Now, in performImageRecognition(_:), replace the // TODO comment with the following:
// 1
if let tesseract = G8Tesseract(language: "eng+fra") {
  // 2
  tesseract.engineMode = .tesseractCubeCombined
  // 3
  tesseract.pageSegmentationMode = .auto
  // 4
  tesseract.image = image
  // 5
  tesseract.recognize()
  // 6
  textView.text = tesseract.recognizedText
}
// 7
activityIndicator.stopAnimating()
Since this is the meat of this tutorial, here's a detailed breakdown, line by line:
- Initialize tesseract with a new G8Tesseract object that will use both English (“eng”)- and French (“fra”)-trained language data. Note that the poem's French accented characters aren't in the English character set, so it's necessary to include the French-trained data in order for those accents to appear.
- Tesseract offers three different OCR engine modes: .tesseractOnly, which is the fastest, but least accurate, method; .cubeOnly, which is slower but more accurate since it employs more artificial intelligence; and .tesseractCubeCombined, which runs both .tesseractOnly and .cubeOnly. .tesseractCubeCombined is the slowest, but since it's the most accurate, you'll use it in this tutorial.
- Tesseract assumes, by default, that it's processing a uniform block of text, but your sample image has multiple paragraphs. Tesseract's pageSegmentationMode lets the Tesseract engine know how the text is divided. In this case, set pageSegmentationMode to .auto to allow for fully automatic page segmentation and thus the ability to recognize paragraph breaks.
- Assign the selected image to the tesseract instance.
- Tell Tesseract to get to work recognizing your text.
- Put Tesseract's recognized text output into your textView.
- Hide the activity indicator since the OCR is complete.
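One caveat worth noting: recognize() runs synchronously, so on a large image it blocks the main thread and the activity indicator may not even animate. A common refinement, sketched below under the assumption that this G8Tesseract instance is only touched from a single background queue, is to move steps 5 through 7 off the main thread:

// Inside the if-let block, replacing steps 5–7 above:
DispatchQueue.global(qos: .userInitiated).async {
  // Run the slow recognition work off the main thread.
  tesseract.recognize()
  DispatchQueue.main.async {
    // Hop back to the main thread for UI updates.
    self.textView.text = tesseract.recognizedText
    self.activityIndicator.stopAnimating()
  }
}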
Now, it’s time to test out this first batch of new code!
Processing Your First Image
In Finder, navigate to Love In A Snap/Resources/Lenore.png to find the sample image.
Lenore.png is an image of a love poem addressed to a “Lenore,” but with a few edits you can turn it into a poem that is sure to get the attention of the one you desire! :]
Although you could print a copy of the image, then snap a picture with the app to perform the OCR, you'll make it easy on yourself and add the image directly to your device's camera roll. This eliminates the potential for human error, additional lighting inconsistencies, skewed text and flawed printing, among other things. After all, the image is already dark and blurry as is.
Build and run your app. Tap Snap/Upload Image, tap Choose Existing, then choose the sample image from the photo library to run it through OCR.
Uh oh! Nothing appears! That’s because the current image size is too big for Tesseract to handle. Time to change that!