How to Make a Narrated Book Using AVSpeechSynthesizer in iOS 7
Learn how to make Siri read you a bedtime story by using one of iOS 7's newest features: AVSpeechSynthesizer.
Be a Good Delegate and Listen
Your speech synthesizer, an AVSpeechSynthesizer, has a delegate conforming to AVSpeechSynthesizerDelegate that is informed of various important events and actions in the speech synthesizer's lifecycle. You'll implement some of these delegate methods to make speech sound more natural by using the utterance properties included in WhirlySquirrelly.plist.
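For reference, these are a few of the optional callbacks that AVSpeechSynthesizerDelegate declares in iOS 7 (a partial sketch; the protocol also reports pause, continue, and cancel events):

```objc
// A few optional AVSpeechSynthesizerDelegate callbacks:

// Called when the synthesizer begins speaking an utterance.
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
  didStartSpeechUtterance:(AVSpeechUtterance *)utterance;

// Called when an utterance has been fully spoken.
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
 didFinishSpeechUtterance:(AVSpeechUtterance *)utterance;

// Called just before each range of the string is spoken, which is
// handy for karaoke-style highlighting of the text as it's read.
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
willSpeakRangeOfSpeechString:(NSRange)characterRange
                utterance:(AVSpeechUtterance *)utterance;
```

You'll implement the second of these later in this tutorial.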
Open RWTPage.h and add the following code after the declaration of displayText:
@property (nonatomic, strong, readonly) NSArray *utterances;
Open RWTPage.m and add the following code after the declaration of displayText:
@property (nonatomic, strong, readwrite) NSArray *utterances;
Note: You're following a best practice here by declaring properties in the header file as readonly and in the implementation file as readwrite. This ensures that only the object itself can set its properties.
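In general, the pattern looks like this (a schematic sketch using a hypothetical Widget class, not code from the project):

```objc
// Widget.h — outside callers get read-only access.
@interface Widget : NSObject
@property (nonatomic, copy, readonly) NSString *name;
@end

// Widget.m — a class extension redeclares the property as
// readwrite, so only Widget's own implementation can set it.
@interface Widget ()
@property (nonatomic, copy, readwrite) NSString *name;
@end
```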
Replace pageWithAttributes: with the following code:
+ (instancetype)pageWithAttributes:(NSDictionary*)attributes
{
  RWTPage *page = [[RWTPage alloc] init];

  if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSString class]]) {
    page.displayText = [attributes objectForKey:RWTPageAttributesKeyUtterances];
    page.backgroundImage = [UIImage imageNamed:[attributes objectForKey:RWTPageAttributesKeyBackgroundImage]];
    // 1
    page.utterances = @[[[AVSpeechUtterance alloc] initWithString:page.displayText]];
  } else if ([[attributes objectForKey:RWTPageAttributesKeyUtterances] isKindOfClass:[NSArray class]]) {
    NSMutableArray *utterances = [NSMutableArray arrayWithCapacity:31];
    NSMutableString *displayText = [NSMutableString stringWithCapacity:101];

    for (NSDictionary *utteranceAttributes in [attributes objectForKey:RWTPageAttributesKeyUtterances]) {
      NSString *utteranceString =
        [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceString];
      NSDictionary *utteranceProperties =
        [utteranceAttributes objectForKey:RWTUtteranceAttributesKeyUtteranceProperties];

      AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:utteranceString];
      [utterance setValuesForKeysWithDictionary:utteranceProperties];

      if (utterance) {
        [utterances addObject:utterance];
        [displayText appendString:utteranceString];
      }
    }

    page.displayText = displayText;
    page.backgroundImage = [UIImage imageNamed:[attributes objectForKey:RWTPageAttributesKeyBackgroundImage]];
    // 2
    page.utterances = [utterances copy];
  }

  return page;
}
The only new code is in sections 1 and 2, which set the page.utterances property for the NSString case and the NSArray case, respectively.
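As a concrete illustration, the NSString case would handle page data shaped like this (a hypothetical attributes dictionary; the key constants and image name are stand-ins based on the starter project):

```objc
NSDictionary *attributes = @{
  RWTPageAttributesKeyUtterances: @"Once upon a time...",
  RWTPageAttributesKeyBackgroundImage: @"page1"
};
RWTPage *page = [RWTPage pageWithAttributes:attributes];
// page.displayText is the whole string, and page.utterances holds a
// single AVSpeechUtterance wrapping it.
```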
Open RWTPageViewController.h and replace its contents below the header comments with:
#import <UIKit/UIKit.h>
@import AVFoundation;
// 1
@interface RWTPageViewController : UIViewController<AVSpeechSynthesizerDelegate>
@property (nonatomic, weak) IBOutlet UILabel *pageTextLabel;
@property (nonatomic, weak) IBOutlet UIImageView *pageImageView;
@end
In Section 1, you declared that RWTPageViewController conforms to the AVSpeechSynthesizerDelegate protocol.
Open RWTPageViewController.m and add the following property declaration just below the declaration of the synthesizer property:
@property (nonatomic, assign) NSUInteger nextSpeechIndex;
You'll use this new property to track which element of RWTPage.utterances to speak next.
Replace setupForCurrentPage with:
- (void)setupForCurrentPage
{
  self.pageTextLabel.text = [self currentPage].displayText;
  self.pageImageView.image = [self currentPage].backgroundImage;
  self.nextSpeechIndex = 0;
}
Replace speakNextUtterance with:
- (void)speakNextUtterance
{
  // 1
  if (self.nextSpeechIndex < [[self currentPage].utterances count]) {
    // 2
    AVSpeechUtterance *utterance = [[self currentPage].utterances objectAtIndex:self.nextSpeechIndex];
    self.nextSpeechIndex += 1;
    // 3
    [self.synthesizer speakUtterance:utterance];
  }
}
- In Section 1, you ensure that nextSpeechIndex is in range.
- In Section 2, you get the current utterance and advance the index.
- Finally, in Section 3, you speak the utterance.
Build and run. What happens now? You should only hear "Whisky," the first word, spoken on each page. That's because you still need to implement some AVSpeechSynthesizerDelegate methods to queue up the next utterance for speech when the synthesizer finishes speaking the current one.
Replace startSpeaking with:
- (void)startSpeaking
{
  if (!self.synthesizer) {
    self.synthesizer = [[AVSpeechSynthesizer alloc] init];
    // 1
    self.synthesizer.delegate = self;
  }
  [self speakNextUtterance];
}
In Section 1, you've made your view controller the delegate of your synthesizer.
Add the following code at the end of RWTPageViewController.m, just before the @end:
#pragma mark - AVSpeechSynthesizerDelegate Protocol
- (void)speechSynthesizer:(AVSpeechSynthesizer*)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance*)utterance
{
  NSUInteger indexOfUtterance = [[self currentPage].utterances indexOfObject:utterance];
  if (indexOfUtterance == NSNotFound) {
    return;
  }
  [self speakNextUtterance];
}
Your new code queues up the next utterance when the synthesizer finishes speaking the current utterance.
Build and run. You'll now hear a couple of differences:
- You queue up the next utterance when the current one finishes, so every word on a page is verbalized.
- When you swipe to the next or previous page, the current page's text is no longer spoken.
- Speech sounds much more natural, thanks to the utteranceProperties in Supporting Files\WhirlySquirrelly.plist. Your humble tutorial author toiled over these to hand-tune the speech.
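To give you an idea of what those hand-tuned properties look like, here is a hypothetical shape for one utterance entry in the plist (the actual key names and values are defined by the starter project). Since pitchMultiplier, rate, and postUtteranceDelay are all KVC-settable properties of AVSpeechUtterance, setValuesForKeysWithDictionary: can apply such a dictionary directly:

```xml
<dict>
  <key>utteranceString</key>
  <string>Whisky, </string>
  <key>utteranceProperties</key>
  <dict>
    <key>pitchMultiplier</key>
    <real>1.2</real>
    <key>rate</key>
    <real>0.3</real>
    <key>postUtteranceDelay</key>
    <real>0.1</real>
  </dict>
</dict>
```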
Control: You Must Learn Control
Master Yoda was wise: control is important. Now that your book speaks each utterance individually, you're going to add buttons to your UI so you can make real-time adjustments to the pitch and rate of your synthesizer's speech.
Still in RWTPageViewController.m, add the following property declarations right after the declaration of the nextSpeechIndex
property
@property (nonatomic, assign) float currentPitchMultiplier;
@property (nonatomic, assign) float currentRate;
To set these new properties, add the following methods right after the body of gotoPreviousPage:
- (void)lowerPitch
{
  if (self.currentPitchMultiplier > 0.5f) {
    self.currentPitchMultiplier = MAX(self.currentPitchMultiplier * 0.8f, 0.5f);
  }
}

- (void)raisePitch
{
  if (self.currentPitchMultiplier < 2.0f) {
    self.currentPitchMultiplier = MIN(self.currentPitchMultiplier * 1.2f, 2.0f);
  }
}

- (void)lowerRate
{
  if (self.currentRate > AVSpeechUtteranceMinimumSpeechRate) {
    self.currentRate = MAX(self.currentRate * 0.8f, AVSpeechUtteranceMinimumSpeechRate);
  }
}

- (void)raiseRate
{
  if (self.currentRate < AVSpeechUtteranceMaximumSpeechRate) {
    self.currentRate = MIN(self.currentRate * 1.2f, AVSpeechUtteranceMaximumSpeechRate);
  }
}

- (void)speakAgain
{
  if (self.nextSpeechIndex == [[self currentPage].utterances count]) {
    self.nextSpeechIndex = 0;
    [self speakNextUtterance];
  }
}
These methods are the actions that connect to your speech control buttons.

- lowerPitch and raisePitch lower and raise the speech pitch, respectively, by up to 20% per invocation, within the range [0.5f, 2.0f]. For example, three taps of Raise Pitch starting from the default of 1.0 yield 1.0 × 1.2³ ≈ 1.73.
- lowerRate and raiseRate lower and raise the speech rate, respectively, by up to 20% per invocation, within the range [AVSpeechUtteranceMinimumSpeechRate, AVSpeechUtteranceMaximumSpeechRate].
- speakAgain resets the utterance index and, if the page has finished speaking, speaks its text again from the beginning.
Create the buttons by adding the following methods right after the body of raiseRate:
- (void)addSpeechControlWithFrame:(CGRect)frame title:(NSString *)title action:(SEL)selector
{
  UIButton *controlButton = [UIButton buttonWithType:UIButtonTypeRoundedRect];
  controlButton.frame = frame;
  controlButton.backgroundColor = [UIColor colorWithWhite:0.9f alpha:1.0f];
  [controlButton setTitle:title forState:UIControlStateNormal];
  [controlButton addTarget:self
                    action:selector
          forControlEvents:UIControlEventTouchUpInside];
  [self.view addSubview:controlButton];
}

- (void)addSpeechControls
{
  [self addSpeechControlWithFrame:CGRectMake(52, 485, 150, 50)
                            title:@"Lower Pitch"
                           action:@selector(lowerPitch)];
  [self addSpeechControlWithFrame:CGRectMake(222, 485, 150, 50)
                            title:@"Raise Pitch"
                           action:@selector(raisePitch)];
  [self addSpeechControlWithFrame:CGRectMake(422, 485, 150, 50)
                            title:@"Lower Rate"
                           action:@selector(lowerRate)];
  [self addSpeechControlWithFrame:CGRectMake(592, 485, 150, 50)
                            title:@"Raise Rate"
                           action:@selector(raiseRate)];
  [self addSpeechControlWithFrame:CGRectMake(506, 555, 150, 50)
                            title:@"Speak Again"
                           action:@selector(speakAgain)];
}
addSpeechControlWithFrame:title:action: is a convenience method that adds a button to the view and links it to the method that alters the spoken text on demand.
Note: You could also create these buttons in Main.storyboard and wire up their actions in RWTPageViewController. But that would be too easy, and creating them in code is a more flexible approach.
Add the following code in viewDidLoad before [self startSpeaking]:
// 1
self.currentPitchMultiplier = 1.0f;
self.currentRate = AVSpeechUtteranceDefaultSpeechRate;
// 2
[self addSpeechControls];
Section 1 sets your new speech properties to default values, and section 2 adds your speech controls.
As the last step, replace speakNextUtterance with the following:
- (void)speakNextUtterance
{
  if (self.nextSpeechIndex < [[self currentPage].utterances count]) {
    AVSpeechUtterance *utterance = [[self currentPage].utterances objectAtIndex:self.nextSpeechIndex];
    self.nextSpeechIndex += 1;
    // 1
    utterance.pitchMultiplier = self.currentPitchMultiplier;
    // 2
    utterance.rate = self.currentRate;
    [self.synthesizer speakUtterance:utterance];
  }
}
The new code sets the pitchMultiplier and rate of the next utterance to the values you set by tapping the nifty new lower/raise buttons.
Build and run. You should see something like below.
Try clicking or tapping the various buttons while it's speaking, and note how they change the sound of the speech. Yoda would be proud; you're not a Jedi yet, but you're becoming a master of AVSpeechSynthesizer.