Using Google Speech API in Your iOS Application

 


 

Pham Van Hoang, hoangk55cd@gmail.com, is the author of this article and he contributes to RobustTechHouse Blog

 

Introduction

Unfortunately, if you want to build an iOS application with speech recognition, there are no official APIs supported by Apple at the time of writing, unlike Android, which comes with native speech recognition supported by Google.

If you have used Google services before, you will know that Google’s speech recognition service is top notch. It is very accurate and recognizes short utterances online with no language model or vocabulary configuration. Sadly, there is no official Google Speech API support for iOS, but there are workarounds that we can use. Note that this API is available for development and personal use only.

Today, I am going to show you how to integrate Google Speech API in your application. In this article you’ll learn:

  • What the Google Speech API is and how to request a credentials key from Google.
  • How to integrate Google Speech API in your application.

[Video and Source Code]

Video:  https://www.youtube.com/watch?v=O_i54tj7jv8

Source code: SpeechAPIExample

[Google Speech API]

Host: https://www.google.com/speech-api/v2/recognize

Method: POST

Input:

  • lang: any valid locale (en-us, nl-be, fr-fr, etc.)
  • key: credentials key from Google. You’ll see how to get a key below.
  • app: optional
  • output: json

Data:

  • FLAC
  • 16-bit PCM

Headers: Content-Type (e.g. Content-Type: audio/x-flac; rate=44100;). Make sure the rate in your header matches the sample rate you used for your audio capture.

You can find more about this API in gillesdemey’s google-speech-v2 repository (see References).
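To make the request format above concrete, here is a minimal, language-neutral sketch in Python that only assembles the URL and the Content-Type header; it does not record audio or call the network. It assumes the inputs (output, lang, key) are passed as query-string parameters, which the description above does not state explicitly. The function name build_request and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

# Host taken from the API description above.
SPEECH_API_HOST = "https://www.google.com/speech-api/v2/recognize"


def build_request(key, lang="en-us", sample_rate=44100):
    """Return (url, headers) for a POST of FLAC audio.

    Assumption: the inputs are sent as query-string parameters.
    """
    query = urlencode({"output": "json", "lang": lang, "key": key})
    url = f"{SPEECH_API_HOST}?{query}"
    # The rate must match the sample rate used when capturing the audio.
    headers = {"Content-Type": f"audio/x-flac; rate={sample_rate};"}
    return url, headers


# "YOUR_API_KEY" is a placeholder, not a real key.
url, headers = build_request("YOUR_API_KEY", lang="en-us", sample_rate=16000)
print(url)
print(headers["Content-Type"])
```

The FLAC bytes would then be sent as the POST body with these headers. Keeping the sample rate as a single parameter makes it harder for the header and the recording settings to drift apart.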

 

Request credentials key from Google

  • First, make sure you are a member of chromium-dev@chromium.org. If not, you can subscribe to chromium-dev and choose not to receive mail. The APIs are only visible to people subscribed to that group.
  • Make sure you are logged in with the Google account associated with the email address that you used to subscribe to chromium-dev.
  • Go to https://cloud.google.com/console
  • Click the blue Create Project button and create your own project.
  • In the search box, search for “Speech API” and enable the API.
  • Go to the Credentials screen, choose the iOS platform, and add credentials to your project.

Now you have your own credentials key.

 

Integrating Google Speech API in your application

Now that you have all the information needed to use the API, you just need to record the speech and send it to the service through the API.

You can write your own class to record the audio and handle the response. If you find that this takes too much of your time, you can look at the example in mzeeshanid’s iOS-Speech-To-Text repository (see References). However, that repository is deprecated and has some classes that are no longer needed. I have modified it to use only the SpeechToTextModule class. You can see the version I modified here.

Now, it’s time to integrate the module into your project.

Step 1: Add the SpeechToTextModule class and the speex SDK to your project.

Step 2: Because this class does not use ARC, add the “-fno-objc-arc” compiler flag to SpeechToTextModule.m (in Xcode, under Build Phases > Compile Sources).

Step 3: Put your credentials key on the GOOGLE_SPEECH_TO_TEXT_KEY line in the SpeechToTextModule.m file.

Step 4: Import the SpeechToTextModule class and SpeechToTextModuleDelegate, and create an instance.

Step 5: Create the UI. In this project I just use one button to record/stop, with a background image behind the button that animates while the user’s speech is being recorded (I have used a UIImage category to display the GIF file).

Step 6: Handle the record/stop action. When the user taps the record button, start recording, change the button background, and start the animation to let the user know that your app is recording the speech.

Step 7: When the user taps the button again, stop recording and change the button background back. The SpeechToTextModule class will send the data to the Google server.

Step 8: Handle the response in the SpeechToTextModuleDelegate method - (BOOL)didReceiveVoiceResponse:(NSDictionary *)data according to your own needs.
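What you do in Step 8 depends on the shape of the response, which this article does not show. As a rough, language-neutral sketch in Python, the snippet below assumes the unofficial format described in the gillesdemey notes: one JSON object per line, each with a "result" list whose entries hold an "alternative" list of transcripts. The sample body and the function name best_transcript are hypothetical, so check a real response before relying on this shape.

```python
import json


def best_transcript(body):
    """Return the first transcript found in the response body, or None.

    Assumption: the body contains one JSON object per line, and each
    object may carry "result" -> "alternative" -> "transcript".
    """
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        payload = json.loads(line)
        for result in payload.get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                return alternatives[0].get("transcript")
    return None


# Hypothetical sample body, made up for illustration only.
sample = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"hello world"}],"final":true}],'
    '"result_index":0}'
)
print(best_transcript(sample))
```

In the delegate method the same walk would apply to the NSDictionary you receive: look up the result list, take the first alternative, and read its transcript, treating a missing or empty list as "nothing recognized".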

You can see more in my example here: SpeechAPIExample. I hope you find this post useful. If you have any questions, please leave a comment below. Thanks for reading.

 

References

https://github.com/mzeeshanid/iOS-Speech-To-Text

https://github.com/gillesdemey/google-speech-v2

 

