Using Google Speech API in Your iOS Application
Unfortunately, if you want to build an iOS application with speech recognition, there is no official API from Apple at the time of writing, unlike Android, which comes with a native development kit supported by Google.
If you have used Google services before, you will know that Google’s speech recognition is very accurate. It supports short online utterances with no language model or vocabulary configuration. Sadly, there is no official Google Speech API support for iOS, but there are workarounds we can use. Note that this is available for development and personal use only.
Today, I am going to show you how to integrate Google Speech API in your application. In this article you’ll learn:
- What the Google Speech API expects and how to request a credentials key from Google.
- How to integrate Google Speech API in your application.
[Video and Source Code]
Source code: SpeechAPIExample
[Google Speech API]
lang: any valid locale (en-us, nl-be, fr-fr, etc.)
key: credentials key from Google. You’ll see how to get the key below.
Headers: Content-Type (Ex: Content-Type: audio/x-flac; rate=44100;). Make sure the rate in your header matches the sample rate you used for your audio capture.
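The parameters above can be put together into a request along these lines. This is only a minimal sketch: the function name `makeSpeechRequest`, the `endpoint` argument (a placeholder for whichever recognize URL you use), and the choice of POST with the FLAC data as the body are my assumptions, not something taken from official documentation.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here when building on Linux
#endif

// Hypothetical helper: builds an upload request from lang, key and sample rate.
// `endpoint` is a placeholder; pass the recognize URL you actually use.
func makeSpeechRequest(endpoint: URL, lang: String, key: String,
                       sampleRate: Int, flacAudio: Data) -> URLRequest? {
    guard var components = URLComponents(url: endpoint,
                                         resolvingAgainstBaseURL: false) else {
        return nil
    }
    // lang and key travel as query parameters.
    components.queryItems = [
        URLQueryItem(name: "lang", value: lang),
        URLQueryItem(name: "key", value: key),
    ]
    guard let url = components.url else { return nil }

    var request = URLRequest(url: url)
    request.httpMethod = "POST"  // assumption: audio is sent as the request body
    // The rate here must match the sample rate used when capturing the audio.
    request.setValue("audio/x-flac; rate=\(sampleRate)",
                     forHTTPHeaderField: "Content-Type")
    request.httpBody = flacAudio
    return request
}
```

Keeping the sample rate as a single argument makes it harder for the header and the recorder settings to drift apart, as long as you pass the same value to both.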
Request a credentials key from Google
- First, make sure you are a member of the chromium-dev group. If you are not, you can subscribe to chromium-dev and choose not to receive mail. The APIs are only visible to people subscribed to that group.
- Make sure you are logged in with the Google account associated with the email address that you used to subscribe to chromium-dev.
- Go to https://cloud.google.com/console
- Click the blue Create Project button and create your own project.
- In the search box, search for “Speech API” and enable the API.
- Go to the Credentials screen, choose the iOS platform, and add credentials to your project.
Now you have your own credentials key.
Integrating Google Speech API in your application
Now that you have all the information needed to use the API, you just need to record the speech and send it to the service through the API.
You can write your own class to record audio and handle the response. If that takes too much of your time, you can look at an example in this repository by mzeeshanid. However, that repository is deprecated and contains some classes that are no longer needed. I have modified it to use only the SpeechToTextModule class. You can see my modified version here.
Now, it’s time to integrate the module into your project.
Step 5: Create the UI. In this project I use just one button to record/stop, with a background image behind the button that animates while the user’s speech is being recorded (I used a UIImage category to display the GIF file).
Step 6: Handle the record/stop action. When the user taps the record button, you need to start recording, change the button background, and start the animation to show the user that your app is recording the speech.
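The record/stop handling in Step 6 can be modelled as a small toggle that is independent of the UI. This is a rough sketch and not the code from the example project: the `SpeechRecording` protocol, its method names, and `RecordButtonController` are hypothetical names standing in for whatever recorder class you use.

```swift
// Hypothetical stand-in for whatever class actually records the audio.
protocol SpeechRecording {
    func beginRecording()
    func stopRecording()
}

// Tracks the record/stop state behind a single button.
final class RecordButtonController {
    private let recorder: SpeechRecording
    private(set) var isRecording = false

    // Called after each tap with the new state, so the view can swap the
    // button background and start or stop the animation.
    var onStateChange: ((Bool) -> Void)?

    init(recorder: SpeechRecording) {
        self.recorder = recorder
    }

    // Wire this to the button's tap action.
    func buttonTapped() {
        if isRecording {
            recorder.stopRecording()
        } else {
            recorder.beginRecording()
        }
        isRecording.toggle()
        onStateChange?(isRecording)
    }
}
```

In a view controller you would call `buttonTapped()` from the button action and update the button background and animation inside `onStateChange`.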
You can see more in my example here: SpeechAPIExample. I hope you find this post useful. If you have any questions, please leave a comment below. Thanks for reading.