Google has released an overhaul of its Cloud Speech-to-Text API, designed to make the technology more business friendly.
According to a new entry on Google’s Cloud Platform Blog, Google Cloud Speech-to-Text now supports a selection of pre-built models, automatic punctuation, recognition metadata, and a standard service level agreement (SLA). The new API promises a reduction in word error rate of around 54 percent across all of Google’s tests, and in some areas the results are far better than that.
“Access to quality speech transcription technology opens up a world of possibilities for companies that want to connect with and learn from their users,” writes Google product manager Dan Aharon. The update takes advantage of Google’s latest research around machine learning technology, he said.
According to the blog post, Google’s Cloud Speech-to-Text API now supports:
- A selection of pre-built models for improved transcription accuracy from phone calls and video
- Automatic punctuation, to improve readability of transcribed long-form audio
- A new mechanism (recognition metadata) to tag and group your transcription workloads, and provide feedback to the Google team
- A standard service level agreement (SLA) with a commitment to 99.9% availability
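The features above surface as options on the recognition request. As a rough illustration, here is what a request body enabling the phone-call model, automatic punctuation, and recognition metadata might look like, sketched as a plain Python dict; the field names are assumptions based on the REST API surface of that era and may differ across API versions:

```python
# Illustrative Cloud Speech-to-Text request body. Field names are assumed
# from the REST surface described in the announcement; the bucket path and
# NAICS code are hypothetical placeholders.
request_body = {
    "config": {
        "languageCode": "en-US",
        "model": "phone_call",               # pre-built model tuned for phone audio
        "enableAutomaticPunctuation": True,  # punctuate long-form transcripts
        "metadata": {                        # recognition metadata to tag workloads
            "interactionType": "PHONE_CALL",
            "industryNaicsCodeOfAudio": 518210,
        },
    },
    "audio": {"uri": "gs://my-bucket/call.wav"},
}
```

A client library would serialize a structure like this and send it to the recognize endpoint; the metadata block is what lets Google group workloads and collect feedback, per the blog post.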
The company introduced the Google Cloud Speech API in May 2016 and added several new features in 2017, including word-level timestamps and support for long-form audio files up to three hours long.
Google said Cloud Speech-to-Text is available now, priced at $0.006 USD per 15 seconds of audio for all models except the video model, which costs twice as much at $0.012 USD per 15 seconds.
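At those rates, the cost of a clip is easy to estimate. A minimal sketch, assuming billing in 15-second increments with partial increments rounded up (a common billing convention, not stated in the article itself):

```python
import math

def estimate_cost(duration_seconds: float, video_model: bool = False) -> float:
    """Estimate Cloud Speech-to-Text cost in USD, assuming 15-second
    billing increments rounded up; per-increment rates from the announcement."""
    rate = 0.012 if video_model else 0.006  # USD per 15 seconds
    increments = math.ceil(duration_seconds / 15)
    return round(increments * rate, 2)  # round to whole cents

# A 10-minute phone call: 40 increments x $0.006
print(estimate_cost(600))                     # 0.24
# The same audio through the video model costs twice as much
print(estimate_cost(600, video_model=True))   # 0.48
```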