Vonage call transcription: adding real-time speech-to-text to Vonage
TL;DR: Integrating our speech-to-text infrastructure with the Vonage Voice API replaces fragmented recording, transcription, and enrichment stacks with a single API. By routing Vonage WebSocket streams directly to our endpoint, contact centers achieve approximately 270ms real-time latency for live agent assistance, or use post-call batch processing for automated QA scoring. Streaming is the right choice for live superviso. Async is the right choice when speaker-attributed QA scoring and full call context matter more than latency.
Key data extraction: accurately extracting names, account numbers, and intents from calls
TL;DR: Downstream contact center automation fails silently when the transcription layer misinterprets a name, transposes a digit, or attributes speech to the wrong speaker. Every QA scorecard, CRM entry, and coaching signal is ceiling-bounded by the accuracy of the layer beneath it. A wrong digit or phonetic name substitution propagates into every CRM field and compliance event that follows. Extraction precision is capped by transcription quality: Solaria-1 delivers on average 29% lower WER on conversational speech and 3x lower DER than alternatives, benchmarked across 8 providers, 7 datasets, and 74+ hours of audio.
Amazon Connect transcription: real-time speech-to-text for AWS contact centers
TL;DR: Contact centers using Amazon Connect struggle with high transcription costs and poor multilingual accuracy when relying on native tools. Routing audio via Kinesis Video Streams or S3 to Solaria-1 eliminates the Lambda 15-minute timeout risk and removes per-feature add-on costs. On conversational speech, Solaria-1 delivers on average 29% lower WER than alternatives, benchmarked across 7 datasets and 74+ hours of audio.
How to integrate live transcription API with Twilio to transcribe calls in real time
Published on Sep 28, 2023
Twilio, used by hundreds of thousands of businesses and more than ten million developers worldwide, can now integrate with our live transcription API. The integration makes it easier for users to natively transcribe any phone call in real time while using Twilio. With transcribed text at your disposal, you'll then be able to analyze, archive, and act upon voice data more effectively.
Below, you’ll find a step-by-step guide to setting up the Twilio integration with the Gladia API in JavaScript, for free.
What can you do with Twilio integration?
Any developer can use this integration to transcribe phone calls in real time.
How to implement Twilio + Gladia real-time transcription integration
Step 1: Set up your Gladia account
If you haven't already, sign up for our Speech-to-Text API at app.gladia.io and obtain your API key.
Step 2: Create and parametrize your Twilio account
Get a phone number by following the first step on the main page once you are connected to your Twilio account.
In the left panel, go to Develop > United States (US1) > Phone Numbers > Manage > Active numbers.
Click on the phone number you just created.
In the 'Configure' panel, under the 'Voice Configuration' section, set the 'A call comes in' field to 'Webhook' with URL = 'http://[your-ip-address]:[your-app-port-number]' and HTTP = 'HTTP POST'.
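Once the webhook is configured, Twilio sends an HTTP POST to that URL whenever a call comes in and expects a TwiML document in response. The sketch below shows one plausible shape for that response, assuming the server asks Twilio to stream the call audio over a WebSocket with TwiML's `<Connect><Stream>` verb. The names `buildTwiml` and `createWebhookServer` and the stream URL are hypothetical, chosen for illustration, and are not taken from the original tutorial code.

```javascript
const http = require("node:http");

// Build the TwiML returned to Twilio when a call comes in.
// <Connect><Stream> asks Twilio to open a WebSocket to `streamUrl`
// and send the call audio over it.
function buildTwiml(streamUrl) {
  // Escape characters that are not allowed raw inside an XML attribute.
  const escaped = streamUrl
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Connect><Stream url="${escaped}" /></Connect></Response>`
  );
}

// Hypothetical helper: creates (but does not start) the webhook server.
// It answers every POST with the TwiML above and rejects other methods.
function createWebhookServer(streamUrl) {
  return http.createServer((req, res) => {
    if (req.method !== "POST") {
      res.writeHead(405, { Allow: "POST" });
      res.end();
      return;
    }
    req.resume(); // the form body Twilio sends is not needed here
    res.writeHead(200, { "Content-Type": "text/xml" });
    res.end(buildTwiml(streamUrl));
  });
}
```

To try it, call something like `createWebhookServer("wss://[your-host]/media").listen(8080)`, using the same port you entered in the Twilio console. Handling the WebSocket connection and forwarding audio to the transcription API is deliberately left out of this sketch.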
Step 3: Configure your server and install dependencies
In the .env file, add a GLADIA_API_KEY variable set to the API key you obtained from Gladia’s website, and a PORT variable set to the port you used when configuring your phone number in the section above (default is 8080).
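The two variables can be read and validated at startup so that a missing key fails fast. This is a minimal sketch: `loadConfig` is a hypothetical helper, and it assumes the contents of the .env file have already been loaded into `process.env`, for example with the `dotenv` package or Node's `--env-file` flag.

```javascript
// Hypothetical helper: reads GLADIA_API_KEY and PORT from an env-like object.
// Assumes the .env file was already loaded into process.env
// (e.g. via the dotenv package or `node --env-file=.env`).
function loadConfig(env = process.env) {
  const apiKey = (env.GLADIA_API_KEY || "").trim();
  if (!apiKey) {
    throw new Error("GLADIA_API_KEY is not set; add it to your .env file");
  }

  // Fall back to 8080, the default port mentioned in the tutorial.
  const rawPort = env.PORT === undefined || env.PORT === "" ? "8080" : env.PORT;
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${rawPort}"`);
  }

  return { apiKey, port };
}
```

Calling `loadConfig()` once at startup and passing the result around keeps the rest of the server free of direct `process.env` lookups.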
Feel free to check out the video version of the tutorial for a step-by-step walkthrough with one of our software engineers, Antoine.
We hope you enjoyed this how-to tutorial! Given how much audio data still goes to waste, we’re always curious to explore the many ways in which transcription tech can be used to remedy that. Let us know if you end up using our API with Twilio, Discord, or another platform; we’d love to hear from you.
About Gladia
At Gladia, we built an optimized version of Whisper in the form of an API, adapted to real-life professional use cases and distinguished by exceptional accuracy, speed, extended multilingual capabilities and state-of-the-art features, including speaker diarization and word-level timestamps.