How to integrate live transcription API with Twilio to transcribe calls in real time

Published on Sep 28, 2023

Twilio, used by hundreds of thousands of businesses and more than ten million developers worldwide, can now integrate with our live transcription API. The integration makes it easier for users to natively transcribe any phone call in real time while using Twilio. With transcribed text at your disposal, you'll then be able to analyze, archive, and act upon voice data more effectively.

Below, you’ll find a step-by-step guide to setting up the Twilio integration with the Gladia API in JavaScript for free.

What can you do with Twilio integration?

Any developer can use this integration to transcribe phone calls in real time.

How to implement Twilio + Gladia real-time transcription integration

Step 1: Set up your Gladia account

If you haven't already, sign up for our Speech-to-Text API at app.gladia.io and obtain your API key.

Step 2: Create and parametrize your Twilio account

  • Create an account at https://www.twilio.com/try-twilio
  • Get a phone number by following the first step shown on the main page once you are logged in to your Twilio account.
  • In the left panel, go to Develop > United States (US1) > Phone Numbers > Manage > Active numbers.
  • Click the phone number you just created.
  • In the 'Configure' panel, under 'Voice Configuration', set the 'A call comes in' field to 'Webhook' with URL = 'http://[your-ip-address]:[your-app-port-number]' and HTTP = 'HTTP POST'
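To illustrate what the webhook configured above is expected to do, here is a minimal sketch of a server that answers Twilio's HTTP POST with TwiML asking Twilio to stream the call audio over a WebSocket. It is not the code from the source repository: the function names (`escapeXml`, `buildTwiml`, `createWebhookServer`) and the example stream URL are assumptions made for illustration. Twilio's `<Stream>` element expects a `wss://` URL, so a TLS-terminating tunnel or proxy in front of your server is also assumed.

```javascript
const http = require("http");

// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a TwiML document asking Twilio to stream the call audio to streamUrl.
function buildTwiml(streamUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

// Hypothetical webhook server: answers Twilio's HTTP POST with the TwiML above.
function createWebhookServer(streamUrl) {
  return http.createServer((req, res) => {
    if (req.method !== "POST") {
      res.writeHead(405, { Allow: "POST" });
      res.end();
      return;
    }
    req.resume(); // discard the form-encoded call details in this sketch
    res.writeHead(200, { "Content-Type": "text/xml" });
    res.end(buildTwiml(streamUrl));
  });
}

// Usage (not started automatically; the URL is a placeholder):
// createWebhookServer("wss://example.com/media").listen(8080);
```

`<Connect><Stream>` keeps the call open for as long as the WebSocket stays connected, which suits a transcription-only flow. The repository may use a different TwiML verb, so treat this as one possible shape rather than the reference implementation.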

Step 3: Configure your server and install dependencies

  • In the .env file, add a GLADIA_API_KEY variable set to the API key you obtained from Gladia’s website, and a PORT variable set to the port you used when configuring your phone number in the section above (default is 8080)
  • Install dependencies:

npm i
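As a rough illustration of how a server might consume the two variables from Step 3, the sketch below validates them and applies the default port of 8080 mentioned above. The `loadConfig` helper and its return shape are hypothetical, not taken from the source repository. It also assumes something has already loaded the .env file into `process.env` (for instance the dotenv package, or Node's `--env-file` flag on recent versions).

```javascript
// Read the two settings named in Step 3 from an environment-like object.
// loadConfig is a hypothetical helper used here for illustration only.
function loadConfig(env) {
  const gladiaApiKey = env.GLADIA_API_KEY;
  if (!gladiaApiKey) {
    throw new Error("GLADIA_API_KEY is missing; add it to your .env file");
  }

  // Fall back to 8080, the default port mentioned in the guide.
  const port = Number.parseInt(env.PORT ?? "8080", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got: ${env.PORT}`);
  }

  return { gladiaApiKey, port };
}

// Usage:
// const config = loadConfig(process.env);
```

Failing fast on a missing key or an invalid port makes a misconfigured .env file show up at startup rather than on the first incoming call.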

Step 4: Make it work

  • Launch the websocket server:

npm run start

Voila! The transcription should appear in the server logs now.
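For readers curious about what the websocket server receives from Twilio during a call, here is a minimal sketch of a message handler. Twilio Media Streams sends JSON text messages whose `event` field is `connected`, `start`, `media`, or `stop`, and each `media` message carries base64-encoded audio in `media.payload`. The `handleTwilioMessage` function and the action objects it returns are assumptions for illustration. Forwarding the audio to Gladia's live transcription API is deliberately left out, since the message format for that step is not described in this guide.

```javascript
// Turn one raw Twilio Media Streams message into a simple action object.
// handleTwilioMessage and its return shapes are hypothetical.
function handleTwilioMessage(raw) {
  const msg = JSON.parse(raw);

  switch (msg.event) {
    case "start":
      // Sent once per stream; describes the stream and its audio format.
      return {
        type: "start",
        streamSid: msg.start.streamSid,
        format: msg.start.mediaFormat,
      };
    case "media":
      // payload is base64-encoded audio; decode it to raw bytes.
      return { type: "audio", audio: Buffer.from(msg.media.payload, "base64") };
    case "stop":
      // The call ended or the stream was stopped.
      return { type: "stop" };
    default:
      // "connected" and any other events are ignored in this sketch.
      return { type: "ignored", event: msg.event };
  }
}

// Usage inside a WebSocket "message" callback (socket setup not shown):
// const action = handleTwilioMessage(data.toString());
// if (action.type === "audio") { /* forward action.audio for transcription */ }
```

Keeping the parsing in a pure function makes it straightforward to unit test with recorded messages before wiring it to a live call.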

🔗 Source GitHub repository is available here.

Feel free to check out the video version of the tutorial for a step-by-step walkthrough with one of our software engineers, Antoine.

We hope you enjoyed this how-to tutorial! Given how much audio data still goes to waste, we’re always curious to explore the many ways in which transcription tech can be used to remedy that. Let us know if you end up using our API with Twilio, Discord, or another platform; we’d love to hear from you.

About Gladia

At Gladia, we built an optimized version of Whisper in the form of an API, adapted to real-life professional use cases and distinguished by exceptional accuracy, speed, extended multilingual capabilities and state-of-the-art features, including speaker diarization and word-level timestamps.
