How Aircall cut transcription time by 95% with Gladia
Published on Oct 9, 2025
The contact center is transforming. Once defined by manual workflows, siloed data, and reactive customer service, it is giving way to Contact Center as a Service (CCaaS) platforms driven by real-time AI and automation.
Transcription lies at the core of this transformation. Converting voice to text with speed and precision unlocks a cascade of next-gen capabilities: automated summaries, sentiment detection, agent coaching, CRM enrichment, and more. But many legacy or in-house solutions fall short: they are too slow, too inaccurate, or too resource-heavy to scale.
Aircall, the leading AI-powered voice platform for growing businesses, recognized this inflection point early. To meet the growing demand for fast, intelligent insights from customer conversations, Aircall turned to Gladia’s speech-to-text API.
Here’s how Aircall reduced transcription time by 95%, empowered its users with near-instant insights, and laid the groundwork for a smarter, AI-driven CCaaS future.
About Aircall
Aircall is an integrated customer communications and intelligence platform. It unifies voice and digital channels into one seamless platform, offering one-click integrations with leading CRMs and over 250 business tools. With a strong focus on cloud-based voice solutions, Aircall helps teams streamline conversations, improve customer support, and drive sales efficiency.
Farid Issabhai, Staff Engineer at Aircall, is at the forefront of Aircall’s AI and transcription initiatives. He played a key role in integrating cutting-edge technologies, including Gladia’s speech-to-text API, into Aircall’s workflows.
Challenge: More accurate, faster, and more scalable transcription for global telephony
As a leading voice platform, Aircall processes thousands of calls every day across diverse languages and use cases, from customer support to sales interactions. Initially, Aircall developed an in-house transcription engine, but maintaining and improving it proved challenging.
Solution: Gladia’s speech-to-text API
After evaluating different STT API vendors, Aircall chose Gladia for its strong performance in transcription accuracy, especially for key strategic languages.
Gladia’s API allowed Aircall to:
✓ Transcribe calls across multiple languages, such as Spanish, German, and Italian.
✓ Process over 1M transcriptions per week.
✓ Deliver transcripts significantly faster than its previous solution.
How Aircall uses transcription
Aircall uses Gladia’s transcriptions as the foundation for advanced features in its CCaaS platform:
Searchability: Users can search for keywords across calls
AI-generated insights: Summaries, key topics, and sentiment analysis are built on top of the transcripts
Agent coaching: Aircall’s coaching features assess calls for compliance and training, evaluating factors like greetings or responses to objections
CRM Integration: While transcriptions aren’t logged directly into CRMs like HubSpot or Salesforce, summaries and AI insights are pushed via webhooks
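The CRM integration point above can be sketched as follows. This is a minimal illustration of the idea, not Aircall's or Gladia's actual implementation: the `CallInsights` class, its field names, the `build_crm_webhook_payload` helper, and the sample values are all hypothetical. It only shows how derived insights could be serialized for a webhook while the raw transcript is left out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CallInsights:
    # Hypothetical record; every field name here is an assumption for illustration.
    call_id: str
    transcript: str
    summary: str
    key_topics: list[str]
    sentiment: str


def build_crm_webhook_payload(insights: CallInsights) -> str:
    """Serialize derived insights as JSON, leaving out the raw transcript."""
    payload = asdict(insights)
    payload.pop("transcript")  # the full transcript is not sent to the CRM
    return json.dumps(payload)


# Made-up sample data, used only to exercise the helper.
example = CallInsights(
    call_id="call-123",
    transcript="Agent: Hello, thanks for calling...",
    summary="Customer asked about upgrading their plan.",
    key_topics=["pricing", "upgrade"],
    sentiment="positive",
)
print(build_crm_webhook_payload(example))
```

Dropping the transcript before serialization keeps the CRM record small and limits where full conversation text is stored, which matches the split described in the list.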
Farid explains,
Why Aircall chose Gladia
Aircall’s decision to partner with Gladia was driven by:
Accuracy: High performance across key languages benchmarked on internal datasets composed of phone call audio
Speed: Drastically reduced transcription delays
Developer Experience: A well-designed API that simplified integration
Cost-Effectiveness: A solution that balances performance with the economics of scaling
Results: Faster insights, smoother operations
Since switching to Gladia:
Transcription times have dropped from up to 30 minutes to under 1.5 minutes
Aircall processes around 1M calls weekly, enabling scalable AI features
User satisfaction has improved thanks to faster insights
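The 95% figure in the title follows from the transcription times listed above. As a quick check, assuming the upper bounds quoted (30 minutes before, 1.5 minutes after) are the values being compared, the relative reduction can be computed like this (the helper name `percent_reduction` is ours, for illustration only):

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Return the relative reduction from before to after, as a percentage."""
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return (before_minutes - after_minutes) / before_minutes * 100


# Upper-bound figures quoted in the results list.
reduction = percent_reduction(before_minutes=30, after_minutes=1.5)
print(f"{reduction:.0f}% reduction")
```

Because both figures are bounds ("up to" and "under"), the result is best read as an approximate headline number rather than a measured average.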
Farid highlights,
Looking ahead
Aircall is exploring new frontiers with real-time transcription and AI voice agents. While asynchronous transcription currently meets most needs, the team is actively experimenting with new features like real-time assistance during sales calls, where AI can suggest responses based on conversation context.
Farid shares,
Final thoughts
Farid’s advice for companies looking to integrate speech-to-text AI:
About Gladia
Gladia provides a speech-to-text and audio intelligence API for building virtual meeting and note-taking apps, call center platforms, and media products. It offers transcription, translation, and insights powered by best-in-class ASR, LLMs, and GenAI models.