Gladia

Speech-To-Text

Fine-tuning ASR models: Key definitions, mechanics, and use cases

Many modern AI models are built for general-purpose applications and require fine-tuning for domain-specific tasks. The fine-tuning process involves taking an existing model and training it further on domain-specific data. The additional training allows the model to understand the new data and improve its performance in a particular field.

Tutorials

Building a song transcription system with profanity filter using Whisper, GPT 3.5 and Spleeter

Music streaming first gained popularity in 1999 with the founding of Napster, one of the pioneering streaming platforms. Millions of songs became available to listen to and download for free over the internet. Listeners no longer needed to buy pre-recorded tapes, go to live shows, or tune into radio stations to hear music.

Case Studies

AI-powered healthcare assistant enhances medical transcription by 120% with Gladia

Medical transcription is among the most critical and challenging verticals for ASR models to date.

Product News

What is summarization?

Summarization in speech-to-text (STT) AI is a popular feature that streamlines the extraction of essential information from spoken content. By condensing lengthy audio recordings or live conversations into concise summaries, STT summarization improves the user experience and helps end users understand content and make decisions faster.

Case Studies

Opening up new markets for a sales meeting and CRM enrichment platform: Spoke's success story with Gladia

In the past, sales teams around the world were presented with a twofold challenge. In addition to showcasing their products in the best light to prospects, they needed to take detailed notes during the call and fill their CRM software manually afterward.

Product News

A new open-source developer app for AI translation, dubbing and lip synching to try

Text-to-speech, voice cloning, and visual dubbing are some of the hottest trends in AI at the moment. Used in tandem with AI transcription and translation, they make it possible to generate hyper-realistic voiceovers, indistinguishable from the sound of the speaker’s natural voice and speech patterns — including in entirely new languages.
