Gladia

Tutorials

Enhancing real-time transcription with WebSockets and Golang

Communication has evolved from sending letters by post and waiting at phone booths to digital connections that happen simultaneously and at high speed, especially in business environments. Given the unprecedented volumes of voice data that companies generate daily, the ability to document customer calls, conferences, and online meetings both in real time and asynchronously is becoming crucial.

Tutorials

Transcribing long audios with Whisper using Python and Gladia API

The Whisper ASR model released by OpenAI is great for transcribing audio files, but it doesn’t come without challenges. In addition to high computational requirements and costs, Whisper limits input audio files to 25 MB in size and 30 seconds in duration, so larger audio files usually have to be split into chunks before they can be transcribed.

Speech-To-Text

Maximizing CRM enrichment with AI audio transcription

In today's fast-paced commercial environment, Customer Relationship Management (CRM) systems like Salesforce and HubSpot have become the backbone of effective customer success and sales strategies. Yet keeping CRMs up to date and in sync with the vast volumes of customer information generated daily remains a challenge.

Product News

Introducing Whisper-Zero

Today, we're thrilled to release a new breakthrough ASR system, Whisper-Zero: a complete rework of Whisper combined with multiple state-of-the-art models, using over 1.5 million hours of diverse audio, including phone-quality and noisy data from real-life environments.

Tutorials

How to build a Google Meet Bot for recording and video transcription

Tools like Google Meet have revolutionized how we connect and conduct meetings remotely. However, it can be very challenging to keep track of all action items and key insights shared during long meetings.

Speech-To-Text

An introduction to ASR speaker recognition: identification, verification and diarization

Due to individual differences in physical attributes such as vocal tract shape, every person has a distinct voice pattern. In automatic speech recognition (ASR), this uniqueness is harnessed to identify and analyze speakers by extracting voice features such as pitch and frequency.
