Blog

Tutorials

Transcribing long audios with Whisper using Python and Gladia API

The Whisper ASR model released by OpenAI is great for transcribing audio files, but it doesn’t come without challenges. In addition to high computational requirements and costs, Whisper imposes limits on input audio: the hosted API caps uploads at 25 MB, and the model processes audio in 30-second windows, so larger files usually have to be split into chunks before transcription.

Speech-To-Text

Maximizing CRM enrichment with AI audio transcription

In today's fast-paced commercial environment, Customer Relationship Management (CRM) systems like Salesforce and HubSpot have become the backbone of successful customer success and sales strategies. Yet keeping CRMs up to date and in sync with the vast volumes of customer information generated daily remains a challenge.

Product News

Introducing Whisper-Zero

Today, we're thrilled to release a new breakthrough ASR system, Whisper-Zero: a complete rework of Whisper combined with multiple state-of-the-art models, using over 1.5 million hours of diverse audio, including phone-quality and noisy data from real-life environments.

Tutorials

How to build a Google Meet Bot for recording and video transcription

Tools like Google Meet have revolutionized how we connect and conduct meetings remotely. However, it can be very challenging to keep track of all action items and key insights shared during long meetings.

Speech-To-Text

An introduction to ASR speaker recognition: identification, verification and diarization

Because of individual differences in physical attributes such as vocal tract shape, every person has a distinct voice pattern. In automatic speech recognition (ASR), this uniqueness is harnessed to identify and analyze speakers by extracting voice features such as pitch and frequency.

Tutorials

Building a Whisper YouTube transcription generator for automated captioning

With over 500 hours of video uploaded to YouTube every minute, providing accurate captions and transcripts is essential for creators to make their content engaging and accessible. However, manually transcribing long videos is tedious and time-consuming.