Audio infrastructure to transform note-taking, customer support, sales assistance and user experience

Everything starts with reliable transcription. From async to live streaming, our API empowers your platform with accurate, multilingual speech-to-text and actionable insights.

Trusted by 600+ AI assistants and contact center platforms

Accelerate your roadmap with top-tier models for speech recognition and analysis

Transcribe calls in milliseconds

Gladia’s speech-to-text engine converts calls and meetings into text in real time or asynchronously, making it easy to integrate conversational features, advanced note-taking and search into your platform.

Key information and insights with no errors

Leveraging our audio intelligence add-ons, you can retrieve key information and insights in real time for meeting notes, CRM enrichment and other LLM-powered capabilities of your product. 100% accuracy where it matters the most.

Integrate advanced real-time guidance features

Our real-time transcription API is optimized to provide next-best-action recommendations to customer support and sales agents while on-call. Compatible with SIP and WebSockets.
COMING SOON

Build AI conversational call agents and bots

Gain access to the best ASR models and tools to create autonomous AI voice agents capable of understanding speech and handling more complex customer queries in real time.

High precision and instant results at no deployment cost

Latency

Less than 300 ms to transcribe a call or meeting in real time, with minimal additional latency to generate summaries and extract insights.

Accuracy

Speech recognition without errors and hallucinations for ultimate information fidelity, whatever the language or tech stack used.

Language support

Multilingual transcription and insights with enhanced support for accents, any-to-any translation and code-switching.

Quick integration

Our tools are compatible with WebSockets, VoIP, SIP, and all other standard telephony protocols and integrate seamlessly with any stack.

High security at scale

We guarantee 100% safety of all user data per EU and US regulations and compliance frameworks.

Optimized for enterprise use cases

Customer experience

Real-time AI to boost productivity of call agents worldwide

Sales enablement

AI transcription and insights to transform sales calls

Meeting assistants

Flawless transcription for advanced note-taking assistants

Content and media

Streamlined editing and subtitles with time-stamped transcripts

Trusted by leading platforms

Explore what other voice-first platforms have to say.

"There’s a lot more than one can get out of audio than just transcription, and Gladia understood that. Feature rollouts are proactive, and anticipate our needs as a platform. Their API performs very well with noisy telephony and stereo audio and does an excellent job with languages."
Alexandre Bouju
CTO Deputy Manager
"The quality of the output from our platform, everything that we do based on this transcription became better after we switched to Gladia."
Valentin van Gastel
VP of Product & Engineering
"We are 100% benchmark and evaluation driven. Gladia was one of the best providers selected on merit to transcribe user videos, especially for non-English languages. Their reactive customer support and data compliance make their offer really compelling."
Kojo Hinson
Group Engineering Manager
"We initially attempted to host Whisper AI, which required significant effort to scale. Switching to Gladia's transcription service brought a welcome change."
Robin Lambert
CPO
"Gladia has a clear-cut advantage when it comes to European languages. With their API, we acquired new users in countries like Finland and Sweden, who say it's the best transcription they've ever tried."
Lazare Rossillon
CEO
"Having tried numerous speech-to-text solutions, I can confidently say: Gladia's API outshines the rest. Their balance of accuracy, speed, and precise word timings is unparalleled."
Jean Patry
Co-founder
"It's the first time we've been able to transcribe video with such accuracy and speed - including when the conversation is technical. Whatever the language or accent, the quality is always there."
Robin Bonduelle
CEO

Built for developers

Add cutting-edge AI to your product in 3 clicks. Our API is compatible with all tech stacks and doesn’t require any AI expertise or setup costs.

// TypeScript example: submit an audio URL, then poll until the transcript is ready.
async function makeFetchRequest(url: string, options: RequestInit): Promise<any> {
  const response = await fetch(url, options);
  if (!response.ok) {
    // Surface HTTP errors instead of parsing an error body as a result.
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

async function pollForResult(resultUrl: string, headers: Record<string, string>) {
  while (true) {
    console.log("Polling for results...");
    const pollResponse = await makeFetchRequest(resultUrl, { headers });

    if (pollResponse.status === "done") {
      console.log("- Transcription done: \n ");
      console.log(pollResponse.result.transcription.full_transcript);
      break;
    }
    if (pollResponse.status === "error") {
      // Assumes failed jobs report status "error"; stop instead of polling forever.
      throw new Error("Transcription failed");
    }
    console.log("Transcription status : ", pollResponse.status);
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}

async function startTranscription() {
  const gladiaKey = "YOUR_GLADIA_API_TOKEN";
  const requestData = {
    audio_url: "YOUR_AUDIO_URL",
  };
  const gladiaUrl = "https://api.gladia.io/v2/transcription/";
  const headers = {
    "x-gladia-key": gladiaKey,
    "Content-Type": "application/json",
  };

  console.log("- Sending initial request to Gladia API...");
  const initialResponse = await makeFetchRequest(gladiaUrl, {
    method: "POST",
    headers,
    body: JSON.stringify(requestData),
  });

  console.log("Initial response with Transcription ID :", initialResponse);

  if (initialResponse.result_url) {
    await pollForResult(initialResponse.result_url, headers);
  }
}

// Report failures instead of leaving an unhandled promise rejection.
startTranscription().catch(console.error);

# Python example: submit an audio URL, then poll until the transcript is ready.
import requests
import time

def make_fetch_request(url, headers, method='GET', data=None):
    """Send a GET or POST request and return the parsed JSON body."""
    if method == 'POST':
        response = requests.post(url, headers=headers, json=data, timeout=30)
    else:
        response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()

gladia_key = "YOUR_GLADIA_API_TOKEN"
request_data = {"audio_url": "YOUR_AUDIO_URL"}
gladia_url = "https://api.gladia.io/v2/transcription/"

headers = {
    "x-gladia-key": gladia_key,
    "Content-Type": "application/json"
}

print("- Sending initial request to Gladia API...")
initial_response = make_fetch_request(gladia_url, headers, 'POST', request_data)

print("Initial response with Transcription ID:", initial_response)
result_url = initial_response.get("result_url")

if result_url:
    while True:
        print("Polling for results...")
        poll_response = make_fetch_request(result_url, headers)
        status = poll_response.get("status")

        if status == "done":
            print("- Transcription done: \n")
            print(poll_response.get("result", {}).get("transcription", {}).get("full_transcript"))
            break
        if status == "error":
            # Assumes failed jobs report status "error"; stop instead of polling forever.
            raise RuntimeError(f"Transcription failed: {poll_response}")
        print("Transcription status:", status)
        time.sleep(1)
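
The examples above poll at a fixed one-second interval with no overall deadline. For longer files, a gentler pattern is to back off between polls and give up after a timeout. The sketch below is one way to do that. The helper name `poll_until_done`, its parameters, and the `"error"` status value are assumptions made for illustration, not part of Gladia's documented API; only the `"done"` status comes from the example on this page.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status() until the job reports "done", doubling the wait each time."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        response = fetch_status()
        status = response.get("status")
        if status == "done":
            return response
        if status == "error":  # assumed failure status; adjust to the real API
            raise RuntimeError(f"Transcription failed: {response}")
        if clock() + delay > deadline:
            # The next wait would overshoot the deadline, so stop here.
            raise TimeoutError(
                f"No result after {timeout_s} seconds (last status: {status})"
            )
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff with a cap
```

With the Python example above, this could be called as `poll_until_done(lambda: make_fetch_request(result_url, headers))`. Passing `sleep` and `clock` as parameters keeps the function easy to test without real waiting.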
Lower AI infrastructure costs. We leverage proprietary know-how to fit more AI on less hardware, without compromising on quality and performance.
Technical edge. With Gladia, you get access to optimized versions of the most sophisticated ASR models and regular software upgrades at no extra cost.
Reduced time-to-market. By embedding advanced AI into your applications directly, your users can derive full value from your product from day one.
Easy to scale. Increase your processing capacity easily with our pay-as-you-go system. Our enterprise-grade API is built to adapt to your ever-growing needs.

All your questions. Answered.

What are the key features of Gladia’s audio transcription API?
Gladia supports 100+ languages across both highly accurate asynchronous and real-time transcription, with real-time latency under 300 milliseconds. On top of that, it offers a layer of add-ons, ranging from custom vocabulary, diarization and sentiment analysis to named entity recognition, word-level timestamps, summarization and more.
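
Word-level timestamps are what make subtitle generation possible. As a rough illustration, the sketch below groups timed words into SRT cues. The input shape (a list of dicts with `word`, `start` and `end` in seconds) and the function names are assumptions chosen for this example, not the documented response format; map the fields from the actual API response before using it.

```python
from typing import Dict, List


def format_srt_time(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_words: int = 7, max_duration_s: float = 4.0) -> str:
    """Group timed words into cues and return them as SRT text."""
    cues: List[List[Dict]] = []
    current: List[Dict] = []
    for word in words:
        too_many = len(current) >= max_words
        too_long = bool(current) and word["end"] - current[0]["start"] > max_duration_s
        if current and (too_many or too_long):
            cues.append(current)  # close the cue before it grows too large
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_srt_time(cue[0]["start"])
        end = format_srt_time(cue[-1]["end"])
        text = " ".join(w["word"].strip() for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}")
    return "\n\n".join(blocks) + ("\n" if blocks else "")


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        {"word": "Hello", "start": 0.0, "end": 0.4},
        {"word": "world", "start": 0.5, "end": 0.9},
        {"word": "again", "start": 5.0, "end": 5.5},
    ]
    print(words_to_srt(sample))
```

The limits of seven words and four seconds per cue are arbitrary defaults; subtitle guidelines vary, so tune them to your player and audience.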
What languages does Gladia’s speech-to-text API support?
Gladia’s Speech-to-Text API supports 100+ languages and accents: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Castilian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, Flemish, French, Galician, Georgian, German, Greek, Gujarati, Haitian, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Letzeburgesch, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Moldovan, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Panjabi, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Sinhalese, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Valencian, Vietnamese, Welsh, Yiddish, Yoruba.
How can I get started with implementing Gladia’s API in my product?
Gladia’s API is extremely easy to implement. To get started, sign up at app.gladia.io. You can either try our product in the playground environment or click ‘Home’ and then ‘Generate new API key’ straight away. You can find all the information you need in our developer documentation.
How does Gladia’s Speech-to-Text API work?
Gladia’s audio transcription API, also called a Speech-to-Text API, allows developers and product owners to add both asynchronous and real-time transcription, as well as a selection of audio intelligence add-ons, to their products by calling a single API for every audio transcription need. The API is compatible with all existing tech stacks and with telephony protocols and platforms, including SIP, VoIP, FreeSWITCH and Asterisk. You can find all the information you need in our developer documentation. Gladia’s pricing has three tiers: free access, Pay-as-you-Go, and Enterprise. You can find more information on the Pricing page.
Do you offer support for multiple programming languages?
Absolutely! Our API is designed to be language-agnostic, meaning you can use it with any programming language that can make HTTP requests. We provide code examples in multiple languages to assist developers in integrating our speech-to-text API.
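
Because the API is plain HTTP, no SDK is strictly needed. As a minimal sketch, the request from the example on this page can be built with nothing but Python's standard library. The endpoint and the `x-gladia-key` header are taken from that example; the function names `build_transcription_request` and `submit` are hypothetical helpers introduced here for illustration.

```python
import json
import urllib.request

# Endpoint as shown in the code example on this page.
GLADIA_URL = "https://api.gladia.io/v2/transcription/"


def build_transcription_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a transcription."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        GLADIA_URL,
        data=body,
        method="POST",
        headers={"x-gladia-key": api_key, "Content-Type": "application/json"},
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The same three ingredients (URL, key header, JSON body) carry over to any language with an HTTP client, which is all that "language-agnostic" means in practice.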
What audio formats does Gladia support?
Gladia’s audio transcription API supports a wide range of audio formats and codecs, from WAV and m4a to flac and aac. The full list is available in our documentation under "Supported files & duration," but make sure to reach out to our team if you encounter any issues with your specific file format.
What type of companies use Gladia’s audio transcription API?
Any company that manages or produces audio or video data can benefit from Gladia’s Speech-to-Text technology. Among others, we work with the following types of companies. Virtual meeting providers, note-takers and collaboration platforms use audio transcription to help their customers store and exploit vast amounts of meeting data, giving them access to a previously untapped source of internal knowledge. Contact centers, technology providers, and sales enablement and CRM enrichment platforms improve their performance with real-time transcription, detailed analytics and insights; so do AI voice companies that use STT and TTS APIs in their services and sell to businesses that require enhanced communication capabilities. Audio, video, and media production companies, such as streaming platforms, screencast or podcast production software, media platforms and forums, and audio and video recording or sharing products, use audio and video transcription both to make their content much faster to catalog, access and search, and to generate captions and subtitles. Specialized companies in industries such as medicine, law and finance find great value in speech-to-text technology that is fine-tuned to their specific language.
Is Gladia secure?
At Gladia, we are used to working with organizations with highly sensitive data and extremely tight security requirements. By default, we deliver our audio transcription services in a cloud-hosted environment which can be customized to your geographical footprint. We are able to deliver on-premises hosting, as well as air-gapped hosting, depending on your security requirements. As Gladia already operates in Europe with organizations that require airtight data privacy compliance, Gladia is able to offer GDPR-compliant audio transcription.