Speech & Audio

Text-to-speech, speech-to-text, and audio processing APIs

14 tools

AssemblyAI

Speech-to-text and audio intelligence API with transcription, summarization, sentiment analysis, and topic detection.

Azure AI Speech

Cloud-based speech AI service providing speech-to-text, text-to-speech, speech translation, and speaker recognition APIs...

Cartesia

Freemium

Real-time text-to-speech API with ultra-low latency voice generation, voice cloning, and streaming audio for voice agent...

Deepgram

Freemium

Real-time and batch speech-to-text API with state-of-the-art ASR models, speaker diarization, and voice AI features.

ElevenLabs

Freemium

AI-powered speech platform offering text-to-speech, speech-to-text, voice cloning, and conversational AI agents. Differe...

Hume AI

Freemium

Emotionally intelligent voice AI platform offering text-to-speech, speech-to-speech, expression measurement, and human e...

LMNT

Freemium

Ultra-low latency text-to-speech API optimized for real-time voice applications and conversational AI agents.

OpenAI TTS

Paid

Text-to-speech API with 6 natural voices and an optional HD mode, well suited to audiobook and accessibility use cases.

OpenAI Whisper API

Paid

OpenAI's managed speech-to-text API powered by the Whisper model. Transcribes and translates audio in 99+ languages with...

PlayHT

Freemium

Text-to-speech and voice cloning API with 900+ AI voices across 142 languages and real-time streaming capability.

Resemble AI

Freemium

Voice cloning and AI speech generation platform with real-time voice synthesis, neural TTS, and voice watermarking.

Rev AI

Freemium

Enterprise-grade speech recognition API offering async and streaming transcription with high accuracy across diverse aud...

Suno

Freemium

AI music generation from text prompts. Create full songs with vocals, instruments, and lyrics.

Whisper

Open Source

Open-source automatic speech recognition model by OpenAI trained on 680k hours of multilingual data, available for self-...
