Groq
Model Inference · Freemium · Verified
Cloud inference API providing access to open-source LLMs, speech-to-text, and text-to-speech models via an OpenAI-compatible endpoint. Uses custom LPU (Language Processing Unit) silicon purpose-built for inference, delivering very fast token generation at low cost. Best for latency-sensitive applications that need fast responses from open-source models.
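Because the endpoint is OpenAI-compatible, existing OpenAI-style clients can target it by swapping the base URL. A minimal sketch using only the Python standard library is below; the base URL matches Groq's documented OpenAI-compatible path, while the model name and the `GROQ_API_KEY` environment variable are illustrative assumptions.

```python
import json
import os
import urllib.request

# Groq's OpenAI-compatible base URL.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request against Groq's endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{GROQ_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request needs network access and a valid key, so only
# do so when one is configured. The model name is an assumption.
if os.environ.get("GROQ_API_KEY"):
    req = build_chat_request(
        "llama-3.3-70b-versatile",
        "Say hello in one word.",
        os.environ["GROQ_API_KEY"],
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
        print(body["choices"][0]["message"]["content"])
```

Any SDK that accepts a custom base URL (such as the official `openai` Python client) works the same way: point it at `https://api.groq.com/openai/v1` and pass a Groq API key.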
Capabilities
Fine-tuning
Price
$0 – $3 per 1M tokens