Cerebras Inference
Model Inference · Freemium · Verified
Inference API built on Cerebras Wafer-Scale Engine hardware, delivering extremely high token throughput: it achieves 2000+ tokens/sec on Llama models, making it one of the fastest inference providers available.
Price
From $0 / 1K tokens
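Cerebras Inference exposes an OpenAI-compatible chat-completions API. The sketch below shows one way to call it from the Python standard library; the base URL, model name (`llama3.1-8b`), and the `CEREBRAS_API_KEY` environment variable are assumptions for illustration, so check the official docs for the exact values.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against the Cerebras docs.
BASE_URL = "https://api.cerebras.ai/v1"


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, model: str = "llama3.1-8b") -> str:
    """Send a single-turn chat request and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # API key read from the environment; never hard-code secrets.
            "Authorization": f"Bearer {os.environ['CEREBRAS_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

At 2000+ tokens/sec, a 500-token completion returns in roughly a quarter of a second, which is what makes this provider attractive for latency-sensitive applications.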