Capability
Speech-to-text, in one request.
Transcribe audio with open speech models on the OpenAI-compatible /v1/audio/transcriptions endpoint. Drop it into voice agents, meeting tools, and call analytics. Usage is billed per minute.
Models you can call
Served on FlexAI today, on the OpenAI-compatible API.
| Model | Context | Price | Status |
|---|---|---|---|
| Whisper Large V3 Turbo | — | $0.0006 / min | Serving |
| NVIDIA Parakeet TDT 0.6B v3 | — | $0.0014 / min | Serving |
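To put the per-minute prices in perspective, here is a rough cost estimate in Python. It is a sketch only: the prices are copied from the table above, `estimate_cost` is a hypothetical helper, and it assumes strictly linear per-minute billing, since this page does not state any rounding rules or minimum charges.

```python
# Per-minute list prices in USD, copied from the table above.
PRICE_PER_MIN = {
    "Whisper Large V3 Turbo": 0.0006,
    "NVIDIA Parakeet TDT 0.6B v3": 0.0014,
}


def estimate_cost(model: str, minutes: float) -> float:
    """Estimate spend in USD.

    Assumption: billing is strictly linear in minutes. Rounding and
    minimum charges are not documented on this page and are ignored.
    """
    return PRICE_PER_MIN[model] * minutes


# Print the estimated cost of 1,000 minutes for each listed model.
for name in PRICE_PER_MIN:
    print(f"{name}: ${estimate_cost(name, 1000):.2f} per 1,000 minutes")
```

Under that assumption, 1,000 minutes of audio comes to about $0.60 on Whisper Large V3 Turbo and about $1.40 on Parakeet.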
Call it
OpenAI-compatible /v1/audio/transcriptions. Same key as every other model.
curl https://tokens.flex.ai/v1/audio/transcriptions \
-H "Authorization: Bearer $FLEXAI_API_KEY" \
-F model="whisper-large-v3-turbo" \
-F file=@audio.mp3
In the pipeline
Use cases that put this capability to work.
Start serverless and pay per use. When volume proves out, move to a dedicated endpoint on the same key.
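Inside a pipeline, the curl request shown earlier would usually come from application code. The following is a minimal Python sketch using only the standard library. The URL, model name, form fields, and `FLEXAI_API_KEY` variable mirror the curl example; the helper names `build_transcription_request` and `transcribe` are hypothetical. The response shape is not documented on this page, so the sketch returns the raw body rather than assuming a field layout.

```python
import os
import uuid
import urllib.request

# Endpoint and model name taken from the curl example on this page.
API_URL = "https://tokens.flex.ai/v1/audio/transcriptions"
MODEL = "whisper-large-v3-turbo"


def build_transcription_request(audio_bytes, filename, api_key, model=MODEL):
    """Build (but do not send) a multipart/form-data POST, like `curl -F`."""
    boundary = uuid.uuid4().hex
    # First part: the `model` form field. Second part: the audio file header.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    # Closing boundary marks the end of the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + audio_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(path):
    """Send one audio file and return the raw response body as text.

    Assumption: the key is in the FLEXAI_API_KEY environment variable.
    """
    with open(path, "rb") as f:
        audio = f.read()
    req = build_transcription_request(
        audio, os.path.basename(path), os.environ["FLEXAI_API_KEY"]
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Calling `transcribe("audio.mp3")` sends the same request as the curl command. Building the request separately from sending it lets a pipeline inspect or unit-test the request without touching the network.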
Every modality, one API key
Text, vision, image, audio, and embeddings behind one OpenAI-compatible key.
$10/month in free credits for your first 3 months