Model family
DeepSeek on FlexAI. Every variant. One key.
DeepSeek is DeepSeek's open model line, available on FlexAI. Two variants run serverless on the OpenAI-compatible API, and seven more are available as dedicated endpoints; all are chat models. One API key serves every variant.
Variants
Every served variant in the family, with live serverless pricing.
Serverless · pay per token
| Model | Context | Price | Status |
|---|---|---|---|
| DeepSeek V3.2 | 160K | $0.225 / $0.225 per M | Serving |
| DeepSeek V4 Flash | 1.0M | $0.082 / $0.164 per M | Serving |
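To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a workload from the table above. It assumes the two figures in the Price column are the input and output prices per million tokens, in that order; the table does not say so explicitly. The function name `estimate_cost` is illustrative only.

```python
# Prices copied from the serverless table, in USD per million tokens.
# ASSUMPTION: the first figure is the input price, the second the output price.
PRICES_PER_M = {
    "DeepSeek V3.2": (0.225, 0.225),
    "DeepSeek V4 Flash": (0.082, 0.164),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload on a serverless variant."""
    input_price, output_price = PRICES_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for name in PRICES_PER_M:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.4f}")
```

Under that assumption, a workload of 2M input tokens and 500K output tokens comes to $0.5625 on DeepSeek V3.2 and $0.2460 on DeepSeek V4 Flash.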
Dedicated endpoints · reserved GPUs
| Model | Context | Price | Status |
|---|---|---|---|
| DeepSeek V4 Pro | 1.0M | Dedicated | Dedicated |
| DeepSeek R1 | 160K | Dedicated | Dedicated |
| DeepSeek V3 0324 | 160K | Dedicated | Dedicated |
| DeepSeek R1 0528 | 160K | Dedicated | Dedicated |
| DeepSeek V3 | 32K | Dedicated | Dedicated |
| DeepSeek R1 Distill Qwen 32B | 32K | Dedicated | Dedicated |
| DeepSeek R1 Distill Qwen 1.5B | 32K | Dedicated | Dedicated |
Which variant for what
Pick by the role you're filling. Same key for all of them.
Fast & economical
DeepSeek V4 Flash
DeepSeek V4 Flash is the smallest serverless variant, with the lowest latency and cost of the two.
DeepSeek V3.2 runs the reason step in research agents · DeepSeek V4 Flash runs the summarize step in research agents
Call the flagship
OpenAI-compatible. Swap the model id for any variant above.
curl https://tokens.flex.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLEXAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepSeek-V3.2",
    "messages": [{"role": "user", "content": "Hello from FlexAI"}]
  }'
Where it runs
Use cases that put DeepSeek to work in a pipeline.
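As an illustration of a pipeline, the sketch below chains a reason step and a summarize step against the same endpoint and key, using only the Python standard library. The URL and the `DeepSeek-V3.2` id come from the curl example; the `DeepSeek-V4-Flash` id is a guess that follows the same naming pattern and should be checked against the real model list. The helper names (`build_request`, `complete`) are hypothetical, and the response is assumed to follow the OpenAI chat completions shape.

```python
import json
import os
import urllib.request

API_URL = "https://tokens.flex.ai/v1/chat/completions"  # from the curl example

# "DeepSeek-V3.2" appears in the curl example.
# ASSUMPTION: "DeepSeek-V4-Flash" is a hypothetical id following the same pattern.
MODEL_FOR_ROLE = {
    "reason": "DeepSeek-V3.2",
    "summarize": "DeepSeek-V4-Flash",
}


def build_request(role: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for a pipeline role."""
    body = {
        "model": MODEL_FOR_ROLE[role],
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def complete(role: str, prompt: str, api_key: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(role, prompt, api_key)) as resp:
        payload = json.load(resp)
    # ASSUMPTION: the response matches the OpenAI chat completions shape.
    return payload["choices"][0]["message"]["content"]


# Only runs the network calls when a key is present in the environment.
if __name__ == "__main__" and os.environ.get("FLEXAI_API_KEY"):
    key = os.environ["FLEXAI_API_KEY"]
    analysis = complete("reason", "Compare two approaches to caching.", key)
    print(complete("summarize", f"Summarize in two sentences:\n{analysis}", key))
```

Keeping the role-to-model mapping in one dictionary means a step can move to a different variant by changing a single string, since the key and endpoint stay the same.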
Run DeepSeek on one API key
Every DeepSeek variant, serverless and dedicated, behind one OpenAI-compatible key.
$10/month in free credits for your first 3 months