Model family
GPT-OSS on FlexAI. Every variant. One key.
GPT-OSS is OpenAI's open model line on FlexAI. Two variants run serverless on the OpenAI-compatible API, both for chat. One API key serves every variant.
Variants
Every served variant in the family, with live serverless pricing.
Serverless · pay per token
| Model | Context | Price | Status |
|---|---|---|---|
| GPT-OSS 120B | 128K | $0.035 / $0.09 per M tokens | Serving |
| GPT-OSS 20B | 128K | $0.027 / $0.117 per M tokens | Serving |
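To make the per-token pricing concrete, here is a rough cost estimate for a single workload. It is a sketch under one loud assumption: the table does not label the two prices, so the first is treated as the price per million input tokens and the second as the price per million output tokens. The function name `estimate_cost` and the token counts are illustrative only.

```shell
#!/bin/sh
# Estimate the dollar cost of a workload from its token counts.
# ASSUMPTION: the first table price is per million input tokens and the
# second is per million output tokens; the page does not label them.
estimate_cost() {
  # $1 = variant (120b or 20b), $2 = input tokens, $3 = output tokens
  case "$1" in
    120b) in_price=0.035; out_price=0.09 ;;
    20b)  in_price=0.027; out_price=0.117 ;;
    *)    echo "unknown variant: $1" >&2; return 1 ;;
  esac
  # awk handles the floating-point arithmetic that plain sh lacks.
  awk -v i="$2" -v o="$3" -v pin="$in_price" -v pout="$out_price" \
    'BEGIN { printf "%.6f\n", (i * pin + o * pout) / 1000000 }'
}

# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
estimate_cost 120b 200000 50000
estimate_cost 20b 200000 50000
```

If the input/output reading is right, the cheaper variant depends on the mix of input and output tokens, so it is worth running the numbers for your own traffic.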
Which variant for what
Pick by the role you're filling. Same key for all of them.
Fast & economical
GPT-OSS 20B
GPT-OSS 20B is the smallest serverless variant. Lowest latency and cost.
GPT-OSS 20B handles review and fast edits in coding agents · GPT-OSS 20B handles the respond step in support agents · GPT-OSS 120B handles the plan step in workflow automation
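The role-to-variant pairing above could be sketched as a small lookup that picks a model id and builds the request body. Treat the details as assumptions: `gpt-oss-20b` is a guessed id patterned on `gpt-oss-120b` (the only id this page shows), and the role labels and the helper names `model_for_role` and `build_payload` are hypothetical.

```shell
#!/bin/sh
# Map a pipeline role to a model id, then build a chat request body.
# ASSUMPTION: "gpt-oss-20b" is a guessed id patterned on "gpt-oss-120b";
# the role names (review, respond, plan) are illustrative labels.
model_for_role() {
  case "$1" in
    review|respond) echo "gpt-oss-20b" ;;
    plan)           echo "gpt-oss-120b" ;;
    *)              echo "unknown role: $1" >&2; return 1 ;;
  esac
}

build_payload() {
  # $1 = role, $2 = user message. This sketch does no JSON escaping,
  # so the message must not contain double quotes or backslashes.
  model=$(model_for_role "$1") || return 1
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' \
    "$model" "$2"
}

build_payload plan "Draft a rollout plan"
```

The output can be piped into the chat completions call with `curl ... -d @-`, which reads the request body from standard input. Because the key is the same for every variant, only the `model` field changes between roles.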
Call the flagship
OpenAI-compatible. Swap the model id for any variant above.
curl https://tokens.flex.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLEXAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Hello from FlexAI"}]
  }'
Where it runs
Use cases that put GPT-OSS to work in a pipeline.
Run GPT-OSS on one API key
Every GPT-OSS variant, serverless and dedicated, behind one OpenAI-compatible key.
$10/month in free credits for your first 3 months