    Model family

    GPT-OSS on FlexAI. Every variant. One key.

    GPT-OSS is OpenAI's open model line on FlexAI. Two variants run serverless on the OpenAI-compatible API, both for chat. One API key serves every variant.

    Variants

    Every served variant in the family, with live serverless pricing.

    Serverless · pay per token

    Model          Context   Price                     Status
    GPT-OSS 120B   128K      $0.035 / $0.09 per M      Serving
    GPT-OSS 20B    128K      $0.027 / $0.117 per M     Serving
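
To make the per-token pricing concrete, here is a rough cost estimate for a single request. It is a sketch under one assumption the table does not state: that the first listed price applies to input tokens and the second to output tokens, each per million tokens. The function name `estimate_cost` and the token counts are illustrative only.

```shell
# Estimate the USD cost of one request from token counts and per-million prices.
# ASSUMPTION: first listed price = input tokens, second = output tokens.
# Arguments: $1 input tokens, $2 output tokens,
#            $3 input price per M, $4 output price per M
estimate_cost() {
  LC_ALL=C awk -v in_tok="$1" -v out_tok="$2" \
               -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.6f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# GPT-OSS 120B with 2,000 input tokens and 500 output tokens
estimate_cost 2000 500 0.035 0.09   # prints 0.000115
```

Pass the prices from the other table row to compare variants for the same workload.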

    Which variant for what

    Pick by the role you're filling. Same key for all of them.

    Flagship

    GPT-OSS 120B

    GPT-OSS 120B is the largest variant served serverless. Reach for it first.

    Fast & economical

    GPT-OSS 20B

    GPT-OSS 20B is the smallest serverless variant. Lowest latency and cost.

    GPT-OSS 20B handles review & fast edits in coding agents · GPT-OSS 20B handles responses in support agents · GPT-OSS 120B handles planning in workflow automation

    Call the flagship

    OpenAI-compatible. Swap the model id for any variant above.

    curl https://tokens.flex.ai/v1/chat/completions \
      -H "Authorization: Bearer $FLEXAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": "Hello from FlexAI"}]
      }'
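
Swapping the model id can be wrapped in a small helper so the variant becomes a parameter. This is a minimal sketch, not an official client: `build_payload` and `chat` are hypothetical names, and only the endpoint, headers, and the `gpt-oss-120b` id come from the request above.

```shell
# Build a chat-completions request body for a given model id and prompt.
# NOTE: the prompt is inserted verbatim, so it must not contain double quotes,
# backslashes, or newlines; use a JSON tool for arbitrary text.
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Send one prompt to the chosen variant. Requires FLEXAI_API_KEY to be set.
# Arguments: $1 model id, $2 prompt
chat() {
  curl -sS https://tokens.flex.ai/v1/chat/completions \
    -H "Authorization: Bearer $FLEXAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "$2")"
}
```

Calling `chat gpt-oss-120b "Hello from FlexAI"` sends the same request as the curl command above. For the smaller variant, pass its id instead; `gpt-oss-20b` would follow the same naming pattern, but that id is an assumption and should be checked against the model list.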

    Run GPT-OSS on one API key

    Every GPT-OSS variant, serverless and dedicated, behind one OpenAI-compatible key.

    $10/month in free credits for your first 3 months