Use case

Workflow automation. One key, many models

Agents that chain tools and approvals into reliable multi-step runs.

Why open models fit

A capable planner plus many cheap execution steps: splitting those roles across open models keeps per-run cost low without losing planning quality.

The pipeline

One OpenAI-compatible key runs the whole pipeline. Swap any step's model without touching your loop.

1. Plan: GPT-OSS 120B (serving)
A capable chat model to plan multi-step tool calls.
128K ctx · $0.039 / $0.1 per M tokens

2. Execute steps: Mistral Nemo (serving)
A small, fast chat model for high-volume tool-calling steps.
32K ctx · $0.018 / $0.03 per M tokens
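
To make the cost argument concrete, here is a minimal sketch of a per-run estimate. It assumes the two figures listed for each model are input and output prices in USD per million tokens (the page does not label them), and the token counts and the 20-step run are hypothetical numbers chosen only for illustration.

```typescript
// Prices copied from the pipeline listing above. Reading each pair as
// input / output USD per million tokens is an assumption.
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Price> = {
  "gpt-oss-120b": { inputPerM: 0.039, outputPerM: 0.1 },
  "Mistral-Nemo-Instruct-2407-FP8": { inputPerM: 0.018, outputPerM: 0.03 },
};

// Cost in USD of a single call with the given token counts.
function callCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) throw new Error(`No price listed for ${model}`);
  return (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1_000_000;
}

// Hypothetical run: one planning call followed by 20 execution steps.
const planCost = callCost("gpt-oss-120b", 2_000, 1_000);
const stepCost = callCost("Mistral-Nemo-Instruct-2407-FP8", 1_500, 300);
const runCost = planCost + 20 * stepCost;

console.log(
  `plan: $${planCost.toFixed(6)}, per step: $${stepCost.toFixed(6)}, run: $${runCost.toFixed(6)}`
);
```

Under these assumed token counts, the 20 execution steps together cost more than the single planning call, so the price of the execution model has the larger effect on the total.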

Call the whole pipeline

Point the OpenAI SDK at FlexAI once. Every step names its own model on the same key: no per-model clients, endpoints, or keys to juggle.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://tokens.flex.ai/v1",
  apiKey: process.env.FLEXAI_API_KEY,
});

const prompt = "Describe what you want the agent to do.";

// 1. Plan: same key, swap the model anytime
const plan = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: prompt }],
});

// 2. Execute steps: same key, swap the model anytime.
// message.content can be null (for example, on a tool-call response), so fall back to "".
const executeSteps = await client.chat.completions.create({
  model: "Mistral-Nemo-Instruct-2407-FP8",
  messages: [{ role: "user", content: plan.choices[0].message.content ?? "" }],
});
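
One way to swap a step's model without touching the loop is to keep the model names in a step table. The following is a minimal sketch, not FlexAI's or the OpenAI SDK's own API: `Step`, `Complete`, `runPipeline`, and `fakeComplete` are hypothetical names introduced here, and the completion call is injected so the sketch runs offline.

```typescript
// Hypothetical types for illustration; not part of any SDK.
interface Step {
  name: string;
  model: string;
}

// Sends one prompt to one model and resolves to the reply text.
type Complete = (model: string, prompt: string) => Promise<string>;

// Run the steps in order, feeding each step's output into the next.
async function runPipeline(steps: Step[], prompt: string, complete: Complete): Promise<string> {
  let input = prompt;
  for (const step of steps) {
    input = await complete(step.model, input);
  }
  return input;
}

// Swapping a model means editing this table, not the loop above.
const steps: Step[] = [
  { name: "plan", model: "gpt-oss-120b" },
  { name: "execute", model: "Mistral-Nemo-Instruct-2407-FP8" },
];

// Stand-in for a real client call: it echoes the model name and prompt.
const fakeComplete: Complete = async (model, prompt) => `[${model}] ${prompt}`;

runPipeline(steps, "Describe what you want the agent to do.", fakeComplete).then((text) =>
  console.log(text)
);
```

With the real client, `complete` could wrap `client.chat.completions.create({ model, messages: [{ role: "user", content: prompt }] })` and return `choices[0].message.content ?? ""`.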

Start serverless and pay per token. When a step becomes steady production traffic, move it to a dedicated endpoint on the same key.

Build your workflow automation on FlexAI

Every model in the pipeline behind one OpenAI-compatible key, priced at the market rate.

$10/month in free credits for your first 3 months
