    Use case

    Workflow automation. One key, many models

    Agents that chain tools and approvals into reliable multi-step runs.

    Why open models fit

    A capable planner plus many cheap execution steps: splitting those roles across open models keeps per-run cost low without losing planning quality.

    The pipeline

    One OpenAI-compatible key runs the whole pipeline. Swap any step's model without touching your loop.

    1. Plan: GPT-OSS 120B (serving)

      A capable chat model to plan multi-step tool calls.

      128K ctx · $0.035 / $0.09 per M

    2. Execute steps: Mistral Nemo (serving)

      A small, fast chat model for high-volume tool-calling steps.

      128K ctx · $0.018 / $0.027 per M
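To make the per-run cost argument concrete, here is a minimal sketch of the arithmetic. It assumes the two listed figures are the input and output prices per million tokens, in that order; the token counts and the 20-step run are made up for illustration, and `stepCost` and `runCost` are hypothetical helper names.

```typescript
// Prices copied from the pipeline cards above, in USD per million tokens.
// Assumption: the first figure is the input price, the second the output price.
interface ModelPrice {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, ModelPrice> = {
  "gpt-oss-120b": { inputPerM: 0.035, outputPerM: 0.09 },
  "Mistral-Nemo-Instruct-2407-FP8": { inputPerM: 0.018, outputPerM: 0.027 },
};

interface StepUsage {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Cost in USD of a single call, given its token usage.
function stepCost(u: StepUsage): number {
  const p = PRICES[u.model];
  if (!p) throw new Error(`No price listed for model: ${u.model}`);
  return (u.inputTokens * p.inputPerM + u.outputTokens * p.outputPerM) / 1_000_000;
}

// Cost in USD of a whole run: the sum over all of its calls.
function runCost(steps: StepUsage[]): number {
  return steps.reduce((sum, s) => sum + stepCost(s), 0);
}

// Hypothetical run: one planning call plus 20 execution calls.
const exampleRun: StepUsage[] = [
  { model: "gpt-oss-120b", inputTokens: 2_000, outputTokens: 1_000 },
  ...Array.from({ length: 20 }, () => ({
    model: "Mistral-Nemo-Instruct-2407-FP8",
    inputTokens: 1_500,
    outputTokens: 300,
  })),
];

console.log(`Estimated cost per run: $${runCost(exampleRun).toFixed(6)}`);
```

With these invented token counts the split run comes to roughly $0.00086. Sending the same 20 execution calls to the planner model instead would bring it to roughly $0.00175, about twice as much.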

    Call the whole pipeline

    Point the OpenAI SDK at FlexAI once. Every step names its own model on the same key: no per-model clients, endpoints, or keys to juggle.

    import OpenAI from "openai";
    
    const client = new OpenAI({
      baseURL: "https://tokens.flex.ai/v1",
      apiKey: process.env.FLEXAI_API_KEY,
    });
    
    const prompt = "Describe what you want the agent to do.";
    
    // 1. Plan: same key, swap the model anytime
    const plan = await client.chat.completions.create({
      model: "gpt-oss-120b",
      messages: [{ role: "user", content: prompt }],
    });
    
    // 2. Execute steps: same key, swap the model anytime
    // message.content can be null (for example, a tool-call-only reply), so fall back to "".
    const executeSteps = await client.chat.completions.create({
      model: "Mistral-Nemo-Instruct-2407-FP8",
      messages: [{ role: "user", content: plan.choices[0].message.content ?? "" }],
    });
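One way to keep model swaps out of the loop is to hold the step-to-model mapping in data. The sketch below is an illustration, not FlexAI's own API: `Step`, `ChatFn`, `runPipeline`, and `fakeChat` are hypothetical names, and the stub stands in for the real client so the example runs without network access.

```typescript
// A chat call reduced to what the loop needs: model name in, text out.
type ChatFn = (model: string, prompt: string) => Promise<string>;

interface Step {
  name: string;
  model: string;
}

// Swapping a step's model means editing one entry here; the loop is unchanged.
const STEPS: Step[] = [
  { name: "plan", model: "gpt-oss-120b" },
  { name: "execute", model: "Mistral-Nemo-Instruct-2407-FP8" },
];

// Run the steps in order, feeding each step's output to the next as its prompt.
async function runPipeline(chat: ChatFn, steps: Step[], prompt: string): Promise<string> {
  let input = prompt;
  for (const step of steps) {
    const output = await chat(step.model, input);
    if (!output) throw new Error(`Step "${step.name}" returned no content`);
    input = output;
  }
  return input;
}

// Wiring to the SDK client from the snippet above would look roughly like:
//   const chat: ChatFn = async (model, prompt) => {
//     const res = await client.chat.completions.create({
//       model,
//       messages: [{ role: "user", content: prompt }],
//     });
//     return res.choices[0].message.content ?? "";
//   };

// Stub that echoes its input, tagged with the model name.
const fakeChat: ChatFn = async (model, prompt) => `[${model}] ${prompt}`;

runPipeline(fakeChat, STEPS, "Describe what you want the agent to do.").then((result) =>
  console.log(result),
);
```

Passing the chat function in as a parameter also makes the loop easy to test, since a stub can replace the network call.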

    Start serverless and pay per token. When a step becomes steady production traffic, move it to a dedicated endpoint on the same key.

    Build your workflow automation on FlexAI

    Every model in the pipeline behind one OpenAI-compatible key, with source-linked pricing.

    $10/month in free credits for your first 3 months