Use case

Private copilots

One key, many models

In-house assistants grounded in your own documents and data.

Why open models fit

Retrieval, document reading, and grounded answers all run on open models. When governance demands it, the same API graduates to dedicated endpoints or AI Factory.

The pipeline

One OpenAI-compatible key runs the whole pipeline. Swap any step's model without touching your loop.

Step 1 · Serving
Retrieve: BGE-M3
Embeddings over your internal documents.
8K ctx · $0.01 per M

Step 2 · Serving
Read documents: PaddleOCR-VL 1.5
Vision OCR to read scanned and image-based files.
128K ctx · $0.14 / $0.8 per M

Step 3 · Serving
Answer: Llama 3.3 70B Instruct
A chat model with a 131K-token context for grounded answers.
64K ctx · $0.1 / $0.32 per M
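
One way to keep each step's model swappable is to hold the model IDs in a single lookup table and build every request body from it. The sketch below is illustrative only: the step names and the helpers `chatRequest` and `embeddingRequest` are hypothetical, and the model IDs are the ones used in the pipeline call further down this page. It only builds request bodies and makes no network call.

```typescript
// Hypothetical step names; model IDs are copied from the pipeline call on this page.
type Step = "retrieve" | "readDocuments" | "answer";

const MODELS: Record<Step, string> = {
  retrieve: "bge-m3",
  readDocuments: "PaddleOCR-VL",
  answer: "Llama-3.3-70B-Instruct-FP8",
};

// Builds a body suitable for client.chat.completions.create().
function chatRequest(step: Exclude<Step, "retrieve">, content: string) {
  return {
    model: MODELS[step],
    messages: [{ role: "user" as const, content }],
  };
}

// Builds a body suitable for client.embeddings.create().
function embeddingRequest(input: string[]) {
  return { model: MODELS.retrieve, input };
}
```

With this shape, swapping a step's model is a one-line change in `MODELS`; the code that calls the helpers stays the same.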

Call the whole pipeline

Point the OpenAI SDK at FlexAI once. Every step names its own model on the same key: no per-model clients, endpoints, or keys to juggle.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://tokens.flex.ai/v1",
  apiKey: process.env.FLEXAI_API_KEY,
});

const chunks = ["...your source text..."];
// search() is your own vector store lookup (pgvector, Pinecone, …); it is not part of the SDK

// 1. Retrieve: same key, swap the model anytime
const retrieve = await client.embeddings.create({
  model: "bge-m3",
  input: chunks,
});

// 2. Read documents: same key, swap the model anytime
const readDocuments = await client.chat.completions.create({
  model: "PaddleOCR-VL",
  messages: [{ role: "user", content: await search(retrieve.data[0].embedding) }],
});

// 3. Answer: same key, swap the model anytime
const answer = await client.chat.completions.create({
  model: "Llama-3.3-70B-Instruct-FP8",
  messages: [{ role: "user", content: readDocuments.choices[0].message.content }],
});
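
The call above leaves `search()` to your vector store. For local experiments, a minimal in-memory stand-in might look like the sketch below. It assumes cosine similarity over the embedding vectors, and it takes the index as an explicit second argument, unlike the one-argument call above. The `IndexedChunk` type and the `topK` parameter are hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory index entry: a chunk of text plus its embedding vector.
interface IndexedChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the text of the topK most similar chunks, joined into one string.
function search(query: number[], index: IndexedChunk[], topK = 3): string {
  return index
    .map((chunk) => ({ text: chunk.text, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((chunk) => chunk.text)
    .join("\n\n");
}
```

A linear scan like this is only reasonable for small document sets; a real deployment would delegate the lookup to the vector store.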

Start serverless and pay per token. When a step becomes steady production traffic, move it to a dedicated endpoint on the same key.
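
To get a rough sense of pay-per-token spend before deciding whether a step has become steady traffic, the per-million prices listed for each step can be turned into a small estimator. This is a sketch under stated assumptions: it reads "$a / $b per M" as input price / output price per million tokens, and the monthly token volumes are made up for illustration.

```typescript
// USD per million tokens, copied from the pipeline steps on this page.
// Assumption: in "$a / $b per M", a is the input price and b is the output price.
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Price> = {
  retrieve: { inputPerM: 0.01, outputPerM: 0 },
  readDocuments: { inputPerM: 0.14, outputPerM: 0.8 },
  answer: { inputPerM: 0.1, outputPerM: 0.32 },
};

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Cost of one step in USD for the given token usage.
function stepCostUSD(step: string, usage: Usage): number {
  const price = PRICES[step];
  if (!price) throw new Error(`unknown step: ${step}`);
  return (usage.inputTokens * price.inputPerM + usage.outputTokens * price.outputPerM) / 1_000_000;
}

// Hypothetical monthly volumes, for illustration only.
const monthly: Record<string, Usage> = {
  retrieve: { inputTokens: 50_000_000, outputTokens: 0 },
  readDocuments: { inputTokens: 20_000_000, outputTokens: 5_000_000 },
  answer: { inputTokens: 100_000_000, outputTokens: 10_000_000 },
};

const total = Object.entries(monthly).reduce(
  (sum, [step, usage]) => sum + stepCostUSD(step, usage),
  0,
);

console.log(`Estimated monthly cost: $${total.toFixed(2)}`);
```

Replacing the hypothetical volumes with the token counts from your own usage reports gives a per-step breakdown to compare against a dedicated endpoint quote.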

Build your private copilots on FlexAI

Every model in the pipeline behind one OpenAI-compatible key, priced at the market rate.

$10/month in free credits for your first 3 months
