Name: FlexAI Token Factory
Brand: FlexAI
Price: 0.225 USD
Availability: InStock

Question 1

How is FlexAI priced?

Accepted Answer

Token Factory is usage-priced per unit: per token for text models, and for media models per image, per million characters of speech, per minute of audio, or per generated video clip. Each rate is set against the credible market rate, repriced automatically as the market moves, with the public source linked per row. Dedicated endpoints are priced per GPU-hour, and AI Factory is priced per deployment. See the Dedicated and Custom tabs.

Question 2

How does FlexAI set each model's price?

Accepted Answer

Every model is priced against the credible market rate (pricepertoken.com, Artificial Analysis, provider pages) and set below it, repriced automatically when the source moves. See "How our pricing works" below, and click any row for its source.

Question 3

How often does pricing update?

Accepted Answer

Automatically. A scheduled refresh re-reads the sources, so when the tracked market rate moves, our rate moves with it.

Question 4

What if a source is unreliable?

Accepted Answer

Every row is attributed to a named, verified source. Unreliable sources and outliers are excluded: we track credible published prices, not whatever number happens to be on the internet.

Question 5

Are any models priced differently?

Accepted Answer

The open-weight catalog follows the rule above. Frontier/closed passthroughs and dedicated deployments are priced separately. See the Dedicated and Custom tabs.

Question 6

How does Dedicated pricing work vs Serverless?

Accepted Answer

Serverless bills per token, best for spiky or early-stage volume. Dedicated bills per GPU-hour for your own endpoints, best once steady-state volume crosses the break-even point.

Question 7

When should I switch from Serverless to Dedicated?

Accepted Answer

When sustained volume makes a reserved GPU cheaper than per-token billing. The crossover calculator shows your exact break-even.

Question 8

I'm coming from OpenAI or Claude. How hard is the switch?

Accepted Answer

From the OpenAI SDK it's a two-value change: point base_url at tokens.flex.ai/v1 and swap the model. From the Claude API it's a small refactor. The migration guide has runnable snippets and a closed-to-open model mapping.

Model	Input $/M	Output $/M	Context	Source
DeepSeek V3.2	$0.225	$0.225	160K	source
GPT-OSS 120B	$0.035	$0.090	128K	source
GPT-OSS 20B	$0.027	$0.117	128K	source
Llama 3.1 8B Instruct	$0.018	$0.027	128K	source
Llama 3.3 70B Instruct	$0.090	$0.288	128K	source
Mistral Nemo 12B	$0.018	$0.027	128K	source
Qwen3 Coder 30B A3B	$0.063	$0.234	128K	source
Qwen3.6-35B-A3B	$0.090	$0.135	128K	source
GLM 4.7	$0.360	$1.386	128K	source
Gemma 4 31B IT	$0.108	$0.315	128K	source
Gemma 4 26B A4B	$0.054	$0.270	64K	source
PaddleOCR-VL 0.9B	$0.126	$0.720	128K	source
Mistral Small 3.1 24B	$0.045	$0.090	128K	source
Nemotron 3 Super 120B A12B	$0.081	$0.405	32K	source
GLM 5	$0.540	$1.728	128K	source
GLM 4.7 Flash	$0.054	$0.360	128K	source
MiniMax M2.7	$0.225	$0.900	195K	source
Qwen3.5 9B	$0.050	$0.150	256K	source
Phi 4 Multimodal	$0.045	$0.090	128K	source
GLM 4.5 Air FP8	$0.113	$0.765	128K	source
DeepSeek R1 Distill 32B	$0.243	$0.243	32K	source
DeepSeek V4 Flash	$0.081	$0.162	256K	source
Qwen 3 8B	$0.050	$0.150	32K	source
Qwen 2.5 32B	$0.800	$0.800	32K	source
Qwen 3 30B Thinking	$0.100	$0.150	128K	source
Qwen 3 Coder Next	$0.063	$0.243	128K	source
Nemotron 3 Nano 30B A3B FP8	$0.100	$0.100	32K	source
Kimi K2.5	$0.315	$1.701	64K	source
Llama 4 Maverick	$0.135	$0.540	256K	source
MiniMax M2.5	$0.135	$0.810	128K	source
Qwen3 235B A22B 2507	$0.064	$0.090	128K	source
Qwen3 Coder 480B A35B	$0.162	$0.900	32K	source

Model	Per image	Context	Source
FLUX.1 Schnell	$0.0005	256	source
Stable Diffusion 3.5 Large	$0.0585	—	source
FLUX.2 [dev]	$0.0400	—	source

Model	Rate	Context	Source
Whisper Large V3 Turbo	$0.0006 / min	448	source
Parakeet TDT 0.6B v3	$0.0014 / min	448	source
Kokoro-82M	$9 / M chars	512	source
Voxtral 4B TTS	$180 / M chars	—	source

Tier	Price	Includes	SLA
Starter	$10/month in free credits for your first 3 months, card required at signup, then pay-as-you-go at our published per-token rates	2 workspace seats. OpenAI-compatible API. Dedicated endpoints on demand. Playground access.	99%
Essential	$100 matched ($100 deposit → $200 credit). Contact us today; self-serve soon.	8 seats. Concurrency + multi-fractional. Architecture call at $50 spend. SE access. HIPAA, DORA.	99.5%
Custom	Custom pricing on your workload. Talk to us.	VPC, on-prem, air-gapped. Dedicated CSM. Geo redundancy.	99.9%

Predictable pricing.Auditable per model

Serverless: Token Factory

Dedicated GPU pricing

On FlexAI

Custom: AI Factory & Enterprise

Cloud Savings Calculator

Want the breakdown as a PDF?

How our pricing works

How the rate is set

Auditable per row

Why this matters

Frequently Asked Questions