Model family

GLM on FlexAI. Every variant. One key.

GLM is Zhipu AI's open model line on FlexAI. Two variants run serverless on the OpenAI-compatible API, and seven more are available as dedicated endpoints, spanning chat, vision, and multimodal workloads. One API key serves every variant.

Variants

Every served variant in the family, with live serverless pricing.

Serverless · pay per token

Model	Context	Price per M tokens	Status
GLM 5.2	128K	$0.402 / $1.26	Serving
GLM 4.5 Air	128K	$0.125 / $0.85	Serving
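
To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a workload from the serverless prices listed above. It assumes the first figure in each pair is the input price and the second the output price, both in USD per million tokens; the page does not label them, so confirm on the pricing page before relying on this. The helper name `estimate_cost` and the token counts are illustrative only.

```python
# Prices copied from the serverless table, in USD per million tokens.
# Assumption: the first figure is the input price, the second the output price.
PRICES_PER_M = {
    "GLM 5.2": (0.402, 1.26),
    "GLM 4.5 Air": (0.125, 0.85),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given model."""
    input_price, output_price = PRICES_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for name in PRICES_PER_M:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.4f}")
```

Under that assumption, the hypothetical workload comes to about $1.434 on GLM 5.2 and $0.675 on GLM 4.5 Air.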

Dedicated endpoints · reserved GPUs

Model	Context	Price	Status
GLM 5.1	198K	Dedicated	Dedicated
GLM 5	128K	Dedicated	Dedicated
GLM 4.6	198K	Dedicated	Dedicated
GLM 4.7	198K	Dedicated	Dedicated
GLM 4.6V	128K	Dedicated	Dedicated
GLM 4.7 Flash	198K	Dedicated	Dedicated
GLM OCR	—	Dedicated	Dedicated

Which variant for what

Pick by the role you're filling. Same key for all of them.

Flagship

GLM 5.2

GLM 5.2 is the largest variant served serverless. Reach for it first.

Fast & economical

GLM 4.5 Air

GLM 4.5 Air is the smallest serverless variant, with the lowest latency and cost of the two.

Call the flagship

The endpoint is OpenAI-compatible. Swap the model ID for any variant above.

curl https://tokens.flex.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLEXAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "GLM-5.2",
    "messages": [{"role": "user", "content": "Hello from FlexAI"}]
  }'
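
The same request can be sketched in Python using only the standard library. The URL, the `GLM-5.2` model ID, and the `FLEXAI_API_KEY` variable come from the curl example above. The helper names (`build_chat_request`, `extract_reply`, `chat`) are hypothetical, and the response shape (`choices[0].message.content`) is assumed from the OpenAI chat completions format rather than confirmed by this page. Model IDs for other variants are not listed here, so look them up before swapping the default.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://tokens.flex.ai/v1/chat/completions"


def build_chat_request(prompt, model="GLM-5.2", api_key=None):
    """Build (but do not send) a chat completion request."""
    # Fall back to the environment variable used in the curl example.
    key = api_key or os.environ["FLEXAI_API_KEY"]
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(body):
    """Pull the assistant text out of a response body.

    Assumption: the body follows the OpenAI chat completions shape.
    """
    return body["choices"][0]["message"]["content"]


def chat(prompt, model="GLM-5.2"):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, model)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_reply(json.load(response))
```

With `FLEXAI_API_KEY` set, `print(chat("Hello from FlexAI"))` sends the same message as the curl call. Keeping request construction separate from sending makes it possible to inspect the payload without touching the network.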

Run GLM on one API key

Every GLM variant, serverless and dedicated, behind one OpenAI-compatible key.

$10/month in free credits for your first 3 months
