    Model family

    GLM on FlexAI. Every variant. One key.

    GLM is Zhipu AI's open model line on FlexAI. One variant runs serverless on the OpenAI-compatible API, and 5 more are available as dedicated endpoints, spanning chat and vision. One API key serves every variant.

    Variants

    Every served variant in the family, with live serverless pricing.

    Serverless · pay per token

    Model         Context   Price                   Status
    GLM 4.5 Air   128K      $0.113 / $0.765 per M   Serving
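To get a rough feel for what the serverless price means per request, the arithmetic can be sketched as below. This is an illustrative estimate only: it assumes the two listed figures are the input and output prices per million tokens, in that order, which the page does not state explicitly. The function name `estimate_cost` is hypothetical.

```python
# Prices copied from the serverless table, in USD per million tokens.
# Assumption: the first figure is the input price and the second is the
# output price; the page itself only shows "$0.113 / $0.765 per M".
INPUT_PRICE_PER_M = 0.113
OUTPUT_PRICE_PER_M = 0.765


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one GLM 4.5 Air request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000
```

Under that assumption, a request with 2,000 input tokens and 500 output tokens would come to roughly $0.0006. Pricing is described as live, so treat the constants as a snapshot.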

    Dedicated endpoints · reserved GPUs

    Model           Context   Price       Status
    GLM 5.1         198K      Dedicated   Dedicated
    GLM 5           195K      Dedicated   Dedicated
    GLM 4.7         198K      Dedicated   Dedicated
    GLM 4.7 Flash   198K      Dedicated   Dedicated
    GLM OCR         -         Dedicated   Dedicated

    Which variant for what

    Pick by the role you're filling. Same key for all of them.

    Flagship

    GLM 4.5 Air

    GLM 4.5 Air is the one variant currently served serverless, which makes it the flagship. Reach for it first.

    Call the flagship

    OpenAI-compatible. Swap the model id for any variant above.

    curl https://tokens.flex.ai/v1/chat/completions \
      -H "Authorization: Bearer $FLEXAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "GLM-4.5-Air-FP8",
        "messages": [{"role": "user", "content": "Hello from FlexAI"}]
      }'
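The same call can be sketched in Python using only the standard library. The URL, headers, model id, and the `FLEXAI_API_KEY` environment variable come from the curl example above; the helper names `build_request` and `chat` are hypothetical, and the response parsing assumes the endpoint returns the usual OpenAI chat-completions shape, which this page does not confirm.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://tokens.flex.ai/v1/chat/completions"


def build_request(prompt, model="GLM-4.5-Air-FP8", api_key=None):
    """Build the same POST request as the curl example, without sending it."""
    key = api_key if api_key is not None else os.environ.get("FLEXAI_API_KEY", "")
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(prompt, model="GLM-4.5-Air-FP8"):
    """Send the request and return the reply text.

    Assumption: the response follows the OpenAI chat-completions layout
    (choices[0].message.content).
    """
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Calling `chat("Hello from FlexAI")` with `FLEXAI_API_KEY` set should mirror the curl request. To target another variant, pass its model id as `model`; the ids for the dedicated variants are not listed on this page.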

    Run GLM on one API key

    Every GLM variant, serverless and dedicated, behind one OpenAI-compatible key.

    $10/month in free credits for your first 3 months