    Gemma 4 31B IT

    Chat

    gemma-4-31b-it

    Gemma 4 31B IT is a multimodal LLM from Google, released under the Gemma Terms license. FlexAI serves it through the OpenAI-compatible Token Factory, priced at the live market-tracked rate.

    Pricing

    Input: $0.108 / M tokens
    Output: $0.315 / M tokens

    Context: 256K tokens
    API endpoint: /v1/chat/completions
    Compatibility: OpenAI
    Parameters: 31B MoE (4B active)
    License: Gemma Terms
    Hardware: H100
    Quantization: FP8

    Estimate your monthly cost

    Input: 10M tokens × $0.108/M = $1.08
    Output: 2M tokens × $0.315/M = $0.63
    Estimated monthly cost: $1.71

    Estimate only, at the current market-tracked rate. Usage-based; no minimums.
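The estimate above is plain arithmetic, so it can be reproduced offline. The following is a minimal sketch that assumes the two rates listed on this page stay fixed; because the rate is market-tracked, the constants may need updating. The function name `estimate_monthly_cost` is hypothetical and not part of any FlexAI SDK.

```python
# Rates copied from the pricing listed on this page (USD per million tokens).
# Assumption: these are treated as constants; the live rate may differ.
INPUT_RATE_PER_M = 0.108
OUTPUT_RATE_PER_M = 0.315


def estimate_monthly_cost(input_m_tokens: float, output_m_tokens: float) -> float:
    """Return the estimated monthly cost in USD, rounded to cents.

    Both arguments are volumes in millions of tokens per month.
    """
    input_cost = input_m_tokens * INPUT_RATE_PER_M
    output_cost = output_m_tokens * OUTPUT_RATE_PER_M
    return round(input_cost + output_cost, 2)


# Same volumes as the example above: 10M input tokens, 2M output tokens.
print(f"${estimate_monthly_cost(10, 2):.2f}")  # prints $1.71
```

Since pricing is usage-based with no minimums, zero usage yields a zero estimate.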

    Quick Start

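    from openai import OpenAI
    
    # Point the standard OpenAI client at the Token Factory endpoint.
    # Replace "your-api-key" with your own key.
    client = OpenAI(base_url="https://tokens.flex.ai/v1", api_key="your-api-key")
    
    # Send one user message that combines a text prompt with an image URL.
    response = client.chat.completions.create(
        model="gemma-4-31b-it",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }],
    )
    
    # Print the model's reply.
    print(response.choices[0].message.content)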
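The multimodal message in the Quick Start is a list of typed content parts, and that structure can be built with a small helper. The sketch below is illustrative only: `image_part` and `user_message` are hypothetical names, and the local-file branch assumes the endpoint accepts base64 `data:` URLs in `image_url`, as the OpenAI chat format does. This page does not confirm that for the Token Factory.

```python
import base64
import mimetypes
from pathlib import Path


def image_part(source: str) -> dict:
    """Build an image_url content part from an http(s) URL or a local file path."""
    if source.startswith(("http://", "https://")):
        url = source
    else:
        # Assumption: the endpoint accepts base64 data URLs for local images.
        mime = mimetypes.guess_type(source)[0] or "application/octet-stream"
        encoded = base64.b64encode(Path(source).read_bytes()).decode("ascii")
        url = f"data:{mime};base64,{encoded}"
    return {"type": "image_url", "image_url": {"url": url}}


def user_message(text: str, *images: str) -> dict:
    """Combine a text prompt and any number of images into one user message."""
    parts = [{"type": "text", "text": text}]
    parts.extend(image_part(src) for src in images)
    return {"role": "user", "content": parts}


# Rebuilds the same message used in the Quick Start.
msg = user_message("What's in this image?", "https://example.com/photo.jpg")
print(msg["content"][1]["image_url"]["url"])
```

The resulting dictionary can be passed as an element of `messages` in `client.chat.completions.create(...)`.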