Llama 4 Maverick

Multimodal

Llama-4-Maverick-17B-128E-Instruct-FP8

Llama 4 Maverick on FlexAI: a multimodal LLM from Meta (license not listed), served through the OpenAI-compatible Token Factory at the live market-tracked rate.

Pricing

Input: $0.135 / M tokens

Output: $0.54 / M tokens

Context: 256K tokens

API endpoint: /v1/chat/completions

Compatibility: OpenAI

Estimate your monthly cost

Example for 10M input tokens and 2M output tokens per month:

10M × $0.135/M input = $1.35

2M × $0.54/M output = $1.08

Estimated monthly cost = $2.43

Estimate only, at the current market-tracked rate. Usage-based; no minimums.
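The arithmetic above can be reproduced in a few lines. This is a minimal sketch, not part of FlexAI's tooling: the function name `estimate_monthly_cost` is hypothetical, and the two rates are hard-coded from the pricing listed on this page, so they will drift as the market-tracked rate changes.

```python
from decimal import Decimal

# Rates copied from the pricing listed on this page (USD per million tokens).
# They are market-tracked, so treat them as a snapshot, not as constants.
INPUT_RATE_PER_M = Decimal("0.135")
OUTPUT_RATE_PER_M = Decimal("0.54")


def estimate_monthly_cost(input_m_tokens, output_m_tokens):
    """Return (input_cost, output_cost, total) in USD.

    Volumes are given in millions of tokens per month.
    Hypothetical helper for illustration only.
    """
    input_cost = Decimal(str(input_m_tokens)) * INPUT_RATE_PER_M
    output_cost = Decimal(str(output_m_tokens)) * OUTPUT_RATE_PER_M
    return input_cost, output_cost, input_cost + output_cost


input_cost, output_cost, total = estimate_monthly_cost(10, 2)
print(f"input ${input_cost:.2f} + output ${output_cost:.2f} = ${total:.2f}")
# prints: input $1.35 + output $1.08 = $2.43
```

Decimal is used instead of float so that the per-line figures add up exactly, as they would on an invoice.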

Get an API key

Quick Start

from openai import OpenAI

# Point the OpenAI SDK at the FlexAI Token Factory endpoint.
client = OpenAI(base_url="https://tokens.flex.ai/v1", api_key="your-api-key")

# Send one user message that combines a text prompt with an image URL.
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)

# Print the text of the model's reply.
print(response.choices[0].message.content)
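The request above points at a publicly reachable image URL. For a local file, the OpenAI chat format also allows the image to be inlined as a base64 data URL. The sketch below builds such a message without sending anything; `image_message` is a hypothetical helper, and whether this endpoint accepts data URLs the way OpenAI's API does is an assumption to verify before relying on it.

```python
import base64
import mimetypes


def image_message(prompt, image_bytes, filename="photo.jpg"):
    """Build a user message with a text part and an inline base64 image.

    Hypothetical helper. Assumes the endpoint accepts data URLs in
    "image_url" as OpenAI's API does; this page does not confirm that.
    """
    # Guess the MIME type from the file name, with a generic fallback.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Stand-in bytes for illustration; in practice use open(path, "rb").read().
message = image_message("What's in this image?", b"\xff\xd8\xff", "photo.jpg")
print(message["content"][1]["image_url"]["url"])
```

The returned dictionary can be passed as the single element of `messages` in the `client.chat.completions.create(...)` call shown above.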
