    Llama 3.3 70B Instruct

    Chat

    Llama-3.3-70B-Instruct-FP8

    Llama 3.3 70B Instruct is Meta's LLM, released under the Llama 3.3 Community license. FlexAI serves it through the OpenAI-compatible Token Factory at the live market-tracked rate.

    Pricing

    Input

    $0.09 / M tokens

    Output

    $0.288 / M tokens

    Context

    128K tokens

    API endpoint

    /v1/chat/completions

    Compatibility

    OpenAI

    Parameters

    70B

    License

    Llama 3.3 Community

    Hardware

    2× H100

    Quantization

    FP8

    Estimate your monthly cost

    Input: 10M tokens × $0.09/M = $0.90
    Output: 2M tokens × $0.288/M = $0.58
    Estimated monthly cost: $1.48

    Estimate only, at the current market-tracked rate. Usage-based; no minimums.
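The arithmetic behind the estimate above can be reproduced with a minimal sketch. The rates are copied from the pricing listed on this page and will drift with the market-tracked rate; the function name `estimate_monthly_cost` is hypothetical and not part of any FlexAI SDK.

```python
# Rates taken from the pricing shown on this page (USD per million tokens).
# They track the market, so treat these constants as a snapshot.
INPUT_RATE_PER_M = 0.09
OUTPUT_RATE_PER_M = 0.288


def estimate_monthly_cost(input_m_tokens, output_m_tokens):
    """Return (input_cost, output_cost, total) in USD, unrounded.

    Both arguments are token volumes expressed in millions of tokens.
    """
    input_cost = input_m_tokens * INPUT_RATE_PER_M
    output_cost = output_m_tokens * OUTPUT_RATE_PER_M
    return input_cost, output_cost, input_cost + output_cost


# Same volumes as the estimate: 10M input tokens, 2M output tokens.
input_cost, output_cost, total = estimate_monthly_cost(10, 2)
print(f"input ${input_cost:.2f}, output ${output_cost:.2f}, total ${total:.2f}")
```

Rounding is applied only when formatting, so the total is computed from the unrounded line items ($0.90 + $0.576 = $1.476, shown as $1.48).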

    Quick Start

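    from openai import OpenAI
    
    # Point the standard OpenAI client at the OpenAI-compatible endpoint.
    client = OpenAI(
        base_url="https://tokens.flex.ai/v1",
        api_key="your-api-key",  # placeholder: replace with your own key
    )
    
    # Send a single-turn chat request to /v1/chat/completions.
    response = client.chat.completions.create(
        model="Llama-3.3-70B-Instruct-FP8",
        messages=[
            {"role": "user", "content": "Hello!"}
        ],
    )
    
    # Print the text of the first (and here only) completion.
    print(response.choices[0].message.content)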