Skip to content

    Nemotron 3 Ultra 550B A55B

    Chat

    nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

    Nemotron 3 Ultra 550B A55B on FlexAI: NVIDIA LLM, NVIDIA Open Model License (OpenMDW 1.1), available as a dedicated endpoint on FlexAI or your own infrastructure.

    Pricing

    Input

    $0.45 / M tokens

    Output

    $2.25 / M tokens

    Context

    256K tokens

    API endpoint

    /v1/chat/completions

    Compatibility

    OpenAI

    Parameters

    550B MoE (55B active)

    License

    NVIDIA Open Model License (OpenMDW 1.1)

    Hardware

    16× H100

    Quantization

    FP8

    Nemotron 3 Ultra 550B A55B runs as a dedicated endpoint, provisioned per customer on FlexAI's infrastructure or your own, not served through the shared Token Factory API.