NVIDIA API Pricing

Real-time API token pricing for NVIDIA AI models including Llama Nemotron and other NVIDIA-hosted models. Compare input/output costs via OpenRouter.

12 active models · Data updated hourly

About NVIDIA

NVIDIA hosts and fine-tunes AI models through NVIDIA NIM (NVIDIA Inference Microservices) and makes them available via API. Its offerings include fine-tuned variants of Meta Llama and other open-weight models optimized for NVIDIA hardware. NVIDIA also produces its own models, such as Nemotron, designed for enterprise workloads on NVIDIA GPU infrastructure.

| Model | Released | Context (tokens) | Input ($/1M tokens) | Output ($/1M tokens) | Modalities / Tags |
|---|---|---|---|---|---|
| NVIDIA: Nemotron 3 Ultra | Jun 2026 | 1M | $0.500 | $2.50 | Text |
| NVIDIA: Nemotron 3 Super | Mar 2026 | 1M | $0.090 | $0.450 | Text |
| NVIDIA: Nemotron 3 Nano 30B A3B | Dec 2025 | 262K | $0.050 | $0.200 | Text |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | Oct 2025 | 131K | $0.100 | $0.400 | Open Source |
| NVIDIA: Nemotron Nano 9B V2 | Sep 2025 | 131K | $0.040 | $0.160 | Text |
| NVIDIA: Nemotron 3.5 Content Safety (free) | Jun 2026 | 128K | Free | Free | Vision, Free |
| NVIDIA: Nemotron 3 Ultra (free) | Jun 2026 | 1M | Free | Free | Free |
| NVIDIA: Nemotron 3 Nano Omni (free) | Apr 2026 | 256K | Free | Free | Vision, Audio, Free |
| NVIDIA: Nemotron 3 Super (free) | Mar 2026 | 1M | Free | Free | Free |
| NVIDIA: Nemotron 3 Nano 30B A3B (free) | Dec 2025 | 256K | Free | Free | Free |
| NVIDIA: Nemotron Nano 12B 2 VL (free) | Oct 2025 | 128K | Free | Free | Vision, Free |
| NVIDIA: Nemotron Nano 9B V2 (free) | Sep 2025 | 128K | Free | Free | Free |
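
Because prices are quoted per 1M tokens, the cost of a single request is (input tokens × input price + output tokens × output price) / 1,000,000. The sketch below illustrates that arithmetic with the paid-tier prices listed above. The `PRICES` dictionary, its shortened model names, and the `estimate_cost` helper are hypothetical names introduced for illustration only; they are not part of any NVIDIA or OpenRouter API, and the prices are a snapshot that may have changed since this page was generated.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the paid rows of the table.
# Keys are shortened display names chosen for this example, not API model IDs.
PRICES = {
    "Nemotron 3 Ultra": (Decimal("0.500"), Decimal("2.50")),
    "Nemotron 3 Super": (Decimal("0.090"), Decimal("0.450")),
    "Nemotron 3 Nano 30B A3B": (Decimal("0.050"), Decimal("0.200")),
    "Llama 3.3 Nemotron Super 49B V1.5": (Decimal("0.100"), Decimal("0.400")),
    "Nemotron Nano 9B V2": (Decimal("0.040"), Decimal("0.160")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200K input tokens and 50K output tokens on Nemotron 3 Super.
    print(estimate_cost("Nemotron 3 Super", 200_000, 50_000))
```

For example, 200,000 input tokens and 50,000 output tokens on Nemotron 3 Super come to (200,000 × $0.090 + 50,000 × $0.450) / 1,000,000 = $0.0405. `Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error.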