NVIDIA API Pricing

API token pricing for NVIDIA AI models, including Llama Nemotron and other NVIDIA-hosted models. Compare input and output costs as listed on OpenRouter.

12 active models · Data updated hourly

About NVIDIA

NVIDIA hosts and fine-tunes AI models through NVIDIA NIM (NVIDIA Inference Microservices) and makes them available via API. Its offerings include fine-tuned variants of Meta Llama and other open-weight models optimized for NVIDIA hardware. NVIDIA also produces its own models, such as Nemotron, designed for enterprise workloads on NVIDIA GPU infrastructure.

Model	Released	Context (tokens)	Input ($/1M tokens)	Output ($/1M tokens)	Modalities
NVIDIA: Nemotron 3 Ultra	Jun 2026	1M	$0.500	$2.50	Text
NVIDIA: Nemotron 3 Super	Mar 2026	1M	$0.090	$0.450	Text
NVIDIA: Nemotron 3 Nano 30B A3B	Dec 2025	262K	$0.050	$0.200	Text
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5	Oct 2025	131K	$0.100	$0.400	Open Source
NVIDIA: Nemotron Nano 9B V2	Sep 2025	131K	$0.040	$0.160	Text
NVIDIA: Nemotron 3.5 Content Safety (free)	Jun 2026	128K	Free	Free	Vision, Free
NVIDIA: Nemotron 3 Ultra (free)	Jun 2026	1M	Free	Free	Free
NVIDIA: Nemotron 3 Nano Omni (free)	Apr 2026	256K	Free	Free	Vision, Audio, Free
NVIDIA: Nemotron 3 Super (free)	Mar 2026	1M	Free	Free	Free
NVIDIA: Nemotron 3 Nano 30B A3B (free)	Dec 2025	256K	Free	Free	Free
NVIDIA: Nemotron Nano 12B 2 VL (free)	Oct 2025	128K	Free	Free	Vision, Free
NVIDIA: Nemotron Nano 9B V2 (free)	Sep 2025	128K	Free	Free	Free
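
Because prices are quoted per 1M tokens, the cost of a single request is (input tokens × input price + output tokens × output price) / 1,000,000. The following is a minimal sketch of that arithmetic, using the paid-tier prices copied from the table above. The dictionary keys, the `request_cost` helper, and the token counts are illustrative assumptions, not part of any NVIDIA or OpenRouter API, and the prices will go stale as the listing updates.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
# Keys are shortened model names chosen for this example only.
PRICES_PER_MILLION = {
    "Nemotron 3 Ultra": (Decimal("0.500"), Decimal("2.50")),
    "Nemotron 3 Super": (Decimal("0.090"), Decimal("0.450")),
    "Nemotron 3 Nano 30B A3B": (Decimal("0.050"), Decimal("0.200")),
    "Llama 3.3 Nemotron Super 49B V1.5": (Decimal("0.100"), Decimal("0.400")),
    "Nemotron Nano 9B V2": (Decimal("0.040"), Decimal("0.160")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / MILLION


if __name__ == "__main__":
    # Example workload (assumed): 200,000 input tokens, 50,000 output tokens.
    for name in PRICES_PER_MILLION:
        cost = request_cost(name, 200_000, 50_000)
        print(f"{name}: ${cost:.4f}")
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding noise. Output tokens cost four to five times as much as input tokens on every paid model listed, so workloads with long completions are dominated by the output price.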