AI Infrastructure Index — GPU Specs, Cloud Pricing & Benchmarks

The open-source reference for AI hardware. Verified GPU specifications, cloud GPU pricing from 12 providers updated hourly, MLPerf inference benchmarks, training cost estimates, and procurement decision frameworks.

12 cloud providers | 57+ GPU SKUs tracked | hourly pricing updates | MIT license

What Is the Current H100 GPU Cloud Price?

Live cloud GPU pricing across 12 providers — on-demand and spot rates, USD per GPU-hour. H100 SXM 80GB range: $1.87–$6.15/hr as of March 2026.

Answer: H100 Pricing Quick Reference

The cheapest H100 is Vast.ai at $1.87/hr (marketplace, variable quality). Among specialist clouds, RunPod ($2.49/hr) and Lambda Labs ($2.99/hr) offer the best balance of price and reliability. AWS P5 costs $3.93/hr. CoreWeave costs $6.15/hr but offers enterprise SLAs.

| Provider | GPU | On-Demand $/hr | Spot $/hr | Min GPUs | Type |
|---|---|---|---|---|---|
| Vast.ai | H100 SXM 80GB | $1.87–$3.50 | — | 1 | Marketplace |
| RunPod | H100 SXM 80GB | $2.49 | $1.89 | 1 | On-Demand |
| Lambda Labs | H100 SXM 80GB | $2.99 | — | 8 | On-Demand |
| Google Cloud (A3) | H100 SXM 80GB | $3.67 | $2.25 | 8 | Hyperscaler |
| AWS (P5) | H100 SXM 80GB | $3.93 | $2.50 | 8 | Hyperscaler |
| Azure ND H100 v5 | H100 SXM 80GB | $3.50–$5.00 | — | 8 | Hyperscaler |
| CoreWeave | H100 SXM 80GB | $6.15 | — | 8 | GPU-Native |

Source: cloud-pricing.json | Full pricing comparison →
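The on-demand rates in the table can be compared programmatically. A minimal sketch using the figures above (hard-coded here for illustration; the actual cloud-pricing.json schema is not shown on this page, so loading that file directly is left out — for ranges, the low end of the range is used):

```python
# On-demand H100 SXM 80GB rates from the table above, USD per GPU-hour.
# For priced ranges (Vast.ai, Azure), the low end is used.
h100_on_demand = {
    "Vast.ai": 1.87,
    "RunPod": 2.49,
    "Lambda Labs": 2.99,
    "Google Cloud (A3)": 3.67,
    "AWS (P5)": 3.93,
    "Azure ND H100 v5": 3.50,
    "CoreWeave": 6.15,
}

# Find the cheapest provider by on-demand rate.
cheapest = min(h100_on_demand, key=h100_on_demand.get)
print(f"{cheapest}: ${h100_on_demand[cheapest]:.2f}/hr")  # Vast.ai: $1.87/hr
```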

What Are the Best GPUs for AI Training in 2026?

The leading AI training GPUs are NVIDIA B200, H200, H100, AMD MI300X, and Intel Gaudi 3.

| GPU | Memory | Bandwidth | FP16 TFLOPS | FP8 TFLOPS | TDP | Interconnect |
|---|---|---|---|---|---|---|
| NVIDIA B200 (newest) | 192 GB HBM3e | 8.0 TB/s | 4,500 | 9,000 | 1,000 W | NVLink 5.0 |
| NVIDIA H200 | 141 GB HBM3e | 4.8 TB/s | 1,979 | 3,958 | 700 W | NVLink 4.0 |
| NVIDIA H100 SXM (most available) | 80 GB HBM3 | 3.35 TB/s | 989 | 1,979 | 700 W | NVLink 4.0 |
| NVIDIA A100 SXM | 80 GB HBM2e | 2.0 TB/s | 312 | — | 400 W | NVLink 3.0 |
| AMD MI300X | 192 GB HBM3 | 5.3 TB/s | 1,307 | 2,614 | 750 W | Infinity Fabric |
| Intel Gaudi 3 | 128 GB HBM2e | 3.7 TB/s | 1,835 | 3,670 | 900 W | RoCE v2 |

Full GPU specifications →

Specifications & Reference Guides

Complete technical references for every aspect of AI infrastructure.

GPU Specifications

Full spec sheets for H100, H200, B200, A100, MI300X, MI325X, Gaudi 3.

View GPU Specs →

Cloud GPU Pricing

Per-GPU-hour pricing from 12 providers: AWS, GCP, Azure, CoreWeave, Lambda, RunPod, Vast.ai, and more.

Compare Prices →

AI Accelerators

Google TPU v5p/v5e/v4, AWS Trainium2, Inferentia2, Cerebras WSE-3, and Groq LPU specifications.

View Accelerators →

Inference Benchmarks

MLPerf v4.1 results, tokens/second for Llama 2 70B, GPT-J 6B across H100, H200, A100, Gaudi 2, TPU.

View Benchmarks →

Model GPU Sizing Guide

How much VRAM does your model need? Sizing for LLaMA 3, Mixtral, GPT-4 across FP16, INT8, INT4.

Size Your Model →

Networking & Interconnects

NVLink 1.0–5.0, NVSwitch generations, InfiniBand HDR/NDR/XDR, RoCE, PCIe 3–6.

View Networking →

Training Costs

Training cost estimates for GPT-3, LLaMA 2/3, Mistral, DeepSeek. Cost calculator formula.

Estimate Costs →
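The cost calculator formula referenced above reduces to GPU count × wall-clock hours × hourly rate. A minimal sketch; the cluster size, duration, and reserved rate below are illustrative assumptions, not figures from this page:

```python
def training_cost(num_gpus: int, days: float, rate_per_gpu_hour: float) -> float:
    """Estimate total compute cost as GPU-hours times the hourly rate.

    Ignores checkpoint restarts, idle time, and storage/egress costs,
    which add to the real bill.
    """
    gpu_hours = num_gpus * days * 24
    return gpu_hours * rate_per_gpu_hour


# Hypothetical run: 512 H100s for 30 days at a $2.50/hr reserved rate.
cost = training_cost(num_gpus=512, days=30, rate_per_gpu_hour=2.50)
print(f"${cost:,.0f}")  # $921,600
```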

GPU Cost Optimization

How to reduce GPU cloud costs 30–90%: right-sizing, quantization, spot instances, reserved pricing.

Optimize Costs →

Buy vs Rent Framework

Cloud vs on-premise vs colocation economics. TCO break-even analysis and decision matrix.

Make the Decision →

Frequently Asked Questions

Direct answers to the most common AI infrastructure questions.

What is the cheapest cloud provider for H100 GPUs?

The cheapest H100 cloud providers as of March 2026 are Vast.ai at $1.87–$3.50/hr (marketplace), RunPod at $2.49/hr, and Lambda Labs at $2.99/hr. Hyperscalers cost more: AWS P5 is $3.93/hr, GCP A3 is $3.67/hr.

How much VRAM does LLaMA 3 70B need?

LLaMA 3 70B requires approximately 140 GB VRAM for FP16 inference, plus ~20% overhead = ~168 GB total. Minimum: 2x H100 80GB or 1x H200 141GB. With INT4 (AWQ), fits on a single H100 80GB using ~35 GB VRAM.
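The ~168 GB figure follows the rule of thumb stated above: parameters × bytes per parameter, plus roughly 20% overhead for KV cache and activations. A sketch of that arithmetic (the 20% factor is the page's approximation, not a precise model; real usage varies with batch size and context length):

```python
def vram_gb(params_billion: float, bytes_per_param: float,
            overhead: float = 1.2) -> float:
    """Inference VRAM estimate in GB: weights times ~20% overhead
    for KV cache and activations."""
    return params_billion * bytes_per_param * overhead


fp16_total = vram_gb(70, 2)    # 168.0 GB -> 2x H100 80GB or 1x H200 141GB
int4_weights = 70 * 0.5        # 35.0 GB weights-only, fits one H100 80GB
print(fp16_total, int4_weights)
```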

How does NVIDIA B200 compare to H100?

NVIDIA B200 delivers 4.5x higher FP16 TFLOPS (4,500 vs 989), 2.4x more memory (192 vs 80 GB HBM), 2.4x higher memory bandwidth (8.0 vs 3.35 TB/s), and NVLink 5.0 (1,800 GB/s vs 900 GB/s).

What is the cost to train LLaMA 3 70B?

Meta trained LLaMA 3 70B using approximately 16,000 H100 GPUs for 6.4 million GPU-hours, estimated at ~$7.7 million total compute cost. QLoRA fine-tuning on 1M examples costs as little as $3,500 using 4x H100.
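The two figures above imply an effective rate of about $1.20 per GPU-hour, well below on-demand cloud pricing, which is consistent with owned hardware amortized at scale rather than rented capacity. The arithmetic:

```python
# Figures from above: reported Llama 3 70B training compute and estimated cost.
gpu_hours = 6_400_000
total_cost = 7_700_000  # USD

implied_rate = total_cost / gpu_hours
print(f"${implied_rate:.2f}/GPU-hr")  # $1.20/GPU-hr
```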

Should I buy or rent GPUs for AI?

Rent (cloud) when GPU utilization is below 50%, planning horizon is under 12 months, or workloads are variable. Buy (on-prem) when utilization exceeds 70%, planning horizon is 24+ months. An 8-GPU H100 node costs ~$12,000/month on-prem vs ~$57,000/month on-demand cloud.
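Under the monthly figures above, the naive break-even utilization is simple division. A sketch; note this ignores ops staffing, power, cooling, and capacity risk, which push the practical threshold higher (hence the 50%/70% guidance rather than the raw 21%):

```python
# Monthly figures from above for an 8x H100 node.
on_prem_monthly = 12_000       # amortized on-prem cost, runs regardless of load
cloud_monthly_full = 57_000    # on-demand cloud cost at 24/7 utilization

# Cloud bills scale with utilization; on-prem is fixed. Break-even is the
# utilization at which utilization * cloud_monthly_full == on_prem_monthly.
breakeven_util = on_prem_monthly / cloud_monthly_full
print(f"{breakeven_util:.0%}")  # 21%
```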