AI Infrastructure Index — GPU Specs, Cloud Pricing & Benchmarks

The open-source reference for AI hardware. Verified GPU specifications, cloud GPU pricing from 12 providers updated hourly, MLPerf inference benchmarks, training cost estimates, and procurement decision frameworks.

12 cloud providers | 57+ GPU SKUs tracked | hourly pricing updates | MIT license

What Is the Current H100 GPU Cloud Price?

Live cloud GPU pricing across 12 providers — on-demand and spot rates, USD per GPU-hour. H100 SXM 80GB range: $1.87–$6.15/hr as of March 2026.

Answer: H100 Pricing Quick Reference

The cheapest H100 is Vast.ai at $1.87/hr (marketplace, variable quality). Among specialist clouds, RunPod ($2.49/hr) and Lambda Labs ($2.99/hr) offer the best balance of price and reliability. AWS P5 costs $3.93/hr. CoreWeave costs $6.15/hr but offers enterprise SLAs.

| Provider | GPU | On-Demand $/hr | Spot $/hr | Min GPUs | Type |
|---|---|---|---|---|---|
| Vast.ai | H100 SXM 80GB | $1.87–$3.50 | — | 1 | Marketplace |
| RunPod | H100 SXM 80GB | $2.49 | $1.89 | 1 | On-Demand |
| Lambda Labs | H100 SXM 80GB | $2.99 | — | 8 | On-Demand |
| Google Cloud (A3) | H100 SXM 80GB | $3.67 | $2.25 | 8 | Hyperscaler |
| AWS (P5) | H100 SXM 80GB | $3.93 | $2.50 | 8 | Hyperscaler |
| Azure ND H100 v5 | H100 SXM 80GB | $3.50–$5.00 | — | 8 | Hyperscaler |
| CoreWeave | H100 SXM 80GB | $6.15 | — | 8 | GPU-Native |

Source: cloud-pricing.json | Full pricing comparison →
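The on-demand rates in the table can be compared programmatically. A minimal sketch using the figures above (hard-coded here for illustration; the actual cloud-pricing.json schema is not shown on this page, so loading that file directly is left out — for ranges, the low end of the range is used):

```python
# On-demand H100 SXM 80GB rates from the table above, USD per GPU-hour.
# For priced ranges (Vast.ai, Azure), the low end is used.
h100_on_demand = {
    "Vast.ai": 1.87,
    "RunPod": 2.49,
    "Lambda Labs": 2.99,
    "Google Cloud (A3)": 3.67,
    "AWS (P5)": 3.93,
    "Azure ND H100 v5": 3.50,
    "CoreWeave": 6.15,
}

# Find the cheapest provider by on-demand rate.
cheapest = min(h100_on_demand, key=h100_on_demand.get)
print(f"{cheapest}: ${h100_on_demand[cheapest]:.2f}/hr")  # Vast.ai: $1.87/hr
```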

What Are the Best GPUs for AI Training in 2026?

The leading AI training GPUs are NVIDIA B200, H200, H100, AMD MI300X, and Intel Gaudi 3.

| GPU | Memory | Bandwidth | FP16 TFLOPS | FP8 TFLOPS | TDP | Interconnect |
|---|---|---|---|---|---|---|
| NVIDIA B200 (newest) | 192 GB HBM3e | 8.0 TB/s | 4,500 | 9,000 | 1,000 W | NVLink 5.0 |
| NVIDIA H200 | 141 GB HBM3e | 4.8 TB/s | 1,979 | 3,958 | 700 W | NVLink 4.0 |
| NVIDIA H100 SXM (most available) | 80 GB HBM3 | 3.35 TB/s | 989 | 1,979 | 700 W | NVLink 4.0 |
| NVIDIA A100 SXM | 80 GB HBM2e | 2.0 TB/s | 312 | — | 400 W | NVLink 3.0 |
| AMD MI300X | 192 GB HBM3 | 5.3 TB/s | 1,307 | 2,614 | 750 W | Infinity Fabric |
| Intel Gaudi 3 | 128 GB HBM2e | 3.7 TB/s | 1,835 | 3,670 | 900 W | RoCE v2 |

Full GPU specifications →

Specifications & Reference Guides

Complete technical references for every aspect of AI infrastructure.

GPU Specifications

Full spec sheets for H100, H200, B200, A100, MI300X, MI325X, Gaudi 3.

View GPU Specs →

Cloud GPU Pricing

Per-GPU-hour pricing from 12 providers: AWS, GCP, Azure, CoreWeave, Lambda, RunPod, Vast.ai, and more.

Compare Prices →

AI Accelerators

Google TPU v5p/v5e/v4, AWS Trainium2, Inferentia2, Cerebras WSE-3, and Groq LPU specifications.

View Accelerators →

Inference Benchmarks

MLPerf v4.1 results, tokens/second for Llama 2 70B, GPT-J 6B across H100, H200, A100, Gaudi 2, TPU.

View Benchmarks →

Model GPU Sizing Guide

How much VRAM does your model need? Sizing for LLaMA 3, Mixtral, GPT-4 across FP16, INT8, INT4.

Size Your Model →

Networking & Interconnects

NVLink 1.0–5.0, NVSwitch generations, InfiniBand HDR/NDR/XDR, RoCE, PCIe 3–6.

View Networking →

Training Costs

Training cost estimates for GPT-3, LLaMA 2/3, Mistral, DeepSeek. Cost calculator formula.

Estimate Costs →
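The cost calculator formula referenced above reduces to GPU count × wall-clock hours × hourly rate. A minimal sketch; the cluster size, duration, and reserved rate below are illustrative assumptions, not figures from this page:

```python
def training_cost(num_gpus: int, days: float, rate_per_gpu_hour: float) -> float:
    """Estimate total compute cost as GPU-hours times the hourly rate.

    Ignores checkpoint restarts, idle time, and storage/egress costs,
    which add to the real bill.
    """
    gpu_hours = num_gpus * days * 24
    return gpu_hours * rate_per_gpu_hour


# Hypothetical run: 512 H100s for 30 days at a $2.50/hr reserved rate.
cost = training_cost(num_gpus=512, days=30, rate_per_gpu_hour=2.50)
print(f"${cost:,.0f}")  # $921,600
```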

GPU Cost Optimization

How to reduce GPU cloud costs 30–90%: right-sizing, quantization, spot instances, reserved pricing.

Optimize Costs →

Buy vs Rent Framework

Cloud vs on-premise vs colocation economics. TCO break-even analysis and decision matrix.

Make the Decision →

Frequently Asked Questions

Direct answers to the most common AI infrastructure questions.

What is the cheapest cloud provider for H100 GPUs?

The cheapest H100 cloud providers as of March 2026 are Vast.ai at $1.87–$3.50/hr (marketplace), RunPod at $2.49/hr, and Lambda Labs at $2.99/hr. Hyperscalers cost more: AWS P5 is $3.93/hr, GCP A3 is $3.67/hr.

How much VRAM does LLaMA 3 70B need?

LLaMA 3 70B requires approximately 140 GB VRAM for FP16 inference, plus ~20% overhead = ~168 GB total. Minimum: 2x H100 80GB or 1x H200 141GB. With INT4 (AWQ), fits on a single H100 80GB using ~35 GB VRAM.
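The ~168 GB figure follows the rule of thumb stated above: parameters × bytes per parameter, plus roughly 20% overhead for KV cache and activations. A sketch of that arithmetic (the 20% factor is the page's approximation, not a precise model; real usage varies with batch size and context length):

```python
def vram_gb(params_billion: float, bytes_per_param: float,
            overhead: float = 1.2) -> float:
    """Inference VRAM estimate in GB: weights times ~20% overhead
    for KV cache and activations."""
    return params_billion * bytes_per_param * overhead


fp16_total = vram_gb(70, 2)    # 168.0 GB -> 2x H100 80GB or 1x H200 141GB
int4_weights = 70 * 0.5        # 35.0 GB weights-only, fits one H100 80GB
print(fp16_total, int4_weights)
```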

How does NVIDIA B200 compare to H100?

NVIDIA B200 delivers 4.5x higher FP16 TFLOPS (4,500 vs 989), 2.4x more memory (192 vs 80 GB HBM), 2.4x higher memory bandwidth (8.0 vs 3.35 TB/s), and NVLink 5.0 (1,800 GB/s vs 900 GB/s).

What is the cost to train LLaMA 3 70B?

Meta trained LLaMA 3 70B using approximately 16,000 H100 GPUs for 6.4 million GPU-hours, estimated at ~$7.7 million total compute cost. QLoRA fine-tuning on 1M examples costs as little as $3,500 using 4x H100.
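The two figures above imply an effective rate of about $1.20 per GPU-hour, well below on-demand cloud pricing, which is consistent with owned hardware amortized at scale rather than rented capacity. The arithmetic:

```python
# Figures from above: reported Llama 3 70B training compute and estimated cost.
gpu_hours = 6_400_000
total_cost = 7_700_000  # USD

implied_rate = total_cost / gpu_hours
print(f"${implied_rate:.2f}/GPU-hr")  # $1.20/GPU-hr
```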

Should I buy or rent GPUs for AI?

Rent (cloud) when GPU utilization is below 50%, planning horizon is under 12 months, or workloads are variable. Buy (on-prem) when utilization exceeds 70%, planning horizon is 24+ months. An 8-GPU H100 node costs ~$12,000/month on-prem vs ~$57,000/month on-demand cloud.
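Under the monthly figures above, the naive break-even utilization is simple division. A sketch; note this ignores ops staffing, power, cooling, and capacity risk, which push the practical threshold higher (hence the 50%/70% guidance rather than the raw 21%):

```python
# Monthly figures from above for an 8x H100 node.
on_prem_monthly = 12_000       # amortized on-prem cost, runs regardless of load
cloud_monthly_full = 57_000    # on-demand cloud cost at 24/7 utilization

# Cloud bills scale with utilization; on-prem is fixed. Break-even is the
# utilization at which utilization * cloud_monthly_full == on_prem_monthly.
breakeven_util = on_prem_monthly / cloud_monthly_full
print(f"{breakeven_util:.0%}")  # 21%
```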