# NVIDIA GeForce RTX 5090 32GB

**The Speed Pick — 32GB, fastest tokens** — DealCoconut researched pick in AI Inference Hardware
Canonical page: https://www.dealcoconut.com/p/rtx5090.html

- Current price: $4,299.99 (MSRP $1,999.99)
- Known low: $2,400.00 (supply waves)
- Buy timing: **Wait if you can** — Current street price is about 2.15× MSRP. Prices have dipped below $2,500 in supply waves, so set an alert; at $4,300, the Spark or a used-3090 pair is better math.
- Reliability: **4/5** — Blackwell silicon is mature; the known risk is the 575W 12V-2x6 power connector — use the native ATX 3.1 cable, never adapters, and seat it fully. Board-partner quality varies; favor brands with strong RMA reputations.
- Price note: verified 2026-07-02 (Newegg) — the $1,999 MSRP is a paper number; $2,900–4,300 is reality · auto-checked 2026-07-03
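
The buy-timing call above is simple arithmetic, sketched here with the prices quoted on this page. The `worth_buying` rule and its $2,500 threshold are an illustrative reading of the "set an alert" advice, not a DealCoconut tool.

```python
STREET_USD = 4299.99       # current price quoted on this page
MSRP_USD = 1999.99         # MSRP quoted on this page
ALERT_BELOW_USD = 2500.00  # threshold taken from the buy-timing note

def markup(street: float, msrp: float) -> float:
    """Street price expressed as a multiple of MSRP."""
    return street / msrp

def worth_buying(street: float, alert_below: float = ALERT_BELOW_USD) -> bool:
    """Hypothetical rule: buy only once the street price drops under the alert."""
    return street < alert_below

print(f"markup: {markup(STREET_USD, MSRP_USD):.2f}x MSRP")
print("buy at current price:", worth_buying(STREET_USD))
print("buy at known low ($2,400):", worth_buying(2400.00))
```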

**Why this pick:** The fastest local inference money buys short of datacenter hardware: 1,792 GB/s of memory bandwidth — 1.9× a 3090 — runs 30B-class models (Qwen, Llama, Mistral) at a fluid 40–55 tokens/sec, fully in VRAM at Q4. If your models fit in 32GB, nothing consumer comes close.
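
Why bandwidth sets the pace: during token generation, each new token requires reading roughly all model weights from VRAM once, so bandwidth divided by weight size gives a rough upper bound on tokens/sec. The sketch below assumes about 4.5 bits per weight for a Q4 quant (an assumption; real quant formats vary) and ignores KV-cache reads and compute overhead, which is why measured speeds land well under the ceiling.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for a quantized model."""
    return params_billion * bits_per_weight / 8

def decode_ceiling_tok_s(bandwidth_gb_s: float, params_billion: float,
                         bits_per_weight: float = 4.5) -> float:
    """Bandwidth-bound ceiling: one full pass over the weights per token."""
    return bandwidth_gb_s / weights_gb(params_billion, bits_per_weight)

BANDWIDTH_GB_S = 1792  # RTX 5090 figure quoted above
ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, params_billion=32)

print(f"32B @ ~4.5 bpw: {weights_gb(32, 4.5):.1f} GB of weights")
print(f"theoretical ceiling: {ceiling:.0f} tokens/sec")
# The 40-55 tokens/sec quoted above is a fraction of this ceiling.
print(f"quoted range as share of ceiling: {40 / ceiling:.0%} to {55 / ceiling:.0%}")
```

Under these assumptions the ceiling comes out near 100 tokens/sec, so the quoted 40–55 tokens/sec is roughly half of the theoretical maximum.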

**What it beat:** Workstation cards (RTX 6000-class: 2–3× the price for certified drivers and density, not faster tokens) and the RTX 4090 (when found new, similar street money for about 56% of the bandwidth and 24GB).

**Cheaper alternative:** The used-3090 pick below, or AMD's RX 7900 XTX 24GB (~$900 new): llama.cpp runs it well via Vulkan/ROCm at 960 GB/s — the best non-NVIDIA value if you'll tolerate setup friction and occasional tooling gaps.
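
The value argument can be put in numbers as bandwidth per dollar, using only the prices and bandwidth figures quoted on this page (the ~$900 figure for the 7900 XTX is approximate). This is a rough sketch: it ignores VRAM capacity, software maturity, and real-world throughput.

```python
# Figures as quoted on this page; prices move, so treat them as a snapshot.
cards = {
    "RTX 5090":    {"bandwidth_gb_s": 1792, "price_usd": 4299.99, "vram_gb": 32},
    "RX 7900 XTX": {"bandwidth_gb_s": 960,  "price_usd": 900.00,  "vram_gb": 24},
}

def bandwidth_per_dollar(card: dict) -> float:
    """GB/s of memory bandwidth bought per dollar spent."""
    return card["bandwidth_gb_s"] / card["price_usd"]

for name, card in cards.items():
    print(f"{name}: {bandwidth_per_dollar(card):.2f} GB/s per dollar, "
          f"{card['vram_gb']} GB VRAM")

ratio = bandwidth_per_dollar(cards["RX 7900 XTX"]) / bandwidth_per_dollar(cards["RTX 5090"])
print(f"7900 XTX delivers {ratio:.1f}x the bandwidth per dollar")
```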

## Common questions
- What fits in 32GB? — Up to ~32B at Q4 with full context comfortably. 70B does NOT fit — partial CPU offload drops speed to single digits; that's the capacity pick's job.
- PSU and power? — 1000W+ ATX 3.1 for one card (575W + spikes). Our PC build's 750W does not cut it — see the add-on.
- Adding a second GPU later? — Consumer boards split to PCIe 5.0 x8/x8: fine for inference (tensor-parallel traffic is modest; it's training that hates thin lanes). Mind case airflow and a 1500W ceiling.
- Intel/AMD instead? — 7900 XTX is the real alternative (above). Intel Arc is budget-tier only for small models; the software (IPEX-LLM) works but trails CUDA tooling.
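
The "what fits in 32GB" answer can be checked with a back-of-envelope estimate. The sketch below assumes about 4.5 bits per weight at Q4 and reserves a flat 8 GB for KV cache and runtime buffers; both numbers are illustrative assumptions, and actual usage depends on the quant format, context length, and model architecture.

```python
def weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate quantized weight footprint in GB."""
    return params_billion * bits_per_weight / 8

def fits_in_vram(params_billion: float, vram_gb: float = 32.0,
                 reserve_gb: float = 8.0, bits_per_weight: float = 4.5) -> bool:
    """True if weights plus a flat reserve (assumed KV cache + buffers) fit."""
    return weights_gb(params_billion, bits_per_weight) + reserve_gb <= vram_gb

for size in (32, 70):
    need = weights_gb(size)
    verdict = "fits" if fits_in_vram(size) else "does not fit"
    print(f"{size}B at Q4: ~{need:.1f} GB of weights, {verdict} in 32 GB")
```

Under these assumptions a 70B model needs more VRAM for weights alone than the card has, which is why part of it would spill to CPU memory.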

## Where to buy (direct links)
- Newegg: $4,299.99 — https://www.newegg.com/gigabyte-gv-n5090gaming-oc-32gd-geforce-rtx-5090-32gb-graphics-card-triple-fans/p/N82E16814932761 ← best price
- Amazon: check price — https://www.amazon.com/s?k=rtx+5090
- Best Buy: check price — https://www.bestbuy.com/site/searchpage.jsp?st=rtx+5090

## Pairs with
- 1000W ATX 3.1 PSU (Corsair RM1000e class) ($179.99): Non-negotiable companion: 575W card + transient spikes need 1000W+ and a native 12V-2x6 cable. — https://www.amazon.com/s?k=corsair+rm1000e+atx+3.1
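
A rough sizing sketch for the PSU recommendation. Only the 575W card figure comes from this page; the 250W rest-of-system draw and the 1.2× headroom factor for transient spikes are hypothetical placeholders, so substitute numbers for your own build.

```python
GPU_WATTS = 575            # card power quoted on this page
REST_OF_SYSTEM_WATTS = 250 # hypothetical: CPU, board, drives, fans
HEADROOM = 1.2             # hypothetical margin for transient spikes

def required_psu_watts(gpu_w: float, rest_w: float, headroom: float) -> float:
    """Sustained draw scaled by a headroom factor."""
    return (gpu_w + rest_w) * headroom

def psu_is_enough(psu_w: float, gpu_w: float = GPU_WATTS,
                  rest_w: float = REST_OF_SYSTEM_WATTS,
                  headroom: float = HEADROOM) -> bool:
    """True if the PSU rating covers the estimated requirement."""
    return psu_w >= required_psu_watts(gpu_w, rest_w, headroom)

need = required_psu_watts(GPU_WATTS, REST_OF_SYSTEM_WATTS, HEADROOM)
print(f"estimated requirement: ~{need:.0f} W")
for rating in (750, 1000):
    print(f"{rating} W PSU sufficient: {psu_is_enough(rating)}")
```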

_Data updated 2026-07-03. Structured data: https://www.dealcoconut.com/builds.json — attribution: DealCoconut (https://www.dealcoconut.com)._
