Jensen Huang – TPU competition, why we should sell chips to China, & Nvidia's supply chain moat
Most important takeaway
Jensen Huang argues that Nvidia’s durable advantage is not any single chip but the full five-layer AI stack — CUDA’s programmability, a massive installed base, a co-designed supply chain, and yearly architectural leaps that outpace Moore’s Law — and that conceding the Chinese market through export controls will hand global AI standards to Huawei rather than keep America ahead. He is unusually direct that bottlenecks (CoWoS, HBM, EUV, fabs) are 2–3 year problems solvable with demand signals, while energy policy and a “doomer” culture that discourages people from becoming software engineers, radiologists, plumbers and electricians are the real constraints on American AI leadership.
Summary
Key themes
- Nvidia’s moat is the whole stack, not just GPUs. Huang reframes Nvidia as the “electrons to tokens” company. CUDA’s programmability, a several-hundred-million-GPU installed base, presence in every major cloud, a rich ecosystem of frameworks (Triton, vLLM, SGLang, Verl, Nemo RL), and deep co-design across processor, system, fabric, libraries and algorithms together let Blackwell deliver roughly 50x Hopper’s efficiency — far more than the ~75% transistor gain Moore’s Law gave them.
- Supply chain as strategy. Nvidia’s reported ~$100B of purchase commitments (trending toward ~$250B per reporting) across TSMC logic, HBM (SK Hynix, Micron, Samsung), CoWoS packaging, silicon photonics (Lumentum, Coherent) and ODMs is an explicit moat. Huang spends heavy keynote time “educating” upstream CEOs so they pre-invest in capacity. He argues every bottleneck (CoWoS, HBM, EUV, even fabs) is a 2–3 year problem solvable with a demand signal — plumbers and electricians are the actual hard scaling constraint.
- TPUs and ASICs are not the threat people think. Huang frames Anthropic’s multi-gigawatt Google/Broadcom TPU deal as “a unique instance, not a trend” — without Anthropic there would be essentially no TPU training growth. ASIC margins (~65%) are barely below Nvidia’s ~70%, so the cost case is weaker than it looks, and Nvidia’s versatility, programmability and installed base make it the rational default for the tens of thousands of AI startups who need portable code.
- Build vs. partner philosophy. Nvidia’s rule is “do as much as necessary, as little as possible.” That is why Nvidia backs neoclouds like CoreWeave, Nebius and Lambda but refuses to become a hyperscaler itself, and why it now invests directly in OpenAI (reported up to $30B) and Anthropic (reported ~$10B) only because the scale of capital required exceeds what VCs can provide. Huang explicitly calls it a past mistake that Nvidia was not in a position to make those investments earlier when Anthropic needed them.
- Why customers writing their own kernels doesn’t kill Nvidia. Hyperscalers can and do write their own kernels, but Nvidia embeds large engineering teams inside frontier labs and still routinely extracts 2–3x additional performance from customer stacks. The “F1 vs. Cadillac” analogy: anyone can drive a GPU at 100 mph, but only Nvidia’s engineers push it to the limit — and that is where installed base, perf-per-watt, perf-per-dollar and the lowest-cost-token story compound into the flywheel.
- The China argument (the longest section). Huang forcefully opposes broad export controls. His argument: China already has abundant energy, 60% of mainstream chip manufacturing, 50% of the world’s AI researchers, and Huawei just had its biggest year ever shipping millions of 7nm parts; algorithmic progress (MoE, new attention mechanisms, DeepSeek) matters more than node shrink; ceding the world’s second-largest compute market forces Chinese models to optimize on Huawei silicon, which then diffuses globally as the default standard to the Global South, Middle East, India and Africa. He draws an analogy to how protectionism hollowed out the US telecom equipment industry. He repeatedly concedes that the US should stay ahead and should ship the best tech at home first — but flatly rejects “all or nothing” framing.
- Energy and “doomerism” are the real bottlenecks. Huang is most agitated about two downstream risks: US energy policy choking re-industrialization, and a cultural message that tells kids not to become software engineers, radiologists, plumbers or electricians because AI will take the job. He is explicit that “job vs. task” confusion (radiology is patient care, not reading scans) is producing real shortages.
- Premium-token inference is a new market shape. Nvidia added support for a faster-response inference segment because the ASP of tokens has risen enough that low-throughput, low-latency inference is now economically worth running — expect an increasingly segmented inference market with very different price tiers.
- Nvidia without AI would still be huge. Huang says even absent deep learning, Nvidia’s accelerated-computing thesis — molecular dynamics, seismic, computational lithography (cuLitho), quantum chemistry, data frames, particle physics — would still make it a very large company because general-purpose CPU scaling has run out.
Actionable insights
- For builders / engineers: Build on CUDA first. Huang’s pitch is that the install base, ecosystem richness (Triton, vLLM, SGLang, RL stacks) and cross-cloud portability mean your code will run everywhere — and “when something breaks, you want it to be your bug, not the mountain of code underneath.” This is a real recommendation for anyone choosing a first target stack.
- For operators / infra buyers: Run benchmarks, specifically InferenceMax and MLPerf, before accepting ASIC vendor TCO claims. Huang challenges TPU/ASIC vendors by name to show up on public benchmarks and says they don’t — a concrete due-diligence step for anyone weighing a non-Nvidia deployment.
- From the sponsor read: Crusoe claims up to 10x faster time-to-first-token and 5x throughput vs. vLLM via cross-user KV-cache sharing — worth testing if you run agent workloads with shared system prompts.
- Career advice (explicit and repeated):
- Do not be scared out of software engineering. Huang is emphatic that the number of agents and tool users will grow exponentially; Synopsys, Cadence, design tools, and the engineers who wield them will be amplified, not replaced.
- Do not be scared out of radiology. He cites the decade-old “don’t be a radiologist” predictions as the canonical example of misunderstanding job vs. task — the world is short of radiologists.
- Plumbers and electricians are the actual binding constraint on the AI buildout. Treated half-jokingly but he is serious: these are durable, well-compensated, AI-proof careers powering the AI factories themselves.
- Become an operator of AI systems, not just a user. Nvidia’s systems are designed to be operated by third parties — the value is going to people who can stand up and run fleets.
- On placing orders with Nvidia: Huang says allocation is genuinely first-come-first-served based on a real purchase order plus data-center readiness. Nvidia does not raise prices on scarcity (“we want to be dependable”). If you want allocation, file a PO and get your site ready — forecasting and relationship-building substitute for haggling.
- Read the supply-chain tea leaves: Watch Nvidia’s purchase-commitment footnote (the ~$100B → ~$250B line), CoWoS capacity announcements, and silicon-photonics partners (Lumentum, Coherent) as leading indicators of how big Nvidia thinks the next 2–3 years really are.
Stocks, investments and companies mentioned
This is an investing-relevant episode. Direct references:
- NVIDIA (NVDA) — central subject; Huang defends ~70% gross margins, claims yearly cadence (Blackwell → Vera Rubin → Vera Rubin Ultra → Feynman → unnamed next), says every year customers can “count on it like a clock.” Implicit bull case: supply-chain moat plus CUDA ecosystem plus yearly arch leaps.
- TSMC (TSM) — Nvidia is the biggest customer on N3, reportedly 60% of N2 this year and ~86% next year; Huang emphasizes 30-year trust-based relationship with no legal contract. Critical supplier risk/leverage.
- SK Hynix, Micron (MU), Samsung — HBM suppliers; Huang singles out Micron’s Sanjay Mehrotra for early belief in the HBM/LPDDR thesis.
- ASML (ASML) — EUV lithography; a bottleneck Nvidia drives indirectly via TSMC demand signals, and one Huang still counts among the solvable 2–3 year problems (unlike plumbers and electricians).
- Lumentum (LITE) and Coherent (COHR) — Nvidia has explicitly “reshaped” silicon-photonics supply chain around these vendors; notable for investors tracking the optical/photonics layer.
- Cadence (CDNS) and Synopsys (SNPS) — Huang explicitly predicts “the number of instances of Synopsys design compiler is going to skyrocket” as agents become tool users. This is as close to a direct bullish call on EDA software as he gives.
- Broadcom (AVGO) / Google (GOOGL) TPU — framed as Anthropic-specific, not a structural trend. Huang is implicitly bearish on the “TPUs everywhere” narrative.
- AMD — mentioned re: OpenAI’s announced deal, framed as a small hedge while OpenAI remains “vastly on Nvidia.”
- Huawei — described as a real and rising threat in Chinese domestic compute, “a networking company” with millions of 910C-class chips shipped; Huang’s whole China argument is that Huawei benefits most from US export controls.
- Nvidia investments and backstops: reported up to $30B in OpenAI, ~$10B in Anthropic, and a reported $6.3B backstop plus $2B equity in CoreWeave; also investments in Nebius and Lambda. Huang calls financing “not the business” but necessary when ecosystem scale demands it.
- Neoclouds worth tracking: CoreWeave, Nebius, Lambda, Crusoe — all explicitly credited by Huang as existing because Nvidia backed them.
- Crusoe (sponsor read) — not public but flagged as an early Vera Rubin deployment partner with cross-user KV-cache.
- Tesla — used only as an analogy (Tesla sold good EVs into China and China still built its own; Huang argues compute ecosystems are far stickier than cars).
- Eli Lilly (LLY) — brief name-check as an example of an enterprise running its own Nvidia supercomputer for drug discovery.
- Boeing (BA) — only as a rhetorical analogy Huang rejects (“calling AI compute enriched uranium is illogical”).
Actionable investment takeaways Huang explicitly suggests:
- The supply-chain moat narrative (TSMC + HBM + CoWoS + silicon photonics pre-commitments) is, in his own framing, Nvidia’s most under-appreciated advantage — watch the purchase-commitment disclosures as the key metric.
- Bearish read on the TPU/ASIC-replaces-Nvidia thesis: “without Anthropic there would be no TPU training growth.” Investors extrapolating from the Anthropic deal to structural TPU dominance are, in Huang’s view, wrong.
- Bullish read on EDA tool vendors (Synopsys, Cadence): Huang argues agentic tool use will cause the number of tool instances to “skyrocket.” This is a specific, named sector call.
- The silicon-photonics supply chain (Lumentum, Coherent and peers) is where Nvidia has been quietly pre-investing — a second-order beneficiary tree.
- US energy policy and grid buildout are Huang’s repeatedly named downstream bottleneck — power, transformers, utilities and re-industrialization plays tie directly into whether his thesis can be realized.
- Premium-latency inference (“very high ASP tokens”) is opening a new pricing segment — relevant to inference-optimized infra plays and to how to model token-economics of inference clouds.
Chapter Summaries
1. Will Nvidia get commoditized? (Opening)
Dwarkesh raises the bear case: Nvidia just ships a GDSII file to TSMC, everything else is manufactured by partners, so software can commoditize the whole thing. Huang reframes Nvidia as the “electrons to tokens” company and argues the transformation itself — an ecosystem spanning all five layers of AI — is far from commoditized. He also predicts tool-maker software companies (Synopsys, Cadence, Excel-class tools) will explode as agents become the primary users.
2. The supply chain moat
Discussion of Nvidia’s reported ~$100B of purchase commitments (and ~$250B per SemiAnalysis reporting). Huang explains that Nvidia’s downstream reach is what convinces CEOs to pre-build upstream capacity, and that GTC exists partly to physically bring the upstream and downstream into the same room. He calls the moat “one of the things we can do that is hard for someone else to do.”
3. Can upstream keep up with 2x revenue per year?
Dwarkesh pushes: Nvidia is already 60% of N3 and ~86% of N2. Huang’s answer: every bottleneck is a 2–3 year problem if you supply a demand signal. CoWoS was once specialty packaging, now mainstream. Nvidia has pre-built a silicon-photonics supply chain via partnerships with Lumentum, Coherent and TSMC, and licensed key patents back into the ecosystem. The real, slow bottlenecks are energy, plumbers and electricians — not logic, memory, packaging or EUV.
4. Doomerism and jobs
Huang pivots to an impassioned critique of AI doomers discouraging people from becoming software engineers and radiologists. He invokes the decade-old “don’t be a radiologist” predictions as the canonical failure mode — the world is now short of radiologists. He argues energy, plumbers and electricians are the real AI-buildout bottlenecks.
5. TPUs and Anthropic
Response to Dwarkesh pointing out that two of the top three frontier models (Claude, Gemini) train on TPU. Huang argues accelerated computing is much broader than tensor processing — molecular dynamics, QCD, data frames, fluid dynamics — and Nvidia is in every cloud because its systems are designed to be operated by third parties. On Anthropic’s multi-gigawatt Broadcom/Google deal: “Anthropic is a unique instance and not a trend. Without Anthropic, why would there be any TPU growth at all?”
6. Programmability vs. systolic arrays
Dwarkesh raises the technical case for TPUs: AI is mostly matmul, so why spend die area on flexibility? Huang’s answer: matmul is important but not the whole story. New attention mechanisms, SSM/transformer hybrids, MoE, diffusion+autoregressive fusion — the ability to invent new algorithms is the 10–100x lever. Blackwell is ~50x Hopper not from Moore’s Law (which gave ~75%) but from extreme co-design across processor, fabric, libraries and algorithms enabled by CUDA.
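Taking the episode's numbers at face value (they are quoted claims, not measured benchmarks), the implied split between process gains and co-design gains can be backed out directly:

```python
# Figures as quoted in the episode, not official benchmarks.
moore_gain = 1.75    # ~75% more transistors per process generation
total_gain = 50.0    # claimed Blackwell-over-Hopper efficiency gain

# Whatever Moore's Law did not supply must come from co-design:
# fabric, libraries, numerics and algorithms.
codesign_gain = total_gain / moore_gain

print(f"implied co-design multiplier: ~{codesign_gain:.0f}x")
```

On these numbers, roughly 29x of the claimed 50x comes from everything other than the process node — which is the substance of Huang's "extreme co-design" argument.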
7. Do hyperscalers still need CUDA?
Dwarkesh: hyperscalers can write their own kernels, Triton exists, OpenAI has its own stack. Huang’s answer: (1) CUDA is the richest, most wrung-out foundation; (2) install base matters above all to developers; (3) Nvidia is in every cloud. Nvidia’s own engineers routinely extract 2–3x additional performance from customer stacks — the “F1 racer vs. Cadillac” analogy. He challenges TPU/ASIC vendors to show up on InferenceMax and MLPerf.
8. Financing AI labs and why Nvidia isn’t a hyperscaler
Discussion of Nvidia’s investments in OpenAI (reported up to $30B), Anthropic (~$10B), and its backstop of CoreWeave (reported $6.3B, $2B equity). Huang admits Nvidia’s past mistake was not recognizing that frontier labs could not be funded through VCs alone and that Nvidia was not in a position to make multi-billion-dollar investments back when Anthropic needed one. He explains the “do as much as needed, as little as possible” philosophy and why Nvidia backs neoclouds (CoreWeave, Nebius, Lambda) but refuses to become a hyperscaler.
9. Allocation and how to get GPUs from Nvidia
Huang denies that Nvidia picks winners in allocation. The real gating factors are (1) a placed purchase order and (2) data-center readiness. Nvidia never raises prices on scarcity. He dismisses the Larry Ellison / Elon Musk “begging for GPUs” story as a fabrication and compares the relationship with TSMC (30 years, no legal contract) to the kind of trust Nvidia wants to be for its customers.
10. China export controls (the long debate)
The longest and most contentious section. Dwarkesh raises Anthropic’s Claude “Mythos” announcement about cyber-offensive capabilities as a reason to restrict China. Huang counters repeatedly that (a) China already has abundant compute, energy and researchers, (b) algorithmic advances (MoE, DeepSeek) matter more than node shrink, (c) export controls forced Huawei to become a real competitor, (d) ceding the second-largest market means Chinese models will be optimized for Huawei silicon and diffuse globally that way, and (e) Huang draws the US telecom-equipment industry as a cautionary tale. He concedes the US should stay ahead and should get the best chips first, but rejects “all or nothing” framing. He also argues that open-source AI safety ecosystems (which he says are heavily Chinese) must stay interoperable with the American tech stack.
11. Architecture choices and process nodes
Huang explains why Nvidia doesn’t run multiple parallel architectures (Cerebras wafer-scale, Dojo-style packaging, non-CUDA) — they simulate them and they’re “provably worse.” Nvidia recently expanded the Pareto frontier of its inference stack to cover a low-latency, high-ASP segment, because premium tokens now exist as a real market. He says Nvidia would happily go back to 7nm if capacity at the leading edge ran out, but the R&D cost of backporting designs is prohibitive.
12. Counterfactual: Nvidia without deep learning
Huang says even without AI, Nvidia would still be “very, very large” — computational lithography (cuLitho), quantum chemistry, seismic processing, molecular dynamics, data-frame processing and structured data all benefit from accelerated computing. The core thesis is that general-purpose CPU scaling has run its course, and domain-specific acceleration is the only path forward. AI is the biggest application, but not the only one.