The Token Economy: AI Infrastructure And The Future Of Compute
Most Important Takeaway
The world is compute-starved and AI adoption sits at roughly the 1996-equivalent of internet penetration, meaning we are extremely early in a multi-decade build-out. The biggest investable opportunity lies in re-engineering the data center and semiconductor stack (lithography, optical interconnects, chip area, AI-purposed foundries) away from an IBM-PC-era paradigm toward purpose-built “token factories,” while near-term software value will shift from seat-based SaaS toward wholesale/PTU-style token economics and outcome-priced “service-as-software.”
Chapter Summaries
M12’s Role and Mandate
Michael Stewart describes M12 as Microsoft’s venture arm operating like a traditional VC (seed–Series B, checks up to ~$10M, leading/co-leading). They intentionally do not duplicate Microsoft’s corporate strategy. Example: Armada (modular data centers) was a complementary bet filling a gap in Microsoft’s edge-AI strategy.
Where We Are in the AI Cycle
Just before ChatGPT, the industry was on the verge of another “AI winter.” Generative AI caught most researchers and investors wrong-footed. AI chatbot penetration of smartphone users is in the low 20% range — equivalent to PC internet penetration in early 1996, implying enormous runway.
Open Source, Frontier Models, and the Compute-Starved World
Time-to-open-source for models can be days. Heavy users (agentic coding, voice BPOs) route aggressively to smaller, cheaper models because running frontier models continuously is economically and capacity-constrained. ARK’s view: the world is compute-starved through at least 2034, creating a software arbitrage layer.
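The "software arbitrage layer" described above can be sketched as a cost-aware router that sends easy traffic to a small, cheap model and reserves the frontier model for hard requests. This is a minimal illustration only: the model names, per-token prices, quality scores, and difficulty heuristic are all invented for the example.

```python
# Hypothetical cost-aware model router illustrating the "software arbitrage
# layer": route easy requests to a cheap model, reserve the frontier model
# for hard ones. All names, prices, and quality scores are assumptions.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    usd_per_million_tokens: float
    quality: float  # 0..1, an assumed benchmark score

CHEAP = Model("small-open-model", 0.20, 0.72)
FRONTIER = Model("frontier-model", 15.00, 0.95)

def route(task_difficulty: float, quality_floor: float) -> Model:
    """Pick the cheapest model whose assumed quality clears the floor,
    scaled by how hard the task looks (difficulty in 0..1)."""
    required = quality_floor * task_difficulty
    for model in sorted([CHEAP, FRONTIER],
                        key=lambda m: m.usd_per_million_tokens):
        if model.quality >= required:
            return model
    return FRONTIER  # fall back to the best available model

def monthly_cost(tokens_millions: float, difficulty_mix: dict,
                 quality_floor: float = 0.9) -> float:
    """Blend cost across a traffic mix of {difficulty: share_of_tokens}."""
    return sum(
        tokens_millions * share * route(d, quality_floor).usd_per_million_tokens
        for d, share in difficulty_mix.items()
    )

# 100M tokens/month with 80% easy traffic: routing cuts spend sharply
# versus sending everything to the frontier model.
mix = {0.3: 0.8, 1.0: 0.2}
routed = monthly_cost(100, mix)
all_frontier = 100 * FRONTIER.usd_per_million_tokens
print(routed, all_frontier)
```

Under these made-up prices, the routed bill is a small fraction of the all-frontier bill, which is the economic pull behind the aggressive routing behavior the chapter describes.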
Thin Wrappers vs. Model Owners
Debate over whether value accrues to wrappers (Cursor, Windsurf) or model owners (Claude Code, Codex). Stewart argues the UI question may become moot — CLI-driven, multi-session agentic workflows hint that entirely new interfaces will emerge within ~5 years.
The SaaS “Disaster” and Wholesale AI Tokens
Enterprise IT budgets are capped as a percent of revenue, so AI spend cannibalizes legacy SaaS. Stewart introduces “wholesale AI” / “wholesale tokens” — large orgs buying Azure PTUs (Provisioned Throughput Units) in bulk, potentially at 1/10 to 1/20 the per-seat cost, then provisioning internally as a private SaaS.
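The wholesale-token arithmetic can be made concrete with a back-of-envelope comparison for a 100K-employee organization. The 100K headcount, the 1/10 to 1/20 discount range, and the ~$30/month seat price (from the next section) come from the text; the rest is illustrative.

```python
# Back-of-envelope on "wholesale AI": a 100K-employee org buying bulk
# PTU-style token capacity at 1/10 to 1/20 the per-seat cost, versus
# paying retail seat pricing. Figures beyond those quoted in the text
# are illustrative only.

employees = 100_000
seat_price_monthly = 30.0          # discovery-phase seat price from the text

per_seat_annual = employees * seat_price_monthly * 12   # retail SaaS path

# Wholesale path at the quoted 1/10 and 1/20 discounts
wholesale_low = per_seat_annual / 10    # 1/10 the per-seat cost
wholesale_high = per_seat_annual / 20   # 1/20 the per-seat cost

print(f"per-seat:  ${per_seat_annual:,.0f}/yr")
print(f"wholesale: ${wholesale_high:,.0f}-${wholesale_low:,.0f}/yr")
```

At these numbers the per-seat path runs $36M/year against roughly $1.8M-$3.6M wholesale, which is why bulk PTU buying threatens per-seat AI SaaS economics at large-enterprise scale.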
Seat Pricing vs. Outcome Pricing
Seat-based pricing (e.g., ~$30/month) exists because users are still discovering use cases. Long-term, willingness-to-pay rises dramatically — a coding agent that drives most of a programmer’s output should plausibly cost a large fraction of a ~$250K burdened salary. Charlie Roberts notes businesses historically pay ~10% of software’s productive value; selling “service-as-software” to line managers (marginal hire replacement) captures far more.
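The pricing gap the chapter describes can be quantified. The ~$250K burdened salary, ~$30/month seat price, and historical ~10% value-capture figure are from the text; treating the replaced hire's salary as the productive value, and the 70% labor-replacement share (which appears in the insights below), are simplifying assumptions.

```python
# Sketch of the pricing gap between seat-based SaaS and outcome-priced
# "service-as-software". Salary, seat price, and the 10% historical
# capture rate are from the text; equating productive value with the
# burdened salary, and the 70% replacement share, are assumptions.

burdened_salary = 250_000.0
seat_price_annual = 30.0 * 12              # today's seat-based pricing

capture_historical = 0.10 * burdened_salary   # ~10% of productive value
capture_outcome = 0.70 * burdened_salary      # priced vs. a marginal hire

print(seat_price_annual, capture_historical, capture_outcome)
```

The jump from ~$360/year per seat to tens of thousands per replaced role is the willingness-to-pay expansion Stewart and Roberts are pointing at.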
Consumer AI, Smart TVs, Material Science
Stewart sees under-invested white space in delightful consumer AI, particularly via smart TVs (he invested in Weekend, an interactive voice experience on smart TVs). He plans to lead a deal in AI for material science in 2026, focused on generating a single actionable lead rather than on humanoid lab robots (he is bearish on the humanoid-scientist thesis).
The Data Center of the Future
Current data centers still descend from the 1980s IBM PC (CPU + DRAM, same voltages, form factors). Scaling this to trillions of dollars is a massive missed opportunity. Key constraints: not enough fabs, not enough lithography tools, not enough advanced-packaging materials. The central bottleneck is chip/reticle area size — forcing die-to-die interconnects that are 10–50x less energy-efficient than on-chip data movement. We need an “AI-purposed foundry” industry distinct from TSMC’s mixed MCU/FPGA/AI-chip model.
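The energy penalty behind the reticle-area argument can be sketched numerically. The 10-50x ratio between die-to-die and on-chip data movement is from the text; the absolute 0.1 pJ/bit on-chip baseline and the 1 PB traffic volume are assumed for illustration.

```python
# Rough energy-budget sketch for the reticle-limit argument: moving a bit
# over a die-to-die link costs 10-50x more than moving it on-chip (the
# ratio is from the text; the absolute pJ/bit baseline is an assumption).

ON_CHIP_PJ_PER_BIT = 0.1                       # assumed on-chip baseline
D2D_PJ_PER_BIT_LOW = ON_CHIP_PJ_PER_BIT * 10   # 10x penalty
D2D_PJ_PER_BIT_HIGH = ON_CHIP_PJ_PER_BIT * 50  # 50x penalty

def joules_moved(terabytes: float, pj_per_bit: float) -> float:
    """Energy to move a given volume of data at a given pJ/bit cost."""
    bits = terabytes * 8e12          # TB -> bits
    return bits * pj_per_bit * 1e-12  # pJ -> J

# Moving 1 PB of activations: on-chip vs. the die-to-die penalty range.
tb = 1000.0
print(joules_moved(tb, ON_CHIP_PJ_PER_BIT),
      joules_moved(tb, D2D_PJ_PER_BIT_LOW),
      joules_moved(tb, D2D_PJ_PER_BIT_HIGH))
```

At these assumed numbers, traffic that costs ~800 J on-chip costs 8-40 kJ when it spills across dies, which is why die area (and hence lithography and reticle size) sits at the center of the bottleneck.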
The “Whale Oil Era” of AI
Stewart calls today’s approach — bigger packages, taller HBM stacks, copper-to-optical interconnects — the “whale oil era.” True progress requires zooming out, treating data centers as token factories, dropping ISA/backward-compatibility constraints, and using AI itself as an ingredient in lithography and chip design.
Summary
Actionable Insights & Investment Advice
Named Companies / Investments Mentioned:
- Armada — modular data centers; M12 investment; syndicate includes Felicis, Lux, Founders Fund. Partner with Microsoft on edge AI.
- Weekend — interactive voice AI on smart TVs; M12 investment.
- Microsoft (MSFT) — implicit beneficiary via Azure PTUs, M12 portfolio, Copilot.
- NVIDIA (NVDA) — referenced in context of how capital markets underwrite current AI compute (Stewart implies it may be under-appreciated, not over-hyped).
- TSMC — current dominant foundry model; Stewart argues its mixed-product model is ill-suited to an AI-purposed foundry future.
- ASML — admires the lithography tool but argues the reticle size paradigm must change for AI workloads.
- Tesla / Elon Musk — cited as the one actor making the full-stack contrarian bet to reengineer compute infrastructure; a decade-lead risk exists (analogous to reusable rockets).
- Anthropic — referenced regarding seat caps and Pro account abuse.
- Stability AI, Jasper AI — historical context for open-source and early generative AI.
Actionable Insights:
- Reject the “AI bubble” framing. Stewart and Roberts argue media/capital allocators treating current spend as a 2-year blip are “backwards and wrong.” AI usage is a ratchet that won’t reverse. Opportunity: lean long on compute infrastructure where consensus is hedging.
- Bet on compute scarcity lasting through 2034. Invest in the arbitrage layer — companies that route intelligently between frontier and smaller/open-source models. Software routing and efficiency tools will capture meaningful value.
- Watch for “wholesale AI” / PTU adoption. Large enterprises (100K+ employees) will shift from per-seat SaaS pricing to bulk token purchasing. This threatens per-seat AI SaaS economics but benefits hyperscalers (Azure, AWS, GCP) and private-SaaS enablement vendors.
- Sell AI as service-as-software to line managers, not CIOs. Pricing at ~70% of fully-burdened labor cost captures far more value than the historical 10% of productive-value ceiling. Look for startups that price against outcomes/labor replacement.
- Hardware / data center re-architecture is the largest open opportunity. Specific sub-themes Stewart is targeting:
  - Lithography for larger reticle / die area (reducing interconnect overhead).
  - Optical compute and optical interconnects.
  - AI-purposed foundry models (vs. mixed-product TSMC model).
  - Advanced packaging materials.
  - Material science AI that delivers high-quality single leads in verticals tied to data centers.
- Smart TV AI is an under-invested consumer surface. Edge-to-cloud orchestration with growing local compute makes TVs a natural delivery vector for casual consumer AI (Weekend is the reference investment).
- Be cautious on humanoid robot scientists. Stewart is bearish on differentiation in humanoid-enabled labs. Prefer AI-in-the-loop on stable, controlled test beds (semiconductor foundries, turbine blades, direct air capture) where the AI itself is the differentiator.
- Expect emergency/surge AI pricing tiers. Low-latency, high-throughput access (e.g., production outage recovery) will command premium pricing — analogous to real-world emergency services. Opportunity in infrastructure enabling those tiers.
- Watch for non-Musk players entering full-stack AI fabrication. Stewart expects consortia of interested parties (not just one customer-supplier pair) to drive the AI-purposed foundry build-out. Early capital allocators into such consortia could capture outsized returns.
- Real-world multimodal data is a genuine gap (Roberts’ point). Investments in data-generation from physical-world AI systems (robotics, sensors) will feed multimodal model improvement — separate from the humanoid-scientist thesis.
Disclaimers from the podcast: ARK is a registered investment advisor; statements are views of ARK or guests, not recommendations to buy, sell, or hold any security. Clients of ARK may hold positions in securities discussed.