Dylan Patel - The Infinite Demand for Tokens, Claude Mythos, and Supply Constraints
Most important takeaway
Demand for frontier AI tokens is effectively unbounded and growing exponentially: Semi-Analysis itself went from tens of thousands to a $7M annual run rate on Claude in under a year (~25% of salary spend). Supply across memory, logic, optics, and fab equipment cannot catch up before 2027-2028, so frontier model access will concentrate among those with capital and lab relationships. Operators who don’t aggressively use tokens to generate and capture outsized value risk being commoditized or locked into a “permanent underclass.”
Chapter Summaries
- Semi-Analysis’s token explosion: Firm spend went from ~$10K to a $7M run rate in a year, driven by Claude Code usage even among non-engineers; one person rebuilt work that previously required full teams (a chip reverse-engineering lab, full US power grid mapping, a 2,000-task economist benchmark).
- Commoditization pressure: If you don’t adopt aggressively, competitors using AI will undercut you. Investment firms will buy data rather than build it, but edge goes to those who move fastest.
- Token demand is unbounded: Anthropic grew from $9B to ~$40B revenue with near-flat compute; gross margins now have a floor around 72%. Willingness to pay for the newest frontier model is extreme; older models become irrelevant despite massive cost declines.
- Mythos and lab concentration: Anthropic’s “Mythos” is the biggest capability jump in two years; Anthropic is selectively releasing it (e.g., to cyber customers only). Expect frontier models to be distributed to fewer customers at higher prices.
- Implementation is now cheap; ideas are the constraint: An economic reordering is underway in which picking the right idea, selling it, and attracting capital matter more than execution skill.
- Robotics is next: VLAs (vision-language-action models) are too data-inefficient today, but the software singularity will enable few-shot pre-trained robot models in 6-18 months, further expanding token demand.
- OpenAI vs. Anthropic: Anthropic is compute-constrained; OpenAI has raised massively for compute (Oracle, CoreWeave, SoftBank, Microsoft, Trainium). Even Tier-2 and Tier-3 labs will sell out of tokens.
- Supply bottlenecks: H100 prices rising, GPU useful life extending to 7-8 years. Memory (DRAM especially) will double or triple in price; real capacity doesn’t arrive until late 2027/2028. TSMC capex heading toward $100B by 2028. CPUs sold out due to RL environments and deployed-app serving. Copper foil, glass fibers, lasers, optics all tight.
- Public perception risk: Patel predicts large-scale anti-AI protests within months; lab CEOs need to stop doing interviews and stop hyping future capabilities.
Summary
Actionable insights
- If you have capital, get an enterprise Anthropic contract and pay per token rather than via subscription; this minimizes rate limits and secures access to the newest models. Relationships with lab reps determine rate-limit increases.
- Always use the newest frontier model. Mythos is more expensive per token but uses fewer tokens per task, making it net-cheaper than Opus 4.6 for most work. Older-tier models are irrelevant for value creation even as they get 100-1000x cheaper.
- Three-part mandate to avoid the “permanent underclass”: (1) use more tokens, (2) generate economic value from them, (3) capture that value. The lazy path is working one hour instead of eight; the winning path is working eight hours and producing 8x the output.
- Pick ideas, don’t execute. Implementation cost has collapsed. The scarce skill is choosing which ideas justify the token spend, then selling the result and attracting capital.
- Expect model-access concentration. Imagine a Ken Griffin-style deal where one firm prepays $10M for first access to each new model. Relationship capital with labs becomes a moat.
- Robotics window: 6-18 months. Few-shot pre-trained robot models are coming; niche rental/service robot models will proliferate.
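The per-token vs. per-task economics above can be sketched with a quick calculation. The numbers below are purely illustrative, not actual Anthropic pricing: the point is that a model charging more per token can still be net-cheaper when it completes the task in fewer tokens.

```python
# Illustrative cost-per-task comparison: a pricier frontier model can be
# net-cheaper if it needs fewer tokens to finish the same task.
# All prices and token counts here are hypothetical, not real model rates.

def cost_per_task(price_per_million_tokens: float, tokens_per_task: int) -> float:
    """Total spend to complete one task at a given per-token price."""
    return price_per_million_tokens * tokens_per_task / 1_000_000

# Assume the newer model costs 2x per token but solves the task in one
# shot, while the older model iterates and burns 4x the tokens.
newer = cost_per_task(price_per_million_tokens=30.0, tokens_per_task=500_000)
older = cost_per_task(price_per_million_tokens=15.0, tokens_per_task=2_000_000)

print(f"newer frontier model: ${newer:.2f} per task")  # $15.00
print(f"older model:          ${older:.2f} per task")  # $30.00
```

Under these assumed numbers the newer model is half the cost per task despite double the per-token price, which is the logic behind always defaulting to the frontier model.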
Company-specific information
- Anthropic: ~$40B revenue run rate, 72%+ gross margins (earlier leak showed 30-something percent, now expanded dramatically). Compute-constrained. Holding back Mythos from general release; only select cyber customers have it. Opus 4.7 just launched.
- OpenAI: Aggressively buying compute from Oracle, CoreWeave, SoftBank, Microsoft, and now Amazon Trainium. Taking a more gradual scaling approach. Perceived as “behind” right now but will catch up and serve the next tier of demand at 50% margins.
- Semi-Analysis (Patel’s firm): ~$25M salary spend, $7M/year on Claude Code, growing fast. Built an energy data business (US grid mapping, power plant and transmission-line data) in weeks, competing with incumbents that took a decade. Sells data to hedge funds including Citadel and Shaw.
- NVIDIA: ~75% gross margins holding; making large prepayments upstream.
- TSMC: $56-57B capex in 2026, could hit $100B by 2028; raising prices only single digits despite being fully sold out.
- Memory (DRAM/NAND): Low double-digit percent capacity growth per year; true incremental supply doesn’t arrive until late 2027/2028. DRAM prices expected to double or triple again.
- ASML, Lam Research, Applied Materials, MKS Instruments (MKSI): Downstream equipment supply chain set to see amplified demand from the bullwhip effect of rising TSMC capex.
- FPGAs: 120 per next-gen AI rack — an underappreciated demand vector.
- CPUs: Sold out, needed for RL environments and deployed-app serving.
Career advice
- Be a hustler: work the full day and 8x your output with AI; don’t coast.
- Information businesses that don’t aggressively adopt AI will be commoditized by those that do, including their own customers.
- Value creation alone isn’t enough; focus on all three steps: using tokens, creating economic value, and capturing that value commercially.
- Choose ideas and direction — that’s the new scarce skill. Selling and capital formation around AI-generated output become critical.