Krishna Rao - Anthropic's CFO on Compute, Scaling to $30B ARR, and the Returns to Frontier Intelligence - [Invest Like the Best, EP.471]
Most important takeaway
Compute is the canvas on which everything in a frontier AI lab is built, and Anthropic’s edge is not just buying more of it but using it fungibly across three chip platforms (AWS Trainium, Google TPUs, NVIDIA GPUs) and three uses (model development, internal acceleration, customer serving). The returns to being at the frontier are extremely high — especially in enterprise — and the business jumped from $9B to $30B run-rate revenue in roughly four months because each new model unlocks new TAM rather than just incrementally improving an existing one.
Summary
Actionable insights and notable company/career signals from the conversation:
Career advice and personal lessons from Krishna Rao
- Break linear thinking. The single biggest professional adjustment he had to make was forcing himself to reason in exponentials and scenarios rather than point estimates. Have a “very low bar for updating priors” — assumptions that were true a month ago may already be obsolete.
- Hire partners, not direct reports. Rao explicitly tells candidates he is hiring them as a partner; he expects disagreement, whiteboarding, and pushback. Talent density beats talent mass.
- Be willing to operate at both 50,000 feet and 500 feet. He says he is not naturally comfortable staying at 50,000 feet, but in a business with this much surface area you cannot be granular everywhere — that’s why partners matter.
- First-principles thinking + intellectual openness is the survival kit when the paradigm keeps breaking.
- Use the tools yourself. His rule: “If we’re not super users of this, how can you expect customers to do that?” He tracks an internal token-usage leaderboard; the heaviest user on his team is the head of tax, not a 22-year-old engineer.
- Pattern-match across past careers. His Blackstone PE training (granular diligence) and his Airbnb pandemic financing experience both inform how he thinks under uncertainty.
- Carve out a weekly moment of gratitude — once a week he stops and consciously appreciates the seat he is in.
Anthropic — strategy and economics
- Revenue trajectory: ~$250M ARR when Rao joined two years ago → ~$1B at end of 2024 → $9B at start of this year → $30B run rate in the most recent quarter. Net dollar retention >500% annualized. They sell to 9 of the Fortune 10.
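A net-dollar-retention figure like the >500% cited above follows the standard cohort formula: revenue from a fixed customer cohort today divided by that same cohort's revenue a year earlier. A minimal sketch, with hypothetical numbers not taken from the episode:

```python
# Net dollar retention (NDR): revenue from a fixed customer cohort now
# divided by that same cohort's revenue one year earlier.
# All dollar figures below are hypothetical illustrations.

def net_dollar_retention(cohort_revenue_now: float,
                         cohort_revenue_year_ago: float) -> float:
    """Return NDR as a multiple (e.g. 5.0 means 500%)."""
    return cohort_revenue_now / cohort_revenue_year_ago

# A cohort spending $10M a year ago and $52M today:
ndr = net_dollar_retention(52.0, 10.0)
print(f"NDR = {ndr:.0%}")  # prints "NDR = 520%"
```

An NDR above 100% means the existing base alone grows revenue before any new-customer sales, which is why the figure pairs naturally with the model-led growth story elsewhere in the notes.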
- Capital: $75B raised since Rao joined, with another ~$50B coming from Amazon and Google deals already inked. Capital is raised primarily to fund the cone-of-uncertainty compute commitments, not to fund operating losses.
- Compute deals announced/discussed: 5 GW TPU deal with Google + Broadcom starting 2027; up to 5 GW Trainium deal with Amazon; over $100B in total commitments; a new SpaceX/Colossus (Memphis) partnership for near-term compute aimed at consumer/prosumer.
- Three chip platforms used fungibly: AWS Trainium, Google TPUs, NVIDIA GPUs. Anthropic claims to be the only frontier lab running across all three clouds and chip families. They build their own compilers and orchestration layer.
- Pricing philosophy: stability over extraction. They have made very few pricing changes; the biggest was lowering Opus pricing (with Opus 4.5) because the family was underutilized relative to capability — a deliberate Jevons-paradox play that drove far more consumption.
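The Jevons-paradox logic behind the Opus price cut (cut price, and consumption grows by more than the cut) only raises revenue when demand elasticity exceeds 1. A sketch under a constant-elasticity assumption; the prices, volumes, and elasticity are invented for illustration, not disclosed figures:

```python
# Illustrative only: if demand elasticity exceeds 1, a price cut
# increases total revenue even though the per-token price falls.

def revenue_after_price_cut(price: float, volume: float,
                            price_cut_pct: float, elasticity: float):
    """Return (new_price, new_volume, new_revenue) under constant elasticity."""
    new_price = price * (1 - price_cut_pct)
    # Constant-elasticity demand: volume scales as (p_new / p_old) ** (-elasticity).
    new_volume = volume * (new_price / price) ** (-elasticity)
    return new_price, new_volume, new_price * new_volume

# Hypothetical: $15 per Mtok, 100M tokens/month, a 33% cut, elasticity 2.0
p, v, r = revenue_after_price_cut(15.0, 100.0, 0.33, 2.0)
print(f"revenue before: {15.0 * 100.0:.0f}, after: {r:.0f}")
```

With elasticity 2.0, the 33% cut more than doubles consumption, so revenue rises despite the lower unit price; with elasticity below 1 the same cut would shrink revenue.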
- Margin philosophy: they measure “return on compute spend” across the full envelope (training + internal + serving) rather than per-token variable cost. Returns today are described as robust.
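One way to read "return on compute spend across the full envelope" is a single ratio of revenue to all compute uses (training plus internal plus serving) rather than a serving-only gross margin. A minimal sketch; the spend split and revenue are hypothetical, not Anthropic numbers:

```python
# Hypothetical illustration: measuring return on the full compute
# envelope (training + internal acceleration + serving) versus a
# serving-only, per-token-style margin. All numbers are invented.

compute_spend = {"training": 40.0, "internal": 10.0, "serving": 50.0}
revenue = 130.0

# Serving-only view: margin against the variable cost of serving tokens.
serving_margin = (revenue - compute_spend["serving"]) / revenue

# Envelope view: revenue against every dollar of compute, however used.
envelope_return = revenue / sum(compute_spend.values())

print(f"serving-only margin: {serving_margin:.0%}")
print(f"full-envelope return: {envelope_return:.2f}x")
```

The two views can diverge sharply: a business can look high-margin on serving cost alone while the envelope return, which also carries training and internal spend, tells a more conservative story.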
- Model strategy: “frontier or not” matters more than “open or closed.” Each model release unlocks new TAM rather than just incrementally improving the prior one. Capability is multi-dimensional (not a single IQ number) and includes long-horizon task ability, tool use, and agentic execution.
- Recursive self-improvement is real internally: 90%+ of Anthropic’s code is now written by Claude Code, including much of Claude Code itself. Scaling laws are “alive and well” by their measurements.
- Platform vs. application: mostly horizontal/platform (Claude API, prompt caching, virtual machines, Claude Agents SDK, managed agents). They go vertical only where (a) they can build ahead to a model capability the market doesn’t yet believe in (Claude Code is the canonical example), or (b) they want to demonstrate a template (Claude for Financial Services, Life Sciences, Security) — done in partnership with the ecosystem.
Frontier vision — “virtual collaborator”
- The next frontier is not a smarter chatbot; it’s an agent with organizational context, access to your tools, memory, and the ability to work on long-horizon ideas. Coding is the leading indicator. Cowork is growing faster than Claude Code did at the equivalent point in time. Knowledge work is ~$40T globally; that is the TAM they are aiming at.
Specific companies/products mentioned
- Anthropic models: Opus 4, 4.5, 4.6, 4.7, Mythos (the most recent, notable for cyber capability and a phased release); Sonnet 3.5/3.6 (the inflection point in coding); Haiku.
- Anthropic products: Claude Code, Cowork, Claude Agents SDK, managed agents, Claude for Financial Services, Claude for Life Sciences, Claude Security; internal “ant stats” platform and an MFR (monthly financial review) skill, plus 70+ finance-specific Claude skills.
- Compute/cloud partners: AWS (Annapurna Labs, Trainium), Google (TPUs), Microsoft, Broadcom, NVIDIA, SpaceX Colossus (Memphis).
- Customers/peers mentioned: Uber (two double-digit-million-dollar commits signed in a 20-minute Uber ride), OpenAI, xAI, Meta (poaching attempts — Anthropic lost only two researchers when other labs lost dozens), DeepSeek (Series E close coincided with the DeepSeek news).
Risks and what could go wrong
- Diffusion lag — enterprises adopting more slowly than capability advances.
- Scaling laws breaking (not what they currently see).
- Losing the frontier position to a competitor.
Mythos and safety
- Mythos is the first model Anthropic released in a phased way because of a spike in cyber capability (a prior model found 22 vulnerabilities in an open-source codebase; Mythos found 250). Anthropic frames its safety investments (interpretability, alignment science) as both mission-driven and a commercial moat, with enterprise buyers entrusting them with sensitive workloads.
Chapter Summaries
- Compute as canvas and the cone of uncertainty — Compute is the most consequential decision in the company; buy too much and you go out of business, too little and you can’t serve customers or stay at the frontier. They model bottom-up demand, build flexibility into deals, and Rao spends 30-40% of his time on compute.
- Fungible compute across three chip platforms — Anthropic uses AWS Trainium, Google TPUs, and NVIDIA GPUs interchangeably. The orchestration layer and custom compilers took years to build and let them route workloads dynamically across training, internal use, and serving.
- Daily compute allocation meetings — A culturally collaborative, non-zero-sum process. There is a floor on model-development compute they will not breach; everything else is dynamically reallocated based on ROI discussions.
- Model efficiency and the returns to frontier intelligence — Each new Opus generation is a multiple more efficient at processing tokens, which compounds because RL is itself inference in a sandbox. Frontier returns are very high in enterprise (TAM unlocks per release), less obvious in consumer.
- Recursive self-improvement — 90%+ of Anthropic code is written by Claude Code. Talent density plus the best models accelerates research. Scaling laws are not slowing down from their vantage point.
- Procurement strategy — Near-term partnerships (SpaceX Colossus/Memphis) plus long-term deals (5 GW with Google/Broadcom for 2027 TPUs, up to 5 GW with Amazon Trainium, $100B+ in commitments). A layer cake of compute optimized across price, performance, duration, and location.
- Metabolism for new compute — New compute would be absorbed almost instantly today thanks to fungibility, in a way that wasn’t true a year or two ago.
- Platform vs. application — Mostly horizontal/platform. Vertical bets (Claude Code, Claude for Financial Services, etc.) are either ahead-of-model-capability bets or template demonstrations done in partnership with the ecosystem.
- Customer fear and partner orientation — Acknowledges that customers fear being competed with; mitigates via early-access programs, listening loops, and explicit guidance (“build for that capability — we’ll improve it”).
- Pricing and margins — Few price changes; lowered Opus pricing to drive Jevons-paradox consumption. They measure return on the full compute envelope rather than per-token variable cost.
- Using Claude internally on the finance team — Statutory financial statements produced by Claude (human reviewed), an MFR skill that gets the monthly review 90-95% ready, 70+ finance skills in a shared library, and weekly reports cut from hours to 30 minutes. Internal token-usage leaderboard; the head of tax is the top user.
- Capital formation and investor evolution — $75B raised since Rao joined, ~$50B more committed. Investors have moved from “do you need a frontier model?” and “your sales force is small” to understanding model-led growth, exponential adoption, and the safety/enterprise-trust linkage.
- Hardest thing to explain to investors — Compute fungibility. It doesn’t map onto a traditional software variable-cost model or a factory R&D vs. COGS split.
- Questions Rao would ask labs as an investor — ROI on compute, customer ROI and whether deployments are real vs. pilots, and future compute sourcing/balance.
- Public sentiment and government relations — AI polls below Congress; the industry needs to better articulate the upside (drug development, healthcare access) while honestly naming the risks. Anthropic is “America first,” works closely with the administration, and used Mythos’s phased release as a template.
- Culture — Seven still-present co-founders, a real culture interview that can veto otherwise stellar candidates, “competitors are incredibly capable and success is far from guaranteed” stickers, biweekly transparent all-hands by Dario with unscripted Q&A, and rigorous debate followed by genuine alignment. Lost only two researchers during the Meta poaching wave when others lost dozens.
- The frontier ahead — The “virtual collaborator”: context, tools, memory, long-horizon work on ideas. Cowork is outpacing Claude Code’s early growth. ~$40T of global knowledge work is the addressable target.
- Personal scaling and the Tom Brown walk — An early-2024 walk with chief compute officer Tom Brown laid out a vision that “sounded crazy” and has largely come true. Rao’s playbook: first principles, hire partners, pattern-match from past roles (Blackstone PE, Airbnb pandemic financing), and weekly moments of conscious appreciation.
- Pre-mortem — What could push them to the low end of the cone: diffusion hitting a wall, scaling laws stalling, or losing frontier position.
- Optimistic close — Biotech and healthcare: AI-accelerated drug discovery and clinical workflows so that diseases diagnosed today might be cured within a patient’s lifetime.
- Closing question — The kindest thing: Rao’s older brother chose an in-state college 25-30 years ago partly to preserve financial flexibility so Rao could later attend wherever he wanted — a sacrifice Rao only learned about years later.