Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

No Priors · Sarah Guo — Andrej Karpathy · March 20, 2026 · Original

Most Important Takeaway

The defining skill of this era is removing yourself as the bottleneck. Instead of interactively prompting agents one at a time, the goal is to maximize your “token throughput” by running multiple autonomous agents in parallel, giving them clear objectives and metrics, and letting them work without your involvement. Everything that fails increasingly feels like a “skill issue” in how you orchestrate agents, not a limitation of the technology itself.

Chapter Summaries

The Agent Psychosis and New Workflow

Karpathy describes a fundamental shift since December 2025 where he went from writing 80% of code himself to essentially 0%. The new workflow involves running multiple coding agents in parallel (Claude Code, Codex, etc.), giving them macro-level tasks, and reviewing their output. The bottleneck has shifted from typing speed to your ability to effectively instruct and parallelize agents. Token throughput is the new GPU utilization metric.

Agent Personality and the Claw Paradigm

Discussion of agent personality mattering more than expected. Claude is praised for feeling like an engaged teammate, while Codex’s coding agent is described as “dry.” The “claw” concept (persistent, looping agents with memory) represents the next level beyond interactive sessions. Karpathy built “Dobby,” a home automation claw that discovered and controls his Sonos, lights, HVAC, shades, pool, and security cameras through WhatsApp, replacing six separate apps.

AutoResearch and Recursive Self-Improvement

Karpathy’s AutoResearch project lets an autonomous loop optimize model training hyperparameters overnight. Even on his already well-tuned nanochat repo, it found improvements he missed (weight decay on embeddings, Adam betas). The vision scales to frontier labs running swarms of auto-researchers. He proposes an open, distributed version resembling Folding@home, where untrusted workers submit code commits that are cheap to verify but expensive to discover.
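The core pattern is simple: propose a change, run a cheap trial, keep it only if the metric improves. A minimal sketch in Python, where `trial_loss` is a toy stand-in for a short training run (the real objective would be validation loss from an actual model); all names here are illustrative, not Karpathy's code:

```python
import random

def trial_loss(cfg):
    """Stand-in for a short training run returning validation loss.
    A toy quadratic with its optimum at weight_decay=0.1, beta2=0.95."""
    return (cfg["weight_decay"] - 0.1) ** 2 + (cfg["beta2"] - 0.95) ** 2

def mutate(cfg, rng):
    """Propose a small random tweak to one hyperparameter."""
    new = dict(cfg)
    key = rng.choice(list(new))
    new[key] += rng.uniform(-0.05, 0.05)
    return new

def auto_research(cfg, budget=200, seed=0):
    """Greedy overnight loop: accept a candidate only if the
    cheap-to-verify metric improves, mirroring commits that are
    cheap to verify but expensive to discover."""
    rng = random.Random(seed)
    best, best_loss = cfg, trial_loss(cfg)
    for _ in range(budget):
        cand = mutate(best, rng)
        loss = trial_loss(cand)
        if loss < best_loss:
            best, best_loss = cand, loss
    return best, best_loss

cfg0 = {"weight_decay": 0.0, "beta2": 0.9}
best, loss = auto_research(cfg0)
```

Greedy acceptance is the key property: the loop can never regress, so it is safe to leave running unattended overnight.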

The Jaggedness Problem

Current models exhibit extreme jaggedness: simultaneously brilliant PhD-level systems programmers and 10-year-olds. They excel in RL-verifiable domains (code, math) but remain stuck in non-optimized areas (e.g., ChatGPT still tells the same joke from years ago). This suggests intelligence gains in verifiable domains are not fully generalizing to other capabilities, contrary to some research premises.

Model Speciation vs. Monoculture

Labs currently push single general-purpose models, but Karpathy predicts more speciation ahead, analogous to the animal kingdom. Smaller specialized models with a cognitive core could be more efficient for specific domains. The science of fine-tuning without losing capabilities is still underdeveloped, which limits speciation today.

Open Source vs. Closed Frontier Models

Open source models are now roughly 6-8 months behind the frontier, down from 18+ months. Karpathy draws the Linux analogy: the industry needs a common open platform. He sees a healthy dynamic where frontier labs push capability boundaries while open source handles the vast majority of consumer and business use cases, with frontier intelligence reserved for Nobel-prize-level or massive-scope work.

Job Market and Software Demand

Karpathy examined BLS jobs data and sees Jevons paradox playing out in software: as software gets cheaper to produce, demand increases. Digital-first professions will see the most disruption first since bits are far easier to manipulate than atoms. His advice: these tools are empowering right now, so staying current with them is the priority. Long-term forecasting is genuinely uncertain.

Physical World and Robotics

Physical-world AI will lag digital by a significant margin because atoms are “a million times harder” than bits. The interesting near-term frontier is the interface between digital and physical: sensors feeding data to AI and actuators executing its decisions. Karpathy sees digital unhobbling happening first, then sensor/actuator interfaces, then full physical-world robotics.

Education and MicroGPT

MicroGPT distills LLM training to 200 lines of Python. Karpathy’s key insight about education: he no longer explains things to people but to agents. Agents can then personalize explanations for any learner. Documentation should target agents (markdown) rather than humans (HTML). The educator’s role shifts to providing the irreducible creative bits that agents cannot generate themselves.

Summary

Actionable Insights:

  • Maximize agent parallelism now. Run multiple coding agents simultaneously on non-conflicting tasks. If you have unused subscription capacity, you are leaving productivity on the table. Develop muscle memory for decomposing work into independent macro-actions that agents can execute concurrently.

  • Build autonomous loops with clear metrics. AutoResearch works because validation loss is an objective, cheap-to-verify metric. Apply this pattern anywhere you can define a measurable objective: let agents iterate overnight instead of being in the loop yourself. The key abstraction is a program.md file that describes how the research or optimization should proceed.

  • Treat agent orchestration as a first-class skill. The difference between mediocre and exceptional output is increasingly about how you instruct agents: your system prompts, memory tools, task decomposition, and review processes. This is the new “engineering” skill to develop.

  • Shift documentation and education toward agents. Instead of writing docs for humans, write markdown for agents. Instead of recording tutorials, create “skills” that script the curriculum an agent should walk a learner through. The value-add for humans is the irreducible creative insight; everything downstream is agent territory.

  • Expect digital disruption before physical. If you are choosing where to invest effort, digital-first domains will see the most change soonest. The interface layer between digital intelligence and the physical world (sensors, data collection, lab automation) is the next frontier. Full physical-world robotics is a larger market but will take significantly longer.

  • Career advice: stay hands-on with the tools. Karpathy’s core advice is simply to keep up. Many people dismiss or fear AI tools. The ones who treat them as empowering productivity multipliers and invest time learning to use them effectively will have a significant advantage. Software engineering demand is likely increasing due to Jevons paradox, but the nature of the work is changing rapidly.
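The parallelism insight above can be sketched concretely. Assuming agents are invoked through some callable interface (the `run_agent` stub below stands in for shelling out to Claude Code or Codex; the task strings are invented examples), decomposed macro-tasks can be dispatched concurrently and reviewed as they complete:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Stub for a real agent invocation; a real implementation would
    launch an agent on the task and return its diff or report for review."""
    return f"done: {task}"

# Independent macro-tasks that do not conflict with each other.
tasks = [
    "migrate config loader to TOML",
    "add retry logic to the fetch client",
    "write property tests for the parser",
]

# Dispatch all tasks at once; map preserves input order for review.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_agent, tasks))
```

The human stays out of the inner loop entirely: work happens in parallel, and review happens in batch afterward, which is the token-throughput framing from the interview.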

Tech patterns mentioned:

  • RL-trained models excel in verifiable domains but plateau in non-verifiable ones, creating persistent jaggedness.
  • Open source models converging to within 6-8 months of frontier, a trend Karpathy expects to continue.
  • The “claw” pattern (persistent, looping agents with memory and tool access) as the next evolution beyond interactive agent sessions.
  • Agent-first API design replacing consumer app UIs, since the customer is increasingly the agent, not the human.
  • Distributed auto-research resembling proof-of-work systems, where compute contributions replace financial donations to research causes.
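What distinguishes the claw pattern from an interactive session is the combination of a persistent loop, durable memory, and a tool registry. A minimal sketch, with a home-automation flavor echoing the Dobby example; every name here (the memory file, the tool set, the `decide` stub standing in for an LLM call) is an illustrative assumption:

```python
import json
from pathlib import Path

# Durable memory survives across loop iterations and restarts.
MEMORY = Path("claw_memory.json")

def load_memory():
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else {}

def save_memory(mem):
    MEMORY.write_text(json.dumps(mem))

# Tool registry: each tool acts on the world (here, just on state).
TOOLS = {
    "lights_on":  lambda mem: mem.update(lights="on"),
    "lights_off": lambda mem: mem.update(lights="off"),
}

def decide(message, mem):
    """Stand-in for an LLM call mapping a message plus memory
    to a tool name."""
    return "lights_on" if "dark" in message else "lights_off"

def claw_step(message):
    """One turn of the persistent loop: read memory, decide, act,
    write memory back so the next turn remembers."""
    mem = load_memory()
    tool = decide(message, mem)
    TOOLS[tool](mem)
    save_memory(mem)
    return tool
```

In a real claw, `claw_step` would run inside a long-lived loop fed by an event source (the interview's example uses WhatsApp messages); the state write-back is what lets it behave like a teammate with memory rather than a stateless chat.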