What does a researcher actually do all day?

Write code. Run an experiment. Check the numbers. Tweak a parameter. Run it again. Wait. Repeat.

It's not glamorous. Most of the cycle is mechanical -- trial and error at human speed. And that's exactly the part Andrej Karpathy just automated.


His new project, AutoResearch, is disarmingly simple. Three files. One GPU. An AI agent that modifies training code, runs 5-minute experiments, evaluates results, and decides what to keep.

The constraint is the clever part. Every experiment gets exactly five minutes. No more. That means the agent can run about twelve experiments per hour. Leave it running overnight, wake up to a hundred completed trials with a full log of what worked and what didn't.


The repo has three files, and that's intentional:

  • prepare.py handles data and tokenizer setup. The agent doesn't touch it.
  • train.py contains the model and training loop. This is the agent's playground.
  • program.md is where the human writes strategy -- what to explore, what constraints to respect.

That last file is the interesting one. The human doesn't write experiment code anymore. They write experiment strategy. The agent handles execution.


This is a pattern I keep seeing. The value is shifting from doing the work to designing how the work gets done.

A junior researcher runs experiments. A senior researcher decides which experiments are worth running. AutoResearch puts an AI in the junior role -- tireless, fast, and available at 3 AM.

But here's what it doesn't do: it doesn't decide what problems matter. It doesn't set research direction. It doesn't know when a surprising result deserves a closer look from a human eye.


The real question isn't whether AI can run experiments. It obviously can.

The question is whether researchers will adapt their workflow to use it. The ones who do will move faster. The ones who don't will wonder why their colleagues suddenly have ten times the data.

Less about replacing researchers. More about redefining what research means when the grunt work disappears.