/neuroflow:autoresearch

Infinite improvement loop: point it at any file(s) and it runs a worker-evaluator cycle indefinitely, keeping or reverting each change based on whether it improved the artifact.

Inspired by Andrej Karpathy's autoresearch. Runs until you interrupt it.


When to use it

  • You want to improve a hypothesis, paper section, grant aim, or analysis plan overnight
  • You want to leave something running and come back to a better version
  • You want a live dashboard showing quality trends across iterations
  • You want to explore what "continuous improvement" looks like for a research artifact

How it works

First run: initialization

  1. Claude determines the active phase from project_config.md
  2. You name the files to improve (or use --target path/to/file.md)
  3. Criteria are built in three layers:
     • Phase defaults (e.g. paper: language, claim support, statistics, reproducibility, novelty)
     • Context-inferred (e.g. adds "aligns with preregistered hypotheses" if preregistration/ exists)
     • Your additions (you can add any criteria before the loop starts)
  4. A baseline snapshot is saved to history/v000/
  5. A local dashboard server (server.py) is generated

Dashboard: python .neuroflow/{phase}/autoresearch/server.py → http://localhost:8765
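Conceptually, the three-layer criteria merge in step 3 might look like the sketch below. This is illustrative only: the `PHASE_DEFAULTS` dict, the `build_criteria` name, and the `preregistration/` directory check are assumptions, not the command's actual internals.

```python
from pathlib import Path

# Hypothetical layer 1: per-phase default criteria (names taken from the
# "paper" example above).
PHASE_DEFAULTS = {
    "paper": ["language", "claim support", "statistics",
              "reproducibility", "novelty"],
}

def build_criteria(phase, project_root, user_additions=()):
    """Merge the three criteria layers in order:
    phase defaults, then context-inferred, then user additions."""
    criteria = list(PHASE_DEFAULTS.get(phase, []))
    # Layer 2 (context-inferred): add a criterion if preregistration/ exists.
    if (Path(project_root) / "preregistration").exists():
        criteria.append("aligns with preregistered hypotheses")
    # Layer 3: anything the user added before the loop started.
    criteria.extend(user_additions)
    return criteria
```

Later layers only append; they never remove a default, which matches the "three layers" description above.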

The loop: never stops until you interrupt it

Each iteration:

  1. Worker: makes one focused change to the tracked files (targets the weakest criterion)
  2. Evaluator: compares the new version to the previous best and returns BETTER / WORSE / NO CHANGE
  3. Keep or revert: if BETTER, the new version is archived to history/vNNN/; if WORSE, the previous best is restored
  4. Log: the result is logged to results.md with verdict, delta, and next focus
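The keep-or-revert cycle above can be sketched in Python. The `worker`, `evaluator`, and `log` callables are hypothetical stand-ins for the Claude-driven steps; only the archive/revert mechanics are shown.

```python
import shutil
from pathlib import Path

def autoresearch_loop(tracked, history, worker, evaluator, log):
    """Minimal sketch of the worker-evaluator cycle.

    `worker(path)` edits the tracked file in place; `evaluator(path, best)`
    returns "BETTER", "WORSE", or "NO CHANGE" against the previous best.
    """
    version = 1
    best = Path(history) / "v000" / Path(tracked).name  # baseline snapshot
    while True:  # runs until interrupted
        worker(tracked)                       # 1. one focused change
        verdict = evaluator(tracked, best)    # 2. compare to previous best
        if verdict == "BETTER":               # 3. keep: archive to vNNN/
            dest = Path(history) / f"v{version:03d}"
            dest.mkdir(parents=True, exist_ok=True)
            best = Path(shutil.copy(tracked, dest))
            version += 1
        else:                                 # 3. revert to previous best
            shutil.copy(best, tracked)
        log(version, verdict)                 # 4. record the outcome
```

Note that on WORSE and NO CHANGE alike the previous best is restored, so the tracked file only ever moves forward.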


Invocation forms

| Form | Behaviour |
|------|-----------|
| /autoresearch | Uses active phase from project_config.md |
| /autoresearch paper | Targets the paper phase explicitly |
| /paper autoresearch | Any phase command + autoresearch keyword triggers this |
| /paper autoresearch --target manuscript/intro.md | Pre-fills the tracked file |

Dashboard

python .neuroflow/{phase}/autoresearch/server.py
# → http://localhost:8765
# → http://localhost:8765?watch=1   (auto-refresh every 30s)

The dashboard shows:

  • Quality curve: running delta over iterations (staircase: rises on KEPT, flat on REVERTED)
  • Numeric metric charts: power, R², rejection rate, etc. when applicable
  • Last recommendation: what the evaluator says to target next
  • Plateau warning: triggered after 5 consecutive reversions
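The staircase quality curve and the plateau rule can be sketched as follows. This is a hypothetical helper, not the dashboard's actual code; it assumes one "KEPT"/"REVERTED" verdict per iteration.

```python
def quality_curve(verdicts):
    """Return the staircase of the running delta and a plateau flag.

    The curve rises by one on each KEPT iteration and stays flat on
    REVERTED; the plateau flag trips after 5 consecutive reversions.
    """
    curve, score, streak, plateau = [], 0, 0, False
    for v in verdicts:
        if v == "KEPT":
            score += 1
            streak = 0          # any kept change resets the reversion streak
        else:
            streak += 1
            plateau = plateau or streak >= 5
        curve.append(score)
    return curve, plateau
```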


Files created

.neuroflow/{phase}/autoresearch/
├── flow.md
├── program.md         # task + criteria (edit this to guide the loop)
├── __thetask__.md     # pointer to tracked files
├── results.md         # iteration log
├── server.py          # dashboard server
└── history/
    ├── v000/          # baseline
    ├── v001/          # first KEPT improvement
    └── ...
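Given that layout, the current best version is simply the highest-numbered snapshot. A hypothetical helper (not part of the tool) could locate it like this:

```python
from pathlib import Path

def latest_kept(history_dir):
    """Return the newest vNNN snapshot directory, i.e. the current best,
    or None if history/ is empty. Assumes the history/ layout above."""
    versions = sorted(Path(history_dir).glob("v[0-9][0-9][0-9]"))
    return versions[-1] if versions else None
```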

Files read and written

| Direction | Files |
|-----------|-------|
| Reads | .neuroflow/project_config.md, .neuroflow/flow.md, tracked external files, program.md, results.md, history/ snapshots |
| Writes | .neuroflow/{phase}/autoresearch/, tracked external files (on KEPT), history/vNNN/ snapshots |

  • neuroflow:autoresearch skill: full protocol, criteria, and dashboard template
  • /paper: uses the worker-critic loop (bounded, 3 iterations) for section drafting
  • /pipeline: multi-step orchestration across phases