Category: AI Agent capabilities | No API key required

autoresearch

A Karpathy-style keep/revert experiment loop for Atris experiment packs. Use it when improving prompts, tools, workers, or bounded repo targets.

Author: jakexiaohub (GitHub)

Autoresearch Skill

Autoresearch means one bounded target, one external metric, one keep/revert loop, one append-only log.

When to use

  • prompt optimization
  • worker routing
  • tool behavior
  • evaluation harnesses
  • any repo-local target that can be measured honestly

Process

  1. Read atris/experiments/<slug>/program.md
  2. Confirm the target is bounded
  3. Run the baseline with measure.py
  4. Apply one candidate change
  5. Rerun the metric
  6. Keep only if the score improves
  7. Write the outcome to results.tsv
  8. Revert losses
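
The steps above can be sketched as a small loop. This is a minimal illustration, not the actual loop.py from an Atris experiment pack: `run_loop`, its parameters, and the in-memory state are hypothetical, and it assumes a higher score is better. Reverting here just means discarding the trial state; in a real repo it would be whatever reset path the experiment defines.

```python
from typing import Callable, List, Tuple, TypeVar

State = TypeVar("State")


def run_loop(
    baseline: State,
    candidates: List[Callable[[State], State]],
    measure: Callable[[State], float],
) -> Tuple[State, List[Tuple[int, float, str]]]:
    """Apply candidates one at a time and keep only strict improvements.

    Hypothetical helper: assumes a higher score from `measure` is better.
    """
    best_state = baseline
    best_score = measure(baseline)             # step 3: run the baseline
    log = [(0, best_score, "baseline")]
    for i, change in enumerate(candidates, start=1):
        trial = change(best_state)             # step 4: one candidate change
        score = measure(trial)                 # step 5: rerun the metric
        if score > best_score:                 # step 6: keep only if it improves
            best_state, best_score = trial, score
            log.append((i, score, "keep"))     # step 7: record the outcome
        else:
            log.append((i, score, "revert"))   # step 8: the trial is discarded
    return best_state, log


# Toy usage: the "target" is an integer and the metric peaks at 3.
best, log = run_loop(
    0,
    [lambda x: x + 1, lambda x: x + 5, lambda x: x + 2],
    lambda x: -(x - 3) ** 2,
)
for row in log:
    print(row)
```

Every candidate produces a log row whether it is kept or reverted, which is what makes "no unlogged keeps" checkable afterwards.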

Rules

  • external metric only
  • no unlogged keeps
  • no broad refactors inside an experiment
  • one experiment pack = one target
  • if variance exists, define the keep margin first
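
One way to define the keep margin before running candidates is to measure the baseline several times and require a candidate to beat the baseline mean by a multiple of the observed spread. The sketch below is an assumption, not a rule stated by the skill: `keep_margin`, `should_keep`, and the default `k = 2.0` are hypothetical choices, and higher scores are assumed to be better.

```python
import statistics
from typing import Sequence


def keep_margin(baseline_runs: Sequence[float], k: float = 2.0) -> float:
    """Margin = k sample standard deviations of repeated baseline runs."""
    if len(baseline_runs) < 2:
        raise ValueError("need at least two baseline runs to estimate variance")
    return k * statistics.stdev(baseline_runs)


def should_keep(
    baseline_runs: Sequence[float],
    candidate_runs: Sequence[float],
    margin: float,
) -> bool:
    """Keep only if the candidate mean beats the baseline mean by the margin."""
    return statistics.mean(candidate_runs) > statistics.mean(baseline_runs) + margin


baseline = [0.70, 0.72, 0.68]
margin = keep_margin(baseline)  # fixed before any candidate is measured
print(should_keep(baseline, [0.73, 0.73, 0.73], margin))
print(should_keep(baseline, [0.80, 0.78, 0.79], margin))
```

Fixing the margin first matters because choosing it after seeing candidate scores makes it easy to rationalize a keep that is really noise.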

Commands

atris experiments init <slug>
atris experiments validate
atris experiments benchmark

Good output

  • short program.md
  • honest measure.py
  • deterministic loop.py
  • append-only results.tsv
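
An append-only results.tsv can be as simple as opening the file in append mode and never rewriting earlier rows. A minimal sketch follows; the column names in `FIELDS` and the `append_result` helper are hypothetical, since the source does not specify the file's schema.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical columns; the real results.tsv schema is not given in the source.
FIELDS = ["timestamp", "candidate", "score", "decision"]


def append_result(path: str, candidate: str, score: float, decision: str) -> None:
    """Append one row to the log; existing rows are never modified."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        if new_file:
            writer.writerow(FIELDS)  # write the header once, on first use
        writer.writerow(
            [datetime.now(timezone.utc).isoformat(), candidate, f"{score:.4f}", decision]
        )
```

Calling `append_result("results.tsv", "shorter-prompt", 0.75, "keep")` after each measured candidate, kept or reverted, leaves a complete history that can be audited later.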

Bad output

  • "felt better"
  • changed three things at once
  • kept a patch without a measured win
  • no reset/revert path