AI Scientist — Automated Research Paper Generation
Fully automated AI system that conducts research, runs experiments, and writes complete scientific papers. Generates novel ideas, implements them, and produces LaTeX manuscripts. 12,000+ stars.
What it is
AI Scientist is a fully automated research system by Sakana AI that takes a research template and compute budget, then autonomously generates novel ideas, designs experiments, runs code, analyzes results, and produces complete scientific papers in LaTeX. The output includes literature review, methodology, results, and discussion sections.
This tool targets researchers exploring AI-assisted scientific discovery, labs looking to accelerate hypothesis exploration, and anyone studying the frontier of automated research. It works with Claude, GPT-4, and Gemini as the underlying LLM.
How it saves time or tokens
The traditional research cycle from idea to manuscript takes weeks or months of manual work. AI Scientist compresses the entire pipeline into a single command. It generates multiple candidate ideas, implements each as code, runs experiments, and writes up findings -- all without human intervention between steps.
The token estimate for this workflow is approximately 2,800 tokens per run. The system can also batch multiple ideas in a single launch, which amortizes setup overhead across them.
How to use
- Clone and install dependencies:
git clone https://github.com/SakanaAI/AI-Scientist.git
cd AI-Scientist
pip install -r requirements.txt
- Run the full pipeline with your chosen model and experiment template:
python launch_scientist.py \
--model claude-sonnet-4-20250514 \
--experiment nanoGPT \
--num-ideas 5
- Collect the output LaTeX manuscripts from the results directory.
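The launch step above can also be scripted. The following is a minimal Python sketch, not part of the AI Scientist repository: `build_launch_command` and `run_pipeline` are hypothetical helper names, and the only flags used are the three shown in the command above. It assumes the script is run from inside the cloned AI-Scientist directory.

```python
import subprocess
import sys


def build_launch_command(model, experiment, num_ideas):
    """Build the argv list for launch_scientist.py (hypothetical helper)."""
    if num_ideas < 1:
        raise ValueError("num_ideas must be at least 1")
    return [
        sys.executable,          # the Python interpreter running this script
        "launch_scientist.py",   # entry point named in the steps above
        "--model", model,
        "--experiment", experiment,
        "--num-ideas", str(num_ideas),
    ]


def run_pipeline(model, experiment, num_ideas):
    """Run the pipeline; raises CalledProcessError on a non-zero exit code."""
    command = build_launch_command(model, experiment, num_ideas)
    subprocess.run(command, check=True)
```

Calling `run_pipeline("claude-sonnet-4-20250514", "nanoGPT", 5)` would issue the same command as the shell example. Passing the arguments as a list avoids shell quoting issues, and `check=True` surfaces a failed run as an exception instead of a silent exit.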
Example
The system generates structured research ideas before implementing them:
Input: 'Improve training efficiency of small language models'
Generated Ideas:
1. Adaptive learning rate scheduling based on gradient noise
2. Curriculum learning with dynamic difficulty assessment
3. Sparse attention patterns for resource-constrained training
For each idea, AI Scientist writes experiment code, runs it, and produces a full paper:
# Check generated papers
ls results/nanoGPT/
# idea_1_paper.pdf idea_2_paper.pdf idea_3_paper.pdf
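The same check can be done from Python. This is a small sketch that assumes the layout shown in the listing above (PDF files under results/nanoGPT/); the actual directory structure may differ between versions, and `find_papers` is a hypothetical helper rather than anything shipped with AI Scientist.

```python
from pathlib import Path


def find_papers(results_dir):
    """Return the PDF files found anywhere under results_dir, sorted by path."""
    root = Path(results_dir)
    if not root.is_dir():
        # Nothing has been generated yet, or the path is wrong.
        return []
    # rglob also picks up PDFs nested in per-idea subdirectories.
    return sorted(root.rglob("*.pdf"))


for paper in find_papers("results/nanoGPT"):
    print(paper)
```

Searching recursively keeps the helper usable even if papers are written into per-idea subfolders instead of directly into the results directory.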
Related on TokRepo
- AI Tools for Research -- Research automation and discovery tools
- Prompt Library -- Reusable prompts for various AI workflows
Common pitfalls
- Generated papers should be treated as drafts requiring human review. The system may produce plausible-sounding but incorrect analysis.
- Experiment templates constrain the research scope. Without a well-designed template, the system may explore unproductive directions.
- API costs can accumulate quickly when generating multiple ideas with large models. Set --num-ideas conservatively for initial runs.
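The cost pitfall can be checked with a back-of-the-envelope estimate before launching. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, and the token count and price in the usage line are placeholder values, not measured figures for AI Scientist or any provider's real pricing. Substitute numbers from your provider's price list and from a small trial run.

```python
def estimate_cost_usd(num_ideas, tokens_per_idea, usd_per_million_tokens):
    """Rough API cost: ideas x tokens per idea x price per token."""
    if num_ideas < 0 or tokens_per_idea < 0 or usd_per_million_tokens < 0:
        raise ValueError("inputs must be non-negative")
    total_tokens = num_ideas * tokens_per_idea
    return total_tokens * usd_per_million_tokens / 1_000_000


# Placeholder inputs: 5 ideas, 100,000 tokens each, 6 USD per million tokens.
print(f"{estimate_cost_usd(5, 100_000, 6.0):.2f} USD")
```

The estimate uses a single blended price and ignores the difference between input and output token rates, so treat the result as an order-of-magnitude figure. Because cost scales linearly with the number of ideas, halving --num-ideas roughly halves the bill.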
Frequently Asked Questions
Which LLMs does AI Scientist support?
AI Scientist works with Claude (Anthropic), GPT-4 (OpenAI), and Gemini (Google). You specify the model via the --model flag when launching the pipeline. Output quality and style vary by model.
Are the generated papers ready for publication?
No. The papers are structured like academic manuscripts with proper sections, but they require human review for correctness, novelty claims, and scientific rigor. They are best used as research drafts or starting points for further investigation.
How much does a run cost?
Cost depends on the model and the number of ideas. The workflow's token estimate is approximately 2,800 tokens per run. With 5 ideas and a large model like GPT-4, expect costs in the range of a few dollars per batch.
Can I target my own research area?
Yes. AI Scientist uses experiment templates that define the research domain and code structure. You can create custom templates, following existing examples such as nanoGPT, to target your specific research area.
Does AI Scientist run real experiments, or only generate text?
It runs actual code. The system generates Python experiment scripts, executes them, collects metrics and plots, then writes the paper based on real results. This is not just text generation -- it includes code execution and data analysis.
Source & Thanks
Created by Sakana AI. Licensed under Apache 2.0.
AI-Scientist — ⭐ 12,000+
Thanks to Sakana AI for pushing the boundary of automated scientific discovery.