Claude Code Agent: Data Scientist — Analysis & Visualization
Claude Code agent for data science. Exploratory analysis, statistical modeling, visualization, feature engineering, and Jupyter notebooks.
Review-first install path
This asset needs a review step. The copied prompt tells the agent to dry-run, show the writes, then proceed only after confirmation.
npx -y tokrepo@latest install 381bb2f4-38e2-4236-b7b8-6ef3561cac93 --target codex
Dry-run first, confirm the writes, then run this command.
What it is
The Data Scientist agent is a Claude Code agent template focused on data science workflows. It activates automatically when you work on exploratory analysis, statistical modeling, data visualization, feature engineering, or Jupyter notebook tasks. The agent understands pandas, matplotlib, seaborn, scikit-learn, and common data science patterns.
This agent is for data scientists and analysts who use Claude Code and want domain-specific assistance. Instead of explaining your data science context every time, the agent comes pre-configured with knowledge of best practices for analysis workflows.
How it saves time or tokens
The agent provides data science context out of the box, eliminating the prompt engineering needed to make a general assistant produce clean analysis code. It knows to suggest appropriate chart types, handle missing data correctly, and structure notebooks with clear documentation. One command installs the agent.
How to use
- Install the Data Scientist agent template:
  npx claude-code-templates@latest --agent data-ai/data-scientist --yes
- The agent activates automatically when data science tasks are detected.
- Ask it to handle analysis tasks:
  'Explore this dataset and show distributions for all numeric columns'
  'Build a correlation heatmap and identify the top features'
  'Create a classification model and show the confusion matrix'
Example
# The agent generates structured analysis code like this:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('customers.csv')

# Summary statistics
print(df.describe())
print(f'Missing values:\n{df.isnull().sum()}')

# Distribution plots for the first six numeric columns
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
for i, col in enumerate(df.select_dtypes('number').columns[:6]):
    ax = axes[i // 3][i % 3]
    df[col].hist(ax=ax, bins=30)
    ax.set_title(col)
plt.tight_layout()
plt.savefig('distributions.png', dpi=150)

# Correlation matrix, drawn on a new figure so it does not overwrite a histogram panel
plt.figure(figsize=(10, 8))
corr = df.select_dtypes('number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)
plt.savefig('correlation.png', dpi=150)
Related on TokRepo
- AI tools for research -- Research and analysis tools
- AI coding tools -- Developer productivity agents
Common pitfalls
- The agent generates code but does not execute it. Review the generated analysis code before running it on sensitive datasets.
- Large datasets may require chunked processing. The agent suggests appropriate techniques when it detects memory-intensive operations.
- The agent template requires Claude Code to be installed. It does not work with other AI coding tools.
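The chunked processing mentioned above can be sketched as follows. This is a minimal illustration, not the agent's actual output: the `chunked_mean` helper, the column names, and the in-memory CSV are all hypothetical stand-ins for a large file on disk. It relies only on the `chunksize` parameter of `pandas.read_csv`, which returns an iterator of DataFrames instead of one large frame.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a file too large to load at once.
csv_text = "customer_id,amount\n" + "\n".join(
    f"{i},{i * 1.5}" for i in range(1, 11)
)


def chunked_mean(source, column, chunksize):
    """Compute the mean of one column without holding the whole file in memory."""
    total = 0.0
    count = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        values = chunk[column].dropna()
        total += values.sum()
        count += len(values)
    return total / count if count else float("nan")


mean_amount = chunked_mean(io.StringIO(csv_text), "amount", chunksize=4)
print(mean_amount)
```

Running totals work for sums, counts, and means; statistics such as medians need a different approach (for example sampling or an approximate algorithm), so treat this pattern as a starting point.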
Frequently Asked Questions
The agent understands pandas, numpy, matplotlib, seaborn, plotly, scikit-learn, scipy, statsmodels, and Jupyter. It generates code using whichever libraries are in your project or recommends the most appropriate ones.
Yes. The agent generates well-structured notebook cells with markdown documentation, code cells, and visualization outputs. It understands notebook conventions like putting imports at the top and separating analysis steps.
Yes. When analyzing a dataset, the agent checks for missing values, duplicates, data types, and outliers. It suggests appropriate cleaning strategies based on the data characteristics and your analysis goals.
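As a rough sketch of what such a data quality check could look like in pandas (the `quality_report` function, the sample data, and the choice of the 1.5 × IQR rule for outliers are assumptions made for illustration, not the agent's documented behavior):

```python
import pandas as pd


def quality_report(df):
    """Summarize missing values, duplicate rows, dtypes, and IQR outliers."""
    numeric = df.select_dtypes("number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Flag values more than 1.5 * IQR outside the quartiles (Tukey's rule).
    outlier_mask = (numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)
    return {
        "missing": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "dtypes": df.dtypes.astype(str).to_dict(),
        "outliers": outlier_mask.sum().to_dict(),
    }


# Hypothetical sample with one missing value per column, one duplicated row,
# and one implausible age.
df = pd.DataFrame({
    "age": [25, 31, 31, 29, 27, 120, None],
    "city": ["Oslo", "Lima", "Lima", "Oslo", None, "Rome", "Rome"],
})
report = quality_report(df)
print(report)
```

Whether a flagged value should be dropped, capped, or kept depends on the analysis goal, so the report only counts candidates and leaves the data unchanged.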
Yes. The installed template consists of skill files in your project that you can edit. Add domain-specific instructions, preferred chart styles, or custom analysis templates to tailor the agent to your workflow.
Yes. The agent generates scikit-learn pipelines with preprocessing, model training, cross-validation, and evaluation metrics. It suggests appropriate models based on your data type (classification, regression, clustering).
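A pipeline of the kind described here might look like the sketch below. The synthetic data, column names, and the choice of logistic regression are assumptions made so the example runs on its own; the code the agent produces for a real project will differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data (hypothetical columns) so the example is self-contained.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "income": rng.normal(50, 10, n),
    "visits": rng.integers(0, 20, n).astype(float),
    "segment": rng.choice(["a", "b", "c"], n),
})
X.loc[::25, "income"] = np.nan  # inject a few missing values
y = (X["visits"] > 9).astype(int)  # target depends on visits only

numeric_cols = ["income", "visits"]
categorical_cols = ["segment"]

# Preprocessing: impute and scale numeric columns, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# 5-fold cross-validation; preprocessing is refit inside each fold,
# which avoids leaking validation data into the imputer and scaler.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```

Keeping the preprocessing inside the `Pipeline` is the main design choice: the same object can be cross-validated, fitted on all data, and reused for prediction without repeating the cleaning steps by hand.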
Source & Thanks
Created by Claude Code Templates by davila7. Licensed under MIT. Install:
npx claude-code-templates@latest --agent data-ai/data-scientist --yes