Skills · Apr 1, 2026 · 1 min read

TaskWeaver — Code-First Data Analytics Agent

TaskWeaver is a code-first agent framework from Microsoft for data analytics tasks, with 6.1K+ GitHub stars. It provides planning, stateful code execution, DataFrame handling, and plugins, and is released under the MIT license.

TL;DR
TaskWeaver plans data analytics tasks, generates code, and manages DataFrames with plugins.
§01

What it is

TaskWeaver is a code-first agent framework by Microsoft designed for data analytics tasks. It takes natural language requests, decomposes them into a plan, generates Python code for each step, executes the code in a stateful session, and returns results with DataFrames and visualizations.

TaskWeaver targets data analysts and engineers who want an AI agent that writes and runs analytics code rather than just answering questions in text.

§02

How it saves time or tokens

TaskWeaver generates and executes code in a persistent session. Variables, DataFrames, and computation results persist across steps. This means the agent builds on previous work instead of starting fresh each time, reducing redundant computation and token usage.
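
The idea of a persistent session can be illustrated with a toy sketch. This is not TaskWeaver's implementation; the `StatefulSession` class and its `run` method are hypothetical names used only to show how a shared namespace lets a later step reuse the variables of an earlier one.

```python
import contextlib
import io


class StatefulSession:
    """Toy session (hypothetical): every snippet runs in the same namespace."""

    def __init__(self):
        # Shared state that survives from one step to the next.
        self.namespace = {}

    def run(self, code: str) -> str:
        """Execute one step and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


session = StatefulSession()

# Step 1: load some data (made-up rows for illustration).
session.run("rows = [('2024-01', 120.0), ('2024-01', 80.0), ('2024-02', 50.0)]")

# Step 2: aggregate, reusing `rows` from step 1 without reloading it.
session.run(
    "totals = {}\n"
    "for month, amount in rows:\n"
    "    totals[month] = totals.get(month, 0.0) + amount\n"
)

# Step 3: report, reusing `totals` from step 2.
output = session.run("print(sorted(totals.items()))")
```

Because step 2 reads `rows` straight from the session, neither the data nor the code that produced it has to be sent or computed again.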

The plugin system lets you extend TaskWeaver with domain-specific functions (database connectors, API calls, custom transformations) that the agent can invoke by name.
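
Invoking a function by name can be sketched with a small registry. The snippet below is a simplified illustration under stated assumptions, not TaskWeaver's plugin loader: `PLUGINS`, `register`, and `invoke` are hypothetical names, and the temperature converter stands in for a real domain-specific function.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a plugin name to its function.
PLUGINS: Dict[str, Callable] = {}


def register(name: str):
    """Decorator that stores the function under the given name."""
    def decorator(func: Callable) -> Callable:
        PLUGINS[name] = func
        return func
    return decorator


@register("celsius_to_fahrenheit")
def celsius_to_fahrenheit(celsius: float) -> float:
    """Stand-in for a domain-specific transformation."""
    return celsius * 9 / 5 + 32


def invoke(name: str, **kwargs):
    """Look a plugin up by name and call it with keyword arguments."""
    if name not in PLUGINS:
        raise KeyError(f"unknown plugin: {name}")
    return PLUGINS[name](**kwargs)


result = invoke("celsius_to_fahrenheit", celsius=100.0)
```

The agent only needs the plugin's name and argument names, which is why clear names and type hints matter when writing plugins.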

§03

How to use

  1. Clone the TaskWeaver repository
  2. Configure your LLM provider in the config file
  3. Add plugins for your data sources (SQL connectors, file readers, APIs)
  4. Run TaskWeaver and describe your analytics task in natural language
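
Step 2 might look like the following sketch, which writes a JSON config file from Python. The file name `taskweaver_config.json` and the `llm.*` key names are assumptions based on the project's documentation as recalled here; confirm them against the repository before relying on them. The API key is a placeholder, and a temporary directory stands in for the real project directory.

```python
import json
import tempfile
from pathlib import Path

# Assumed key names; check the TaskWeaver documentation for the current set.
config = {
    "llm.api_base": "https://api.openai.com/v1",
    "llm.api_key": "YOUR_API_KEY",  # placeholder, never commit a real key
    "llm.model": "gpt-4",
}

# A temporary directory stands in for your TaskWeaver project directory.
project_dir = Path(tempfile.mkdtemp())
config_path = project_dir / "taskweaver_config.json"
config_path.write_text(json.dumps(config, indent=2))

# Read the file back to confirm it is valid JSON.
loaded = json.loads(config_path.read_text())
```

Keeping the config in a plain JSON file makes it easy to switch providers or models without touching plugin code.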
§04

Example

# TaskWeaver plugin: sql_query.py
import pandas as pd
import sqlalchemy
from taskweaver.plugin import Plugin, register_plugin


@register_plugin
class SQLQuery(Plugin):
    def __call__(self, query: str) -> dict:
        '''Execute a SQL query and return the rows as a DataFrame plus a row count.'''
        # db_url is read from this plugin's configuration
        engine = sqlalchemy.create_engine(self.config['db_url'])
        # Load the full result set into a DataFrame
        df = pd.read_sql(query, engine)
        return {'data': df, 'row_count': len(df)}

# User interaction:
# > Load sales data from the database and show monthly revenue trends
# TaskWeaver: Planning...
# Step 1: Query sales table using sql_query plugin
# Step 2: Group by month and calculate revenue
# Step 3: Generate a line chart
§05

Common pitfalls

  • TaskWeaver executes arbitrary Python code; run it in a sandboxed environment, never on production machines with sensitive data access
  • The planner sometimes generates overly complex plans for simple tasks; provide clear, specific instructions
  • A plugin's declared parameters and return values must match the signature of its __call__ method exactly; start from the plugin template rather than writing one from scratch

Frequently Asked Questions

How does TaskWeaver differ from ChatGPT Code Interpreter?

ChatGPT Code Interpreter runs in a sandboxed cloud environment. TaskWeaver runs locally with access to your actual databases, files, and APIs through plugins. This makes it suitable for enterprise data analytics where data cannot leave your infrastructure.

What LLMs does TaskWeaver support?

TaskWeaver supports OpenAI GPT-4, Azure OpenAI, and other providers compatible with the OpenAI API format. GPT-4 class models are recommended for reliable code generation.

Can TaskWeaver handle multi-step analytics?

Yes. TaskWeaver decomposes complex requests into sequential steps with a planner. Each step generates code, executes it, and passes results to the next step. The stateful session preserves DataFrames across steps.

How do plugins work?

Plugins are Python classes decorated with @register_plugin. Each class defines a __call__ method with type hints. TaskWeaver discovers plugins automatically and lets the agent invoke them by name during task execution.

Is TaskWeaver open source?

Yes. TaskWeaver is released under the MIT license by Microsoft. The source code, example plugins, and documentation are available on GitHub.


Source & Thanks

Created by Microsoft and released under the MIT license. Repository: microsoft/TaskWeaver (6,100+ GitHub stars).
