Apify Actor SDK — Headless Web Automation at Cloud Scale

Introduction

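The Apify SDK packages a Crawlee or Playwright script as an Apify Actor: a containerized program that runs on the Apify cloud with proxy rotation, retries, dataset persistence, a request queue, and scheduled runs built in. You write the scraping logic; the SDK handles the infrastructure. It suits production scrapers and browser-automation agents that need reliability and observability without a hand-built queue. Requires Node 20+ or Python 3.10+. Setup takes about 5 minutes (npx apify-cli create).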

Create an Actor

npx apify-cli create my-scraper
cd my-scraper
# Choose the "Crawlee + PlaywrightCrawler" template when the create command prompts for one

Write the scraping logic

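// src/main.ts
import { Actor, log } from "apify";
import { PlaywrightCrawler } from "crawlee";

// Initialize the Actor runtime before touching input or storage.
await Actor.init();

// Read the run input. The non-null assertion assumes input is always provided.
const { startUrls, maxRequests = 100 } = (await Actor.getInput<{
  startUrls: string[];
  maxRequests?: number;
}>())!;

const crawler = new PlaywrightCrawler({
  // Stop after this many requests, even if more are still queued.
  maxRequestsPerCrawl: maxRequests,

  async requestHandler({ page, request, enqueueLinks }) {
    log.info(`Crawling ${request.url}`);

    const title = await page.title();
    // Take the first <article>; a bare locator throws in strict mode when several match.
    const content = await page.locator("article").first().textContent();

    // Append one record to the run's default dataset.
    await Actor.pushData({
      url: request.url,
      title,
      content: content?.slice(0, 5000), // cap stored text at 5,000 characters
    });

    // Follow only links whose URL starts with the current page's URL.
    await enqueueLinks({ globs: [`${request.url}**`] });
  },
});

await crawler.run(startUrls);
await Actor.exit();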
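
The script above trusts its input through a non-null assertion, so a run started with no input fails on destructuring with an unhelpful error. A minimal sketch of validating the input first is shown here. `parseInput` and `ScraperInput` are hypothetical names for illustration, not part of the Apify SDK, and the default of 100 mirrors the script above.

```typescript
// Hypothetical helper (not an Apify SDK API): validates raw Actor input
// and applies the same default for maxRequests as the script above.
interface ScraperInput {
  startUrls: string[];
  maxRequests: number;
}

function parseInput(raw: unknown): ScraperInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Actor input is missing or is not an object");
  }
  const { startUrls, maxRequests = 100 } = raw as Record<string, unknown>;

  // startUrls must be a non-empty array containing only strings.
  if (
    !Array.isArray(startUrls) ||
    startUrls.length === 0 ||
    !startUrls.every((u) => typeof u === "string")
  ) {
    throw new Error("startUrls must be a non-empty array of strings");
  }
  // maxRequests must be a positive integer.
  if (
    typeof maxRequests !== "number" ||
    !Number.isInteger(maxRequests) ||
    maxRequests < 1
  ) {
    throw new Error("maxRequests must be a positive integer");
  }
  return { startUrls: startUrls as string[], maxRequests };
}
```

In the Actor this could be called as `parseInput(await Actor.getInput())`, so a bad run fails early with a readable message instead of partway through the crawl.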

Running locally vs. in the cloud

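# Run locally
apify run

# Push to the Apify cloud
apify push

# Start a run through the API (the JSON body becomes the Actor input)
curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "startUrls": ["https://example.com/blog"], "maxRequests": 50 }'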
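
The same run can be started from code. The sketch below only builds the request, so it can be inspected without touching the network. `buildRunRequest` and `RunRequest` are hypothetical names, and the `Content-Type: application/json` header reflects the assumption that the API treats the POST body as the Actor's JSON input.

```typescript
// Hypothetical helper: builds the URL and fetch options for starting an
// Actor run via the endpoint used in the curl example above.
interface RunRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildRunRequest(actorId: string, token: string, input: unknown): RunRequest {
  const url = new URL(
    `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
  );
  url.searchParams.set("token", token);
  return {
    url: url.toString(),
    init: {
      method: "POST",
      // Assumption: the body is only read as JSON input when labeled as JSON.
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(input),
    },
  };
}

// Usage (makes a real network call, so it is left commented out):
// const { url, init } = buildRunRequest(actorId, token, {
//   startUrls: ["https://example.com/blog"],
//   maxRequests: 50,
// });
// const response = await fetch(url, init);
```

Keeping request construction separate from the `fetch` call makes it straightforward to log or unit-test what will be sent before spending platform credit on a run.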

Working with the dataset

# Fetch the results as JSON / CSV / XLSX (change the format parameter)
curl "https://api.apify.com/v2/acts/<ACTOR_ID>/runs/last/dataset/items?format=json&token=$APIFY_TOKEN"
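
Switching between the three export formats only changes the `format` query parameter. A small sketch of building that URL follows. `datasetItemsUrl` and `DatasetFormat` are hypothetical names, and the optional `token` parameter assumes the Actor is private and needs authentication.

```typescript
// Hypothetical helper: builds the "last run" dataset URL from the curl
// example above for one of the formats named in this section.
type DatasetFormat = "json" | "csv" | "xlsx";

function datasetItemsUrl(
  actorId: string,
  format: DatasetFormat,
  token?: string,
): string {
  const url = new URL(
    `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs/last/dataset/items`,
  );
  url.searchParams.set("format", format);
  // Assumption: a token is only needed when the Actor or its data is private.
  if (token) url.searchParams.set("token", token);
  return url.toString();
}

// Usage (makes a real network call, so it is left commented out):
// const items = await (await fetch(datasetItemsUrl(actorId, "json", token))).json();
```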

Apify Actors paginate automatically, retry failed pages, rotate proxies, and keep crawl state across runs. The 4,000+ Actors in the Apify Store are built on the same SDK.

FAQ

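Q: Is Apify free? A: Crawlee, the underlying library, is free and open source under Apache-2.0. The Apify cloud has a free tier ($5 of platform credit per month) and paid tiers for production use. Self-hosting Crawlee on your own infrastructure is completely free.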

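Q: What is the difference between Crawlee and the Apify SDK? A: Crawlee is a standalone scraping library (Apache-2.0). The Apify SDK wraps cloud features around Crawlee (Actor.init / getInput / pushData / proxy configuration). If you only run locally, use Crawlee directly.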

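Q: Can I publish my Actor to the Apify Store? A: Yes. Actors can be published at apify.com/store, either free or pay-per-usage. Apify takes a cut of paid usage and handles billing. Many Actors in the Store are called by AI agents through the API.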
