Quick Use
- npx apify-cli create my-scraper (pick a template)
- Edit src/main.ts to define your crawl logic
- apify run locally; apify push to deploy to Apify Cloud
Intro
The Apify SDK packages a Crawlee or Playwright script into an Apify Actor: a containerized program that runs on Apify's cloud with built-in proxy rotation, retries, dataset persistence, a request queue, and scheduling. You write the scraping logic; the SDK handles the infrastructure.
Best for: production scrapers and browser-automation agents that need reliability and observability without rolling your own queue.
Works with: Node 20+, Python 3.10+.
Setup time: 5 minutes (npx apify-cli create).
Scaffold an Actor
npx apify-cli create my-scraper
cd my-scraper
# Pick the "Crawlee + PlaywrightCrawler" template
Write the scraping logic
// src/main.ts
import { Actor, log } from "apify";
import { PlaywrightCrawler } from "crawlee";

// Initialize the Actor before reading input or writing to storage.
await Actor.init();

// Read the run input; maxRequests defaults to 100 when omitted.
const { startUrls, maxRequests = 100 } = (await Actor.getInput<{
  startUrls: string[];
  maxRequests?: number;
}>())!;

const crawler = new PlaywrightCrawler({
  maxRequestsPerCrawl: maxRequests,
  async requestHandler({ page, request, enqueueLinks }) {
    log.info(`Crawling ${request.url}`);
    const title = await page.title();
    // .first() avoids a strict-mode error on pages with several <article> elements.
    const content = await page.locator("article").first().textContent();
    // Store one record per page in the run's default dataset.
    await Actor.pushData({
      url: request.url,
      title,
      content: content?.slice(0, 5000),
    });
    // Follow only links whose URL starts with the current page's URL.
    await enqueueLinks({ globs: [`${request.url}**`] });
  },
});

await crawler.run(startUrls);
await Actor.exit();
Run locally vs in cloud
# Locally
apify run
# Push to Apify cloud
apify push
# Run via API
curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "startUrls": ["https://example.com/blog"], "maxRequests": 50 }'
Use the dataset
# Fetch results as JSON / CSV / XLSX
curl "https://api.apify.com/v2/acts/<ACTOR_ID>/runs/last/dataset/items?format=json&token=$APIFY_TOKEN"
Apify Actors auto-paginate, retry failed pages, rotate proxies, and persist crawl state across runs. The 4,000+ Apify Store Actors are built on this same SDK.
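The two curl calls above can also be issued from TypeScript. The following is a minimal sketch, assuming a runtime with a global fetch (Node 18+). The helper names (buildRunUrl, buildDatasetItemsUrl, startRun, fetchLastRunItems) are hypothetical and not part of the Apify SDK; only the endpoint paths and query parameters come from the commands shown in this section.

```typescript
// Sketch only: helper names are illustrative, not Apify SDK APIs.
const API_BASE = "https://api.apify.com/v2";

interface ScraperInput {
  startUrls: string[];
  maxRequests?: number;
}

type ItemFormat = "json" | "csv" | "xlsx";

// URL for starting a run, mirroring the "Run via API" curl command.
function buildRunUrl(actorId: string, token: string): string {
  const url = new URL(`${API_BASE}/acts/${encodeURIComponent(actorId)}/runs`);
  url.searchParams.set("token", token);
  return url.toString();
}

// URL for reading the last run's dataset, mirroring the dataset curl command.
function buildDatasetItemsUrl(
  actorId: string,
  token: string,
  format: ItemFormat = "json",
): string {
  const url = new URL(
    `${API_BASE}/acts/${encodeURIComponent(actorId)}/runs/last/dataset/items`,
  );
  url.searchParams.set("format", format);
  url.searchParams.set("token", token);
  return url.toString();
}

// POST the Actor input as JSON; resolves with the API's response body.
async function startRun(
  actorId: string,
  token: string,
  input: ScraperInput,
): Promise<unknown> {
  const res = await fetch(buildRunUrl(actorId, token), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(input),
  });
  if (!res.ok) throw new Error(`Run request failed: HTTP ${res.status}`);
  return res.json();
}

// GET the items of the last run's dataset as parsed JSON.
async function fetchLastRunItems(
  actorId: string,
  token: string,
): Promise<unknown[]> {
  const res = await fetch(buildDatasetItemsUrl(actorId, token, "json"));
  if (!res.ok) throw new Error(`Dataset request failed: HTTP ${res.status}`);
  return (await res.json()) as unknown[];
}

// Usage (performs real network calls, so it is left commented out):
// await startRun("<ACTOR_ID>", process.env.APIFY_TOKEN ?? "", {
//   startUrls: ["https://example.com/blog"],
//   maxRequests: 50,
// });
// const items = await fetchLastRunItems("<ACTOR_ID>", process.env.APIFY_TOKEN ?? "");
```

Starting a run returns before the crawl has finished, so a caller would typically wait for the run to complete before reading the dataset; that polling step is omitted here.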
FAQ
Q: Is Apify free? A: Partly. Crawlee (the underlying library) is Apache-2.0 open source, and self-hosting it on your own infra is fully free. The Apify cloud has a free tier ($5/mo platform credit) and paid plans for production.
Q: Crawlee vs Apify SDK? A: Crawlee is the standalone scraping library (Apache-2.0). The Apify SDK wraps Crawlee with cloud features (Actor.init, getInput, pushData, proxy config). For local-only scrapers, just use Crawlee.
Q: Can I publish my Actor to the Apify Store? A: Yes — actors can be public (free or paid usage-based) on apify.com/store. Apify takes a cut of paid usage and handles billing. Many Apify Store Actors are run by AI agents via the API.
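One practical note on the getInput call mentioned above: the scraper in "Write the scraping logic" reads its input with a non-null assertion, so a run started without input fails with an unhelpful destructuring error. The sketch below shows one way to validate the input first. parseInput and its error messages are hypothetical, not part of the Apify SDK; the field names and the default of 100 follow the example code.

```typescript
// Sketch only: parseInput is an illustrative helper, not an Apify SDK function.
interface ScraperInput {
  startUrls: string[];
  maxRequests: number;
}

function parseInput(raw: unknown): ScraperInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Actor input is missing; expected an object with startUrls");
  }
  // Same default as the example scraper: 100 requests when maxRequests is omitted.
  const { startUrls, maxRequests = 100 } = raw as Record<string, unknown>;

  if (!Array.isArray(startUrls) || startUrls.length === 0) {
    throw new Error("startUrls must be a non-empty array of URLs");
  }
  for (const candidate of startUrls) {
    if (typeof candidate !== "string") {
      throw new Error("every startUrls entry must be a string");
    }
    try {
      new URL(candidate); // throws on strings that are not absolute URLs
    } catch {
      throw new Error(`invalid start URL: ${candidate}`);
    }
  }
  if (
    typeof maxRequests !== "number" ||
    !Number.isInteger(maxRequests) ||
    maxRequests < 1
  ) {
    throw new Error("maxRequests must be a positive integer");
  }
  return { startUrls: startUrls as string[], maxRequests };
}
```

In the Actor this would replace the asserted destructuring, for example `const { startUrls, maxRequests } = parseInput(await Actor.getInput());`, so a bad input fails early with a readable message instead of partway through the crawl.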
Source & Thanks
Built by Apify. Licensed under Apache-2.0 (Crawlee).
apify/apify-sdk-js — ⭐ Active