# Maxun: Self-Hosted No-Code Web Scraping Platform

> An open-source no-code platform for web scraping, crawling, and AI data extraction that turns websites into structured APIs.

## Quick Use

```bash
git clone https://github.com/getmaxun/maxun.git
cd maxun
cp .env.example .env
docker compose up -d
# Open http://localhost:3000
```

## Introduction

Maxun is an open-source no-code web scraping platform that lets users visually extract data from websites without writing code. It uses Playwright for browser automation and provides a point-and-click interface to define extraction rules, making web scraping accessible to non-developers while remaining self-hostable for full data control.

## What Maxun Does

- Enables visual point-and-click data extraction from any website without coding
- Automates pagination, scrolling, and multi-page crawling with built-in logic
- Exports scraped data as JSON, CSV, or directly into databases via API
- Schedules recurring scraping jobs with cron-based automation
- Provides anti-detection features including proxy rotation and browser fingerprint management

## Architecture Overview

Maxun is built on a Node.js backend with a React frontend. It uses Playwright as the browser automation engine to render pages and execute extraction workflows. A PostgreSQL database stores workflow definitions and scraped results. The platform runs headless Chromium instances in Docker containers, with a WebSocket-based real-time preview that shows the browser as users define their extraction rules.

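The Quick Use commands assume that `git` and `docker` are already installed. As a minimal sketch (the `require_tools` helper is hypothetical and not part of Maxun), a pre-flight check along these lines could confirm that before cloning:

```bash
#!/usr/bin/env bash
# Hypothetical helper: report any required command-line tools missing from PATH.
# Prints each missing tool to stderr and returns non-zero if any is missing.
require_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    # `command -v` succeeds only when the shell can resolve the name.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tools git docker && docker compose version` would stop early if either tool is missing and otherwise check that the Compose plugin responds.
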
## Self-Hosting & Configuration

- Deploy with Docker Compose using the provided configuration, which includes Postgres and Redis services
- Set environment variables in `.env` for database credentials, proxy settings, and API keys
- Configure proxy rotation by adding proxy URLs to the designated environment variable
- Adjust concurrency settings to control how many scraping sessions run in parallel
- Expose the web UI on your preferred port and secure it with a reverse proxy for production use

## Key Features

- Visual no-code workflow builder with live browser preview
- Built-in pagination and infinite scroll handling
- Scheduled and recurring scraping with cron expressions
- Proxy support with rotation to reduce blocking
- REST API for triggering runs programmatically and retrieving data

## Comparison with Similar Tools

- **Scrapy**: Python framework that requires code; Maxun offers a visual no-code interface
- **Crawlee**: Developer-focused Node.js library; Maxun takes a point-and-click approach
- **Apify**: Cloud SaaS platform; Maxun is fully self-hosted with no per-page costs
- **Browse AI**: Closed-source cloud tool; Maxun gives you full control of your data
- **Firecrawl**: API-first crawling for LLMs; Maxun focuses on structured data extraction with visual workflows

## FAQ

**Q: Does Maxun handle JavaScript-rendered pages?**

A: Yes. Maxun uses Playwright with full Chromium rendering, so it handles SPAs and dynamic content.

**Q: Can I run Maxun on low-resource servers?**

A: Each scraping session uses a headless browser instance. For production, at least 2 GB RAM per concurrent session is recommended.

**Q: How do I avoid getting blocked?**

A: Maxun supports proxy rotation, request delays, and user-agent randomization to reduce detection risk.

**Q: Is there an API to trigger scrapes programmatically?**

A: Yes, all workflows can be triggered and results retrieved via the REST API.

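The FAQ's guideline of at least 2 GB RAM per concurrent session can be turned into a rough capacity estimate. The sketch below is only arithmetic on that figure; the `max_sessions` helper and its optional reserved-memory argument (headroom for the OS, Postgres, and Redis) are assumptions for illustration, not Maxun settings:

```bash
#!/usr/bin/env bash
# Rough estimate of how many concurrent scraping sessions fit in a given RAM size.
# GB_PER_SESSION is the FAQ guideline; reserved_gb is a hypothetical allowance
# for the OS and supporting services, not a documented Maxun figure.
GB_PER_SESSION=2

max_sessions() {
  local total_gb="$1"
  local reserved_gb="${2:-0}"
  local usable=$(( total_gb - reserved_gb ))
  # Clamp at zero so an over-large reservation never yields a negative count.
  if [ "$usable" -lt 0 ]; then
    usable=0
  fi
  # Integer division rounds down: a partial session does not count.
  echo $(( usable / GB_PER_SESSION ))
}
```

Under these assumptions, `max_sessions 16 2` prints `7`: 14 GB usable divided by 2 GB per session.
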
## Sources

- https://github.com/getmaxun/maxun
- https://maxun.dev

---

Source: https://tokrepo.com/en/workflows/bcbe0dcf-3cd3-11f1-9bc6-00163e2b0d79

Author: AI Open Source