Configs · Apr 20, 2026 · 3 min read

Maxun — Self-Hosted No-Code Web Scraping Platform

An open-source no-code platform for web scraping, crawling, and AI data extraction that turns websites into structured APIs.

Introduction

Maxun is an open-source no-code web scraping platform that lets users visually extract data from websites without writing code. It uses Playwright for browser automation and provides a point-and-click interface to define extraction rules, making web scraping accessible to non-developers while remaining self-hostable for full data control.

What Maxun Does

  • Enables visual point-and-click data extraction from any website without coding
  • Automates pagination, scrolling, and multi-page crawling with built-in logic
  • Exports scraped data as JSON or CSV, or sends it directly to databases via the API
  • Schedules recurring scraping jobs with cron-based automation
  • Provides anti-detection features including proxy rotation and browser fingerprint management
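
JSON and CSV exports carry the same rows in different shapes. As a rough illustration (not Maxun's own exporter), the sketch below turns a JSON array of flat records into CSV using only the Python standard library. The field names and sample rows are made up for the example.

```python
import csv
import io
import json


def json_rows_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text."""
    rows = json.loads(json_text)
    # Collect column names in first-seen order so the header is stable.
    columns: list[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    # restval fills in cells for rows that lack a column.
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical scraped rows; real output depends on the fields you select.
sample = json.dumps([
    {"title": "Widget", "price": "9.99"},
    {"title": "Gadget", "price": "19.50", "in_stock": True},
])
csv_text = json_rows_to_csv(sample)
```

Rows that are missing a column get an empty cell, which keeps every CSV line the same width even when scraped pages differ.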

Architecture Overview

Maxun is built on a Node.js backend with a React frontend. It uses Playwright as the browser automation engine to render pages and execute extraction workflows. A PostgreSQL database stores workflow definitions and scraped results. The platform runs headless Chromium instances in Docker containers, with a WebSocket-based real-time preview that shows the browser as users define their extraction rules.

Self-Hosting & Configuration

  • Deploy with Docker Compose using the provided configuration with Postgres and Redis services
  • Set environment variables in .env for database credentials, proxy settings, and API keys
  • Configure proxy rotation by adding proxy URLs to the designated environment variable
  • Adjust concurrency settings to control how many parallel scraping sessions run
  • Expose the web UI on your preferred port and secure it with a reverse proxy for production use
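
To make the proxy rotation setting more concrete, here is a minimal sketch of round-robin rotation over a comma-separated list read from an environment variable. The variable name PROXY_URLS and the proxy hosts are assumptions for illustration; this article does not name the variable Maxun actually reads, and the code is not taken from Maxun.

```python
import itertools
import os
from typing import Iterator


def load_proxies(env_var: str = "PROXY_URLS") -> list[str]:
    """Read a comma-separated proxy list from the environment.

    PROXY_URLS is a hypothetical name used for illustration only.
    """
    raw = os.environ.get(env_var, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def rotate(proxies: list[str]) -> Iterator[str]:
    """Return an iterator that cycles through the proxies in order."""
    if not proxies:
        raise ValueError("no proxies configured")
    return itertools.cycle(proxies)


# Example values only; replace with your own proxy endpoints.
os.environ["PROXY_URLS"] = (
    "http://proxy-a.example:8080, http://proxy-b.example:8080"
)
pool = rotate(load_proxies())
first_three = [next(pool) for _ in range(3)]
```

Each new scraping session would take the next proxy from the pool, so consecutive sessions leave from different addresses.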

Key Features

  • Visual no-code workflow builder with live browser preview
  • Built-in pagination and infinite scroll handling
  • Scheduled and recurring scraping with cron expressions
  • Proxy support with rotation for anti-blocking
  • REST API for triggering workflows and retrieving data programmatically

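
For readers new to cron expressions, the sketch below checks whether a five-field expression (minute, hour, day of month, month, weekday) matches a given time. It is a simplified illustration, not Maxun's scheduler: it assumes the classic five-field format, requires all five fields to match, and does not support names such as MON or the value 7 for Sunday.

```python
from datetime import datetime

# Allowed (low, high) range per field: minute, hour, day, month, weekday.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def field_matches(field: str, value: int, low: int, high: int) -> bool:
    """Check one cron field: '*', numbers, 'a-b', comma lists, '/n' steps."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # 'n/step' runs from n to the top of the range; plain 'n' is exact.
            end = high if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if every field of the expression matches the moment."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected: minute hour day month weekday")
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    weekday = (moment.weekday() + 1) % 7
    values = [moment.minute, moment.hour, moment.day, moment.month, weekday]
    return all(
        field_matches(field, value, low, high)
        for field, value, (low, high) in zip(fields, values, FIELD_RANGES)
    )


# Every 15 minutes, 09:00 to 17:59, Monday through Friday.
WORK_HOURS = "*/15 9-17 * * 1-5"
runs_now = cron_matches(WORK_HOURS, datetime(2026, 4, 20, 9, 30))
```

A scheduler would evaluate a check like this once per minute and start the job whenever it returns true.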
Comparison with Similar Tools

  • Scrapy — Python framework requiring code; Maxun offers a visual no-code interface
  • Crawlee — Developer-focused Node.js library vs Maxun's point-and-click approach
  • Apify — Cloud SaaS platform; Maxun is fully self-hosted with no per-page costs
  • Browse AI — Closed-source cloud tool; Maxun gives you full control of your data
  • Firecrawl — API-first crawling for LLMs; Maxun focuses on structured data extraction with visual workflows

FAQ

Q: Does Maxun handle JavaScript-rendered pages? A: Yes. Maxun uses Playwright with full Chromium rendering, so it handles SPAs and dynamic content.

Q: Can I run Maxun on low-resource servers? A: That depends on how many sessions run in parallel. Each scraping session uses its own headless browser instance, and for production at least 2 GB of RAM per concurrent session is recommended.
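
The 2 GB guideline translates into a simple sizing calculation. The sketch below estimates how many concurrent sessions a server can hold. The 1 GB reserved for the operating system and supporting services is an assumption added for illustration, not a figure from Maxun.

```python
def max_concurrent_sessions(
    total_ram_gb: float,
    reserved_gb: float = 1.0,      # assumed headroom for OS and services
    per_session_gb: float = 2.0,   # guideline quoted in the answer above
) -> int:
    """Estimate how many browser sessions fit in the available RAM."""
    usable = total_ram_gb - reserved_gb
    if usable <= 0:
        return 0
    return int(usable // per_session_gb)


sessions_on_8gb = max_concurrent_sessions(8)
sessions_on_2gb = max_concurrent_sessions(2)
```

Under these assumptions, a 2 GB server has no room for a full production session once headroom is set aside, which is why small instances are better suited to light or occasional runs.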

Q: How do I avoid getting blocked? A: Maxun supports proxy rotation, request delays, and user-agent randomization to reduce detection risk.
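
As a generic illustration of request delays and user-agent randomization (not Maxun's internal code), the sketch below adds random jitter to a base delay and picks a user agent at random. The user-agent strings, function names, and delay values are examples only.

```python
import random

# Example user-agent strings; keep your own list current.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
]


def jittered_delay(
    base_seconds: float, jitter_seconds: float, rng: random.Random
) -> float:
    """Return the base delay plus a random amount up to jitter_seconds."""
    return base_seconds + rng.uniform(0, jitter_seconds)


def pick_user_agent(rng: random.Random) -> str:
    """Choose one user agent at random from the example list."""
    return rng.choice(USER_AGENTS)


rng = random.Random()  # pass a seeded Random for reproducible tests
delay = jittered_delay(2.0, 1.0, rng)
agent = pick_user_agent(rng)
```

Irregular pauses and varied headers make traffic look less mechanical, though they reduce rather than eliminate the chance of being blocked.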

Q: Is there an API to trigger scrapes programmatically? A: Yes, all workflows can be triggered and results retrieved via the REST API.
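
The sketch below shows the general shape of such a trigger call by building, but not sending, an HTTP request. The host, port, route, and header name are all hypothetical placeholders, since this article does not list Maxun's actual endpoints; check your instance's API documentation for the real values.

```python
import urllib.request


def build_trigger_request(
    base_url: str, workflow_id: str, api_key: str
) -> urllib.request.Request:
    """Build a POST request that would start a workflow run.

    The '/api/workflows/<id>/runs' route and the 'x-api-key' header are
    hypothetical placeholders, not confirmed Maxun API details.
    """
    url = f"{base_url.rstrip('/')}/api/workflows/{workflow_id}/runs"
    return urllib.request.Request(
        url, method="POST", headers={"x-api-key": api_key}
    )


# Placeholder values; nothing is sent over the network here.
request = build_trigger_request(
    "http://localhost:8080/", "example-id", "secret"
)
```

Passing the request to urllib.request.urlopen would send it. Keeping construction separate from sending makes the URL and headers easy to inspect before pointing the code at a live instance.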
