Maxun — Self-Hosted No-Code Web Scraping Platform

Introduction

Maxun is an open-source no-code web scraping platform that lets users visually extract data from websites without writing code. It uses Playwright for browser automation and provides a point-and-click interface to define extraction rules, making web scraping accessible to non-developers while remaining self-hostable for full data control.

What Maxun Does

Enables visual point-and-click data extraction from any website without coding
Automates pagination, scrolling, and multi-page crawling with built-in logic
Exports scraped data as JSON, CSV, or directly into databases via API
Schedules recurring scraping jobs with cron-based automation
Provides anti-detection features including proxy rotation and browser fingerprint management
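As a rough illustration of the JSON and CSV export formats listed above, the sketch below converts a JSON array of extracted records into CSV text. The record fields (title, price, url) are hypothetical and do not reflect Maxun's actual output schema; the point is only to show how records with uneven fields can be flattened into one table.

```python
import csv
import io
import json

# Hypothetical extraction result: a JSON array of flat records.
# The field names are illustrative, not Maxun's actual output schema.
SAMPLE_JSON = """
[
  {"title": "Widget A", "price": "19.99", "url": "https://example.com/a"},
  {"title": "Widget B", "price": "24.50"}
]
"""


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, tolerating missing keys."""
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in an empty cell when a record lacks a column.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(json.loads(SAMPLE_JSON))
print(csv_text)
```

Records that lack a field get an empty cell rather than raising an error, which is usually what you want when some pages omit optional data.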

Architecture Overview

Maxun is built on a Node.js backend with a React frontend. It uses Playwright as the browser automation engine to render pages and execute extraction workflows. A PostgreSQL database stores workflow definitions and scraped results. The platform runs headless Chromium instances in Docker containers, with a WebSocket-based real-time preview that shows the browser as users define their extraction rules.

Self-Hosting & Configuration

Deploy with Docker Compose using the provided configuration with Postgres and Redis services
Set environment variables in .env for database credentials, proxy settings, and API keys
Configure proxy rotation by adding proxy URLs to the designated environment variable
Adjust concurrency settings to control how many parallel scraping sessions run
Expose the web UI on your preferred port and secure with a reverse proxy for production use
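The configuration steps above can be sanity-checked before starting the stack. The following is a minimal sketch that parses .env-style text and validates a few settings. The variable names (DB_USER, DB_PASSWORD, PROXY_URLS, MAX_CONCURRENT_SESSIONS) are placeholders invented for this example; consult the project's own .env template for the real names.

```python
# Hypothetical .env content. The variable names are placeholders chosen for
# this example, not the names Maxun actually reads.
SAMPLE_ENV = """
# database
DB_USER=maxun
DB_PASSWORD="change-me"
PROXY_URLS=http://proxy1.example:8080, http://proxy2.example:8080
MAX_CONCURRENT_SESSIONS=4
"""


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw_line!r}")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(text, required=("DB_USER", "DB_PASSWORD")):
    """Validate required keys and normalize proxy and concurrency settings."""
    env = parse_env(text)
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    proxies = [p.strip() for p in env.get("PROXY_URLS", "").split(",") if p.strip()]
    concurrency = int(env.get("MAX_CONCURRENT_SESSIONS", "1"))
    if concurrency < 1:
        raise ValueError("MAX_CONCURRENT_SESSIONS must be at least 1")
    return {"db_user": env["DB_USER"], "proxies": proxies, "concurrency": concurrency}


settings = load_settings(SAMPLE_ENV)
print(settings)
```

Failing fast on a missing credential or a malformed concurrency value is cheaper than discovering the problem after the containers are already running.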

Key Features

Visual no-code workflow builder with live browser preview
Built-in pagination and infinite scroll handling
Scheduled and recurring scraping with cron expressions
Proxy support with rotation for anti-blocking
REST API for triggering workflows and retrieving results programmatically
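To make the cron-based scheduling mentioned above concrete, here is a small sketch of how a five-field cron expression can be checked against a timestamp. It is not Maxun's scheduler. It supports only `*`, lists, ranges, and steps, expects Sunday as 0, and simplifies standard cron by requiring all five fields to match.

```python
from datetime import datetime


def _field_matches(field, value, minimum, maximum):
    """Return True if value satisfies one cron field (supports * , - /)."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = minimum, maximum
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = int(base)
            # "5/10" means "from 5 up to the maximum"; plain "5" means just 5.
            end = maximum if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression, moment):
    """Check a 'minute hour day month weekday' expression against a datetime."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        _field_matches(minute, moment.minute, 0, 59)
        and _field_matches(hour, moment.hour, 0, 23)
        and _field_matches(day, moment.day, 1, 31)
        and _field_matches(month, moment.month, 1, 12)
        and _field_matches(weekday, cron_weekday, 0, 6)
    )


# "Every Monday at 06:00" checked against Monday, 1 January 2024, 06:00.
print(cron_matches("0 6 * * 1", datetime(2024, 1, 1, 6, 0)))
```

A scheduler built on this idea would evaluate the expression once per minute and start a run whenever it matches.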

Comparison with Similar Tools

Scrapy — Python framework requiring code; Maxun offers a visual no-code interface
Crawlee — Developer-focused Node.js library vs Maxun's point-and-click approach
Apify — Cloud SaaS platform; Maxun is fully self-hosted with no per-page costs
Browse AI — Closed-source cloud tool; Maxun gives you full control of your data
Firecrawl — API-first crawling for LLMs; Maxun focuses on structured data extraction with visual workflows

FAQ

Q: Does Maxun handle JavaScript-rendered pages? A: Yes. Maxun uses Playwright with full Chromium rendering, so it handles SPAs and dynamic content.

Q: Can I run Maxun on low-resource servers? A: Only with limited concurrency. Each scraping session runs its own headless browser instance, and at least 2 GB of RAM per concurrent session is recommended for production.
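The 2 GB per session figure translates into a quick capacity estimate. The sketch below is back-of-the-envelope arithmetic only; the 1 GB reserved for the operating system and supporting services is an assumption made for this example, not a number from the project.

```python
GB_PER_SESSION = 2  # per-session RAM recommendation quoted in the answer above


def max_concurrent_sessions(total_ram_gb, reserved_gb=1.0):
    """Estimate how many sessions fit after reserving RAM for everything else.

    reserved_gb is an assumed allowance for the OS, database, and web server.
    """
    usable = total_ram_gb - reserved_gb
    if usable < GB_PER_SESSION:
        return 0
    return int(usable // GB_PER_SESSION)


for ram in (2, 4, 8, 16):
    print(f"{ram} GB host: about {max_concurrent_sessions(ram)} concurrent session(s)")
```

Under these assumptions a 4 GB host fits one session and an 8 GB host fits three, so small servers are workable as long as jobs run one at a time.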

Q: How do I avoid getting blocked? A: Maxun supports proxy rotation, request delays, and user-agent randomization to reduce detection risk.
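The three techniques named in this answer can be sketched in a few lines. The example below only plans requests and sends nothing, and it is not Maxun's implementation. The proxy URLs and user-agent strings are placeholders.

```python
import itertools
import random

# Placeholder values for illustration only.
PROXIES = ["http://proxy1.example:8080", "http://proxy2.example:8080"]
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/1.0",
]


def plan_requests(urls, proxies, user_agents, min_delay=1.0, max_delay=3.0, seed=None):
    """Assign each URL a proxy (round-robin), a random user agent, and a delay."""
    rng = random.Random(seed)  # a seed makes the plan reproducible in tests
    proxy_cycle = itertools.cycle(proxies)
    plan = []
    for url in urls:
        plan.append(
            {
                "url": url,
                "proxy": next(proxy_cycle),
                "user_agent": rng.choice(user_agents),
                "delay_seconds": round(rng.uniform(min_delay, max_delay), 2),
            }
        )
    return plan


pages = [f"https://example.com/page/{n}" for n in range(1, 5)]
for step in plan_requests(pages, PROXIES, USER_AGENTS, seed=42):
    print(step)
```

Round-robin rotation spreads requests evenly across proxies, while the randomized delay avoids the fixed request rhythm that rate limiters tend to flag.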

Q: Is there an API to trigger scrapes programmatically? A: Yes, all workflows can be triggered and results retrieved via the REST API.
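A programmatic trigger might look roughly like the sketch below, which builds the HTTP request without sending it. The base URL, route, header name, and robot ID are all assumptions made for illustration; check the Maxun API documentation for the actual endpoints and authentication scheme.

```python
import json
import urllib.request

# All of these values are placeholders, not Maxun's documented API.
BASE_URL = "http://localhost:8080"
RUN_PATH_TEMPLATE = "/api/robots/{robot_id}/runs"  # hypothetical route
API_KEY_HEADER = "x-api-key"  # hypothetical header name


def build_trigger_request(robot_id, api_key, payload=None):
    """Build (but do not send) a POST request that would start a workflow run."""
    url = BASE_URL + RUN_PATH_TEMPLATE.format(robot_id=robot_id)
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
    )


request = build_trigger_request("robot-123", "secret-key")
print(request.get_method(), request.full_url)
# To send it against a running instance:
#     with urllib.request.urlopen(request, timeout=30) as response:
#         result = json.load(response)
```

Separating request construction from sending keeps the example testable without a live server and makes it easy to swap in the real route once it is confirmed.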
