# Snowplow: Open-Source Behavioral Data Platform

> Event-level data collection platform that captures rich behavioral data from web, mobile, and server-side sources into your data warehouse.

## Quick Use

```bash
# Quick start with Docker (Snowplow Micro, a single-container pipeline for development)
docker run -p 9090:9090 snowplow/snowplow-micro:latest

# Send a test page-view event (quote the URL so the shell does not treat "&" as a control operator)
curl "http://localhost:9090/i?e=pv&url=https://example.com&p=web"

# Check what Micro has collected (counts of total, good, and bad events)
curl http://localhost:9090/micro/all
```

## Introduction

Snowplow is an open-source behavioral data platform that collects granular, event-level data from websites, mobile apps, and server-side systems. Unlike tag-based analytics tools, Snowplow gives you full ownership of your raw data, delivering it directly into your data warehouse for analysis with your existing BI and data science stack.

## What Snowplow Does

- Collects event-level behavioral data from web, mobile, and server-side trackers
- Validates events against schemas to ensure data quality at collection time
- Enriches events with geolocation, referrer parsing, campaign attribution, and custom logic
- Loads validated data into warehouses like Snowflake, BigQuery, Redshift, or Databricks
- Supports custom event schemas for domain-specific tracking beyond pageviews and clicks

## Architecture Overview

Snowplow uses a pipeline architecture with three stages:

- **Collect:** trackers send events to a collector endpoint, which writes raw events to a stream (Kinesis, PubSub, or Kafka).
- **Enrich:** an enrichment process validates events against Iglu schema registries, applies configurable enrichments, and outputs structured data.
- **Load:** a loader writes the enriched events into the target data warehouse in a well-defined table schema.

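
The collect, validate, enrich, and load flow described above can be sketched in a few lines of plain Python. This is a toy illustration, not Snowplow's implementation: the function names, the `PAGE_VIEW_SCHEMA` dictionary (a stand-in for an Iglu schema), and the in-memory lists standing in for the stream and the warehouse are all assumptions made for the example. The `mkt_source` and `mkt_medium` field names are modeled on the campaign attribution fields in Snowplow's event table.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical stand-in for an Iglu schema: required fields and their types.
PAGE_VIEW_SCHEMA = {"e": str, "url": str, "p": str}


def collect(raw_events):
    """Collector stage: accept tracker payloads and append them to a raw stream."""
    return list(raw_events)  # a list stands in for Kinesis, PubSub, or Kafka


def validate(event, schema=PAGE_VIEW_SCHEMA):
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    for field, expected_type in schema.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors


def enrich(event):
    """Enrichment stage: add campaign attribution parsed from the page URL."""
    query = parse_qs(urlparse(event["url"]).query)
    enriched = dict(event)
    enriched["mkt_source"] = query.get("utm_source", [None])[0]
    enriched["mkt_medium"] = query.get("utm_medium", [None])[0]
    return enriched


def run_pipeline(raw_events):
    """Route each raw event to the warehouse (good) or to a failed-events list (bad)."""
    warehouse, failed = [], []
    for event in collect(raw_events):
        errors = validate(event)
        if errors:
            failed.append({"event": event, "errors": errors})
        else:
            warehouse.append(enrich(event))
    return warehouse, failed


raw = [
    {"e": "pv", "url": "https://example.com/?utm_source=newsletter&utm_medium=email", "p": "web"},
    {"e": "pv", "p": "web"},  # no url, so validation rejects it
]
warehouse, failed = run_pipeline(raw)
print(len(warehouse), "loaded;", len(failed), "failed")
```

Keeping rejected events in a separate list rather than dropping them mirrors the good/bad split that Snowplow Micro reports, so invalid tracking can be inspected and fixed instead of silently lost.
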
## Self-Hosting & Configuration

- Deploy the collector, enrichment, and loader components via Docker or cloud-native services
- Use Snowplow Micro (single Docker container) for local development and testing
- Define custom event schemas in an Iglu schema registry for type-safe data collection
- Configure enrichments (IP lookup, UA parsing, campaign attribution) via JSON files
- Supported warehouse targets include Snowflake, BigQuery, Redshift, Databricks, and PostgreSQL

## Key Features

- Schema-driven data collection validates every event before it enters the pipeline
- First-party data collection keeps all behavioral data in your own infrastructure
- 20+ configurable enrichments add context without additional tracking code
- Trackers available for JavaScript, iOS, Android, Python, Go, Java, and more
- Real-time and batch loading modes for different latency requirements

## Comparison with Similar Tools

- **Google Analytics:** Aggregated metrics in a SaaS dashboard; Snowplow delivers raw event data to your warehouse
- **Segment:** SaaS data router; Snowplow is self-hosted with schema validation and enrichment
- **RudderStack:** Open-source CDP; Snowplow focuses on behavioral data with richer schema validation
- **Matomo:** Self-hosted web analytics; Snowplow provides a data pipeline, not a pre-built dashboard
- **PostHog:** Product analytics with built-in UI; Snowplow is a data infrastructure layer for warehouse-first teams

## FAQ

**Q: Where does Snowplow store collected data?**
A: Snowplow loads data into your data warehouse (Snowflake, BigQuery, Redshift, Databricks, or PostgreSQL). You own and control all data.

**Q: Can I define custom events beyond pageviews?**
A: Yes. Snowplow uses JSON schemas in an Iglu registry to define custom event types and entities with full validation.

**Q: Is Snowplow suitable for high-traffic sites?**
A: Yes. Snowplow pipelines built on Kinesis, PubSub, or Kafka handle billions of events per day.

**Q: How does Snowplow compare to a CDP?**
A: Snowplow focuses on behavioral data collection and delivery to your warehouse. CDPs typically add audience building and activation on top.

## Sources

- https://github.com/snowplow/snowplow
- https://docs.snowplow.io/

---

Source: https://tokrepo.com/en/workflows/asset-2c4ae706

Author: Script Depot