CloudQuery — Sync Cloud Infrastructure to SQL for Security and Compliance

Introduction

CloudQuery is an open-source ELT (Extract, Load, Transform) framework purpose-built for infrastructure and security data. It connects to cloud providers (AWS, GCP, Azure), SaaS tools (GitHub, Okta, Cloudflare), and other APIs through a plugin system, extracts their configuration and resource data, and loads it into SQL databases or data lakes. Security and platform teams use CloudQuery to build asset inventories, run compliance queries, detect drift, and feed SIEM pipelines.

What CloudQuery Does

Syncs cloud resource configurations from 100+ sources into destinations such as PostgreSQL, BigQuery, S3, and Snowflake
Provides pre-built source plugins for AWS, GCP, Azure, Kubernetes, GitHub, Okta, and many more
Enables SQL-based security queries like finding public S3 buckets or unencrypted volumes
Supports incremental syncs so only changed resources are updated on subsequent runs
Ships with policy packs for CIS benchmarks and compliance frameworks
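
To make the SQL-based checks above concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the destination database. The table names (storage_buckets, disk_volumes) and their columns are hypothetical and far simpler than the tables a real sync produces; consult the relevant source plugin's table documentation for the actual names.

```python
import sqlite3

# Stand-in for a destination database. Table and column names are
# hypothetical, not the schema a real CloudQuery sync creates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE storage_buckets (name TEXT, account_id TEXT, is_public INTEGER);
    CREATE TABLE disk_volumes (volume_id TEXT, account_id TEXT, encrypted INTEGER);
    INSERT INTO storage_buckets VALUES
        ('logs', '111', 0), ('website-assets', '111', 1), ('backups', '222', 0);
    INSERT INTO disk_volumes VALUES
        ('vol-a', '111', 1), ('vol-b', '222', 0), ('vol-c', '222', 0);
""")

# Check 1: buckets flagged as publicly accessible.
public_buckets = [row[0] for row in conn.execute(
    "SELECT name FROM storage_buckets WHERE is_public = 1 ORDER BY name")]

# Check 2: number of unencrypted volumes per account.
unencrypted_by_account = dict(conn.execute(
    "SELECT account_id, COUNT(*) FROM disk_volumes "
    "WHERE encrypted = 0 GROUP BY account_id"))

print(public_buckets)
print(unencrypted_by_account)
```

Against a real destination the same pattern applies: point the connection at the synced database and swap in the actual table and column names.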

Architecture Overview

CloudQuery uses a plugin-based architecture in which source plugins and destination plugins communicate over a gRPC streaming protocol. Each source plugin authenticates with a cloud API, fetches resources according to the configured table list, and streams rows to the destination plugin, which writes them to the target database. Plugins run as separate processes, so they can be developed in different languages. The CLI orchestrates plugin lifecycle, schema migrations, and sync scheduling.
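
The source-to-destination flow described above can be sketched in a single process. This is only an illustration of the streaming shape, not CloudQuery's implementation: there is no gRPC here, and source_plugin, DestinationPlugin, sync, and the fake API data are all hypothetical names invented for the example.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]


def source_plugin(tables: List[str]) -> Iterator[Row]:
    """Hypothetical source: yields one row at a time for each configured table."""
    fake_api = {
        "buckets": [{"name": "logs"}, {"name": "backups"}],
        "instances": [{"name": "web-1"}],
        "users": [{"name": "alice"}],
    }
    for table in tables:
        for resource in fake_api.get(table, []):
            yield {"table": table, **resource}


class DestinationPlugin:
    """Hypothetical destination: buffers incoming rows and writes them in batches."""

    def __init__(self, batch_size: int = 2) -> None:
        self.batch_size = batch_size
        self.written: List[List[Row]] = []  # each inner list is one batch write
        self._buffer: List[Row] = []

    def write(self, rows: Iterable[Row]) -> None:
        for row in rows:
            self._buffer.append(row)
            if len(self._buffer) >= self.batch_size:
                self._flush()
        self._flush()  # write whatever is left when the stream ends

    def _flush(self) -> None:
        if self._buffer:
            self.written.append(self._buffer)
            self._buffer = []


def sync(tables: List[str], destination: DestinationPlugin) -> int:
    """Stand-in for the CLI: connect the source stream to the destination."""
    destination.write(source_plugin(tables))
    return sum(len(batch) for batch in destination.written)


dest = DestinationPlugin(batch_size=2)
total = sync(["buckets", "instances"], dest)
print(total, [len(batch) for batch in dest.written])
```

Because the source is a generator, rows are handed to the destination as they are produced rather than collected first, which is the property the streaming protocol provides between real plugin processes. Only the tables named in the list are fetched; the "users" table is never touched.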

Self-Hosting & Configuration

Install the cloudquery CLI via Homebrew or another package manager, or download it from GitHub releases
Create a YAML config file defining source plugins (cloud providers) and destination plugins (databases)
Configure authentication using standard cloud credentials (for example, AWS profiles or GCP service accounts)
Select specific tables to sync for faster, targeted data extraction
Schedule syncs with cron or integrate into CI/CD pipelines for continuous asset inventory
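
For the scheduling step, a thin wrapper around the CLI is often enough. The sketch below assumes the cloudquery binary is on the PATH and that the config file is named config.yml (a hypothetical name); build_sync_command and run_sync are illustrative helpers, not part of CloudQuery. The contents of the config file itself are not shown here.

```python
import subprocess
from typing import List


def build_sync_command(config_path: str, binary: str = "cloudquery") -> List[str]:
    """Return the argv for a single sync run against one config file."""
    return [binary, "sync", config_path]


def run_sync(config_path: str) -> int:
    """Run one sync and return the CLI's exit code (non-zero means failure)."""
    result = subprocess.run(build_sync_command(config_path), check=False)
    return result.returncode


# Show the command that a scheduled job would execute.
print(build_sync_command("config.yml"))
```

A cron entry or CI job can call run_sync("config.yml") and fail the job when the returned exit code is non-zero, so that a broken sync is noticed rather than silently leaving stale inventory data.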

Key Features

100+ source plugins covering major cloud providers, SaaS platforms, and infrastructure tools
High-performance Go-based sync engine that handles millions of resources efficiently
SQL-native approach lets you query infrastructure data with standard SQL joins and aggregations
Pre-built compliance and security policy packs for CIS AWS, GCP, and Azure benchmarks
Extensible plugin SDK for building custom source or destination plugins in Go or Python

Comparison with Similar Tools

Steampipe: SQL-based cloud querying that makes live API calls for each query, whereas CloudQuery syncs data to a persistent database, which makes repeated queries faster
Prowler: focused on security checks and compliance scoring rather than general-purpose asset inventory
AWS Config: native AWS service for resource tracking, but limited to AWS and expensive at scale
Cartography (Lyft): Neo4j-based infrastructure graph, but with narrower source coverage and less active development
Resoto (Some Engineering): cloud asset inventory with a graph model, but with a smaller community and plugin ecosystem

FAQ

Q: How often should I run CloudQuery syncs?
A: Most teams run syncs every 1-4 hours for security use cases. For compliance snapshots, daily or weekly syncs are common. Incremental mode reduces sync time significantly.
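
Why incremental mode saves time can be seen in a generic cursor-based sketch. This is a common technique shown under simplifying assumptions, not CloudQuery's actual implementation; API_RECORDS, incremental_sync, and the integer timestamps are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical upstream records: (resource_id, last_modified as an integer tick).
API_RECORDS: List[Tuple[str, int]] = [("vm-1", 10), ("vm-2", 20), ("vm-3", 30)]


def incremental_sync(records: List[Tuple[str, int]],
                     state: Dict[str, int],
                     store: Dict[str, int]) -> int:
    """Copy only records newer than the saved cursor; return how many were copied."""
    cursor = state.get("cursor", 0)
    changed = [(rid, ts) for rid, ts in records if ts > cursor]
    for rid, ts in changed:
        store[rid] = ts
    if changed:
        # Advance the cursor so the next run skips everything seen so far.
        state["cursor"] = max(ts for _, ts in changed)
    return len(changed)


state: Dict[str, int] = {}
store: Dict[str, int] = {}
first = incremental_sync(API_RECORDS, state, store)   # no cursor yet: full pass
API_RECORDS.append(("vm-4", 40))                       # one new resource appears
second = incremental_sync(API_RECORDS, state, store)  # only the new record is copied
print(first, second, state["cursor"])
```

The second run touches one record instead of four, which is the effect that shortens frequent syncs.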

Q: Does CloudQuery support multi-cloud environments?
A: Yes. Configure multiple source plugins in the same config file. AWS, GCP, and Azure resources land in the same database, enabling cross-cloud SQL queries.
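
As a rough illustration of a cross-cloud query, the sketch below again uses an in-memory SQLite database in place of the real destination. The three *_demo tables and their columns are hypothetical placeholders, not the tables the AWS, GCP, or Azure plugins actually create.

```python
import sqlite3

# Stand-in destination holding one hypothetical table per cloud provider.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aws_instances_demo (instance_id TEXT, region TEXT);
    CREATE TABLE gcp_instances_demo (instance_id TEXT, zone TEXT);
    CREATE TABLE azure_vms_demo (vm_id TEXT, location TEXT);
    INSERT INTO aws_instances_demo VALUES ('i-1', 'us-east-1'), ('i-2', 'eu-west-1');
    INSERT INTO gcp_instances_demo VALUES ('gce-1', 'us-central1-a');
    INSERT INTO azure_vms_demo VALUES
        ('vm-1', 'westeurope'), ('vm-2', 'eastus'), ('vm-3', 'eastus');
""")

# Normalize the per-provider tables into one shape, then aggregate.
query = """
    SELECT provider, COUNT(*) AS instances FROM (
        SELECT 'aws' AS provider, instance_id AS id FROM aws_instances_demo
        UNION ALL
        SELECT 'gcp', instance_id FROM gcp_instances_demo
        UNION ALL
        SELECT 'azure', vm_id FROM azure_vms_demo
    ) AS all_instances
    GROUP BY provider
    ORDER BY provider
"""
counts = conn.execute(query).fetchall()
print(counts)
```

The UNION ALL step is where differing column names per provider get mapped onto a common shape; with real synced tables that mapping is the main work of a cross-cloud query.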

Q: Can I write custom source plugins?
A: Yes. CloudQuery provides an SDK for building source plugins in Go or Python. The plugin communicates with the CLI via gRPC and follows a standard table/column schema.

Q: What databases does CloudQuery support as destinations?
A: PostgreSQL, BigQuery, Snowflake, S3 (Parquet/CSV), ClickHouse, Elasticsearch, Apache Kafka, and more via destination plugins.
