Scripts · April 16, 2026 · 1 min read

CloudQuery — Sync Cloud Infrastructure to SQL for Security and Compliance

CloudQuery is an open-source ELT framework that extracts configuration data from cloud APIs, SaaS platforms, and databases into PostgreSQL or data lakes for security, compliance, and asset visibility.

Introduction

CloudQuery is an open-source ELT (Extract, Load, Transform) framework purpose-built for infrastructure and security data. It connects to cloud providers (AWS, GCP, Azure), SaaS tools (GitHub, Okta, Cloudflare), and other APIs through a plugin system, extracts their configuration and resource data, and loads it into SQL databases or data lakes. Security and platform teams use CloudQuery to build asset inventories, run compliance queries, detect drift, and feed SIEM pipelines.

What CloudQuery Does

  • Syncs cloud resource configurations from 100+ sources into PostgreSQL, BigQuery, S3, or Snowflake
  • Provides pre-built source plugins for AWS, GCP, Azure, Kubernetes, GitHub, Okta, and many more
  • Enables SQL-based security queries like finding public S3 buckets or unencrypted volumes
  • Supports incremental syncs so only changed resources are updated on subsequent runs
  • Ships with policy packs for CIS benchmarks and compliance frameworks
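
As a rough illustration of the SQL-based security queries mentioned above, the sketch below loads a few made-up rows into an in-memory SQLite database and looks for public buckets. The table name `s3_buckets` and its columns (including `is_public`) are a simplified, hypothetical schema chosen for this example; the tables a real sync produces have different and far more detailed columns, and the target would normally be PostgreSQL rather than SQLite.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only. Tables produced by a
# real sync have different names and many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE s3_buckets ("
    "account_id TEXT, name TEXT, region TEXT, is_public INTEGER)"
)
conn.executemany(
    "INSERT INTO s3_buckets VALUES (?, ?, ?, ?)",
    [
        ("111111111111", "app-logs", "us-east-1", 0),
        ("111111111111", "public-assets", "us-east-1", 1),
        ("222222222222", "backups", "eu-west-1", 0),
    ],
)


def find_public_buckets(connection):
    """Return (account_id, name) for every bucket flagged as public."""
    return connection.execute(
        "SELECT account_id, name FROM s3_buckets "
        "WHERE is_public = 1 ORDER BY name"
    ).fetchall()


public_buckets = find_public_buckets(conn)
print(public_buckets)
```

The same pattern applies to other checks, such as unencrypted volumes: once the configuration data sits in a SQL table, a check is a `WHERE` clause.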

Architecture Overview

CloudQuery uses a plugin-based architecture in which source plugins and destination plugins are connected by a high-performance gRPC streaming protocol. Each source plugin authenticates with a cloud API, fetches resources according to the configured table list, and streams rows to the destination plugin, which writes them to the target database. Plugins run as separate processes, so they can be developed in different languages. The CLI orchestrates the plugin lifecycle, schema migrations, and sync runs; scheduling is left to external tools such as cron or a CI/CD pipeline.
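
The data flow described above can be imitated with a small in-process toy. This is not CloudQuery's code and involves no gRPC: a Python generator stands in for the row stream, and every name here (`source_plugin`, `DestinationPlugin`, `sync`, the fake API data) is hypothetical. It only shows the shape of the pipeline: the source yields rows for the configured tables, and the destination consumes them as they arrive.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]

# Stand-in for a cloud API; real plugins call the provider's API instead.
FAKE_API: Dict[str, List[Row]] = {
    "instances": [{"id": "i-1"}, {"id": "i-2"}],
    "buckets": [{"id": "b-1"}],
    "users": [{"id": "u-1"}],
}


def source_plugin(tables: List[str]) -> Iterator[Row]:
    """Yield one row per resource, restricted to the configured tables."""
    for table in tables:
        for resource in FAKE_API.get(table, []):
            yield {"table": table, **resource}


class DestinationPlugin:
    """Collects streamed rows per table, standing in for a database writer."""

    def __init__(self) -> None:
        self.storage: Dict[str, List[Row]] = {}

    def write(self, rows: Iterable[Row]) -> int:
        count = 0
        for row in rows:  # rows are consumed one at a time as they are produced
            self.storage.setdefault(str(row["table"]), []).append(row)
            count += 1
        return count


def sync(tables: List[str]) -> Tuple[DestinationPlugin, int]:
    """Play the orchestrator role: wire the source stream into the destination."""
    destination = DestinationPlugin()
    written = destination.write(source_plugin(tables))
    return destination, written


destination, written = sync(["instances", "buckets"])
print(written, sorted(destination.storage))
```

Because the source is a generator, the destination never needs the full result set in memory, which is the property a streaming protocol between separate processes is meant to provide.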

Self-Hosting & Configuration

  • Install the cloudquery CLI via Homebrew, another package manager, or a download from GitHub releases
  • Create a YAML config file defining source plugins (cloud providers) and destination plugins (databases)
  • Configure authentication using standard cloud credentials (AWS profiles, GCP service accounts)
  • Select specific tables to sync for faster, targeted data extraction
  • Schedule syncs with cron or integrate into CI/CD pipelines for continuous asset inventory
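
To make the configuration step more concrete, here is a small sketch that renders a two-document YAML file with one source and one destination. The field layout (`kind`, `spec`, `path`, `tables`, `destinations`, `connection_string`) and the `${PG_CONNECTION_STRING}` environment-variable reference reflect my understanding of the config format and should be treated as assumptions to verify against the current CloudQuery documentation. The `vX.Y.Z` versions are placeholders, and `render_config` is a hypothetical helper, not part of any CloudQuery tooling.

```python
from typing import List


def render_config(
    tables: List[str],
    source_version: str = "vX.Y.Z",  # placeholder, pin a real plugin version
    dest_version: str = "vX.Y.Z",    # placeholder, pin a real plugin version
) -> str:
    """Render a source + destination config as YAML text (assumed layout)."""
    table_list = ", ".join(f'"{name}"' for name in tables)
    return f"""kind: source
spec:
  name: aws
  path: cloudquery/aws
  version: "{source_version}"
  tables: [{table_list}]
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "{dest_version}"
  spec:
    connection_string: "${{PG_CONNECTION_STRING}}"
"""


config = render_config(["aws_s3_buckets", "aws_ec2_instances"])
print(config)
```

Listing only the tables you need, as in the `tables` entry here, is what keeps a sync targeted. The rendered text would be saved to a file and handed to the CLI's sync command, which in turn can be invoked from cron or a CI/CD job.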

Key Features

  • 100+ source plugins covering major cloud providers, SaaS platforms, and infrastructure tools
  • High-performance Go-based sync engine that handles millions of resources efficiently
  • SQL-native approach lets you query infrastructure data with standard SQL joins and aggregations
  • Pre-built compliance and security policy packs for CIS AWS, GCP, and Azure benchmarks
  • Extensible plugin SDK for building custom source or destination plugins in Go or Python

Comparison with Similar Tools

  • Steampipe — SQL-based cloud querying with live API calls per query, while CloudQuery syncs data to a persistent database for faster repeated queries
  • Prowler — focused on security checks and compliance scoring, not general-purpose asset inventory
  • AWS Config — native AWS service for resource tracking but limited to AWS and expensive at scale
  • Cartography (Lyft) — Neo4j-based infrastructure graph but narrower source coverage and less active
  • Resoto (Some Engineering) — cloud asset inventory with graph model but smaller community and plugin ecosystem

FAQ

Q: How often should I run CloudQuery syncs? A: Most teams run syncs every 1-4 hours for security use cases. For compliance snapshots, daily or weekly syncs are common. Incremental mode shortens subsequent runs because only changed resources are updated.
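
The idea behind incremental syncing can be sketched with a simple cursor: remember the newest change timestamp seen so far and, on the next run, process only resources modified after it. This is a generic illustration under assumed names (`incremental_sync`, `updated_at`, the `state` and `store` dictionaries), not a description of how CloudQuery tracks state internally.

```python
from typing import Any, Dict, List

Resource = Dict[str, Any]


def incremental_sync(
    resources: List[Resource], state: Dict[str, int], store: Dict[str, Resource]
) -> int:
    """Upsert only resources changed since the saved cursor; return how many."""
    cursor = state.get("cursor", 0)
    changed = [r for r in resources if r["updated_at"] > cursor]
    for resource in changed:
        store[resource["id"]] = dict(resource)  # upsert keyed by resource id
    if changed:
        state["cursor"] = max(r["updated_at"] for r in changed)
    return len(changed)


state: Dict[str, int] = {}
store: Dict[str, Resource] = {}
resources = [
    {"id": "vol-1", "updated_at": 100, "encrypted": False},
    {"id": "vol-2", "updated_at": 105, "encrypted": True},
]

first = incremental_sync(resources, state, store)   # initial run sees everything
second = incremental_sync(resources, state, store)  # nothing changed since then
resources[0] = {"id": "vol-1", "updated_at": 110, "encrypted": True}
third = incremental_sync(resources, state, store)   # only vol-1 is rewritten
print(first, second, third)
```

With frequent schedules such as hourly runs, most resources are unchanged between runs, which is where this kind of filtering saves time.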

Q: Does CloudQuery support multi-cloud environments? A: Yes. Configure multiple source plugins in the same config file. AWS, GCP, and Azure resources land in the same database, enabling cross-cloud SQL queries.
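
A cross-cloud query of the kind described here might look like the following sketch, which again uses in-memory SQLite and invented data. The table names `aws_instances` and `gcp_instances` and their columns are hypothetical simplifications for this example, not the schemas the real plugins produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables standing in for data synced from two providers.
conn.executescript(
    """
    CREATE TABLE aws_instances (id TEXT, region TEXT);
    CREATE TABLE gcp_instances (id TEXT, zone TEXT);
    INSERT INTO aws_instances VALUES ('i-1', 'us-east-1'), ('i-2', 'eu-west-1');
    INSERT INTO gcp_instances VALUES ('vm-1', 'us-central1-a');
    """
)

# Combine both providers into one result set, then count instances per cloud.
instance_counts = conn.execute(
    """
    SELECT cloud, COUNT(*) AS instances
    FROM (
        SELECT 'aws' AS cloud, id FROM aws_instances
        UNION ALL
        SELECT 'gcp' AS cloud, id FROM gcp_instances
    ) AS all_instances
    GROUP BY cloud
    ORDER BY cloud
    """
).fetchall()
print(instance_counts)
```

Since every provider's data lands in the same database, a single `UNION ALL` or `JOIN` is enough to build an inventory view that spans clouds.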

Q: Can I write custom source plugins? A: Yes. CloudQuery provides an SDK for building source plugins in Go or Python. The plugin communicates with the CLI via gRPC and follows a standard table/column schema.

Q: What databases does CloudQuery support as destinations? A: PostgreSQL, BigQuery, Snowflake, S3 (Parquet/CSV), ClickHouse, Elasticsearch, Apache Kafka, and more via destination plugins.
