Configs · April 18, 2026 · 1 min read

Feast — Open Source Feature Store for Machine Learning

Feast is an open-source feature store that manages and serves machine learning features for training and inference. It bridges the gap between data engineering and ML by providing a consistent feature retrieval layer backed by offline and online stores.

Introduction

Feast addresses feature management in ML systems with a registry of feature definitions and a serving layer that delivers consistent feature values to both training pipelines and online inference. Because both paths read features through the same registered definitions, it reduces training-serving skew.

What Feast Does

  • Defines features as code in Python using a declarative feature view API
  • Materializes features from offline stores (BigQuery, Snowflake, files) to online stores (Redis, DynamoDB, SQLite)
  • Serves features at low latency for real-time model inference
  • Generates point-in-time correct training datasets to avoid data leakage
  • Tracks feature lineage and metadata in a central registry

Architecture Overview

Feast uses a three-layer architecture: an offline store for historical feature retrieval, an online store for low-latency serving, and a registry that holds feature definitions and metadata. The feast apply command pushes feature definitions to the registry. Materialization jobs read from the offline store and write to the online store. A Python SDK or a Go-based feature server handles online serving requests.
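The flow between the three layers can be illustrated with a small in-memory model. This is a conceptual sketch only, not Feast's implementation or API: the class ToyFeatureStore, its methods, and the sample values are hypothetical names chosen for illustration, and it assumes a single feature view.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ToyFeatureStore:
    """Conceptual stand-in for registry, offline store, and online store."""

    registry: dict = field(default_factory=dict)  # view name -> feature names
    offline: list = field(default_factory=list)   # historical rows as dicts
    online: dict = field(default_factory=dict)    # (view, entity id) -> latest row

    def apply(self, view, features):
        """Register a feature view definition (loosely analogous to `feast apply`)."""
        self.registry[view] = list(features)

    def materialize(self, view, end):
        """Copy the newest offline row per entity, up to `end`, into the online store."""
        names = self.registry[view]
        for row in self.offline:
            ts = row["event_timestamp"]
            if ts > end:
                continue  # rows after the materialization window are skipped
            key = (view, row["entity_id"])
            current = self.online.get(key)
            if current is None or ts >= current["event_timestamp"]:
                self.online[key] = {"event_timestamp": ts, **{n: row[n] for n in names}}

    def get_online_features(self, view, entity_id):
        """Key-value lookup of the latest materialized values, or None."""
        return self.online.get((view, entity_id))


store = ToyFeatureStore()
store.apply("driver_stats", ["conv_rate"])
store.offline += [
    {"entity_id": 1001, "event_timestamp": datetime(2024, 1, 1), "conv_rate": 0.40},
    {"entity_id": 1001, "event_timestamp": datetime(2024, 1, 2), "conv_rate": 0.55},
    {"entity_id": 1001, "event_timestamp": datetime(2024, 1, 9), "conv_rate": 0.90},
]
store.materialize("driver_stats", end=datetime(2024, 1, 3))
features = store.get_online_features("driver_stats", 1001)
```

In this toy model the online store keeps only the latest value per entity, which is what makes the lookup a single key access, while the offline store retains the full history needed for training data.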

Self-Hosting & Configuration

  • Install via pip and initialize a project with feast init
  • Configure the project (registry, provider, and stores) in feature_store.yaml, and define data sources and feature views in Python files
  • Configure offline store: BigQuery, Snowflake, Redshift, Spark, or file-based
  • Configure online store: Redis, DynamoDB, PostgreSQL, SQLite, or Datastore
  • Run the feature server with feast serve for HTTP-based online retrieval
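As a rough illustration of HTTP-based retrieval, the sketch below builds a request for a locally running feature server using only the Python standard library. The port (6566), the /get-online-features path, and the payload shape follow the Feast Python feature server as commonly documented, but treat them as assumptions to check against your installed version. The feature reference driver_hourly_stats:conv_rate and the entity key driver_id are hypothetical placeholders.

```python
import json
from urllib import request


def build_online_request(base_url, features, entities):
    """Build a POST request asking the feature server for online features."""
    payload = {"features": features, "entities": entities}
    return request.Request(
        url=f"{base_url}/get-online-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_online_features(req, timeout=2.0):
    """Send the request; requires a feature server to be running."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Assumed local address and hypothetical feature/entity names.
req = build_online_request(
    "http://localhost:6566",
    features=["driver_hourly_stats:conv_rate"],
    entities={"driver_id": [1001]},
)
# response = fetch_online_features(req)  # uncomment once `feast serve` is running
```

Building the request separately from sending it makes the payload easy to inspect or log before a server is available.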

Key Features

  • Point-in-time joins prevent future data leaking into training sets
  • Supports both batch and streaming feature sources
  • Go-based feature server targets low-latency online serving
  • Feature registry provides discovery, documentation, and lineage
  • On-demand feature transformations compute features at request time
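The point-in-time join mentioned above can be sketched in plain Python. This is an illustration of the idea rather than Feast's implementation: the function name, the tuple layout, and the way ttl is applied are assumptions made for the example. For each labeled event, it picks the newest feature value recorded at or before the event timestamp, and returns None if that value is older than ttl.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def point_in_time_join(entity_rows, feature_rows, ttl):
    """Attach to each (entity_id, event_ts) the latest feature value at or before event_ts."""
    # Group feature history per entity, sorted by timestamp.
    history = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = history.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_ts in entity_rows:
        times, values = history.get(entity_id, ([], []))
        i = bisect_right(times, event_ts) - 1  # newest row not after event_ts
        if i >= 0 and event_ts - times[i] <= ttl:
            joined.append((entity_id, event_ts, values[i]))
        else:
            joined.append((entity_id, event_ts, None))  # nothing usable: no leak, no stale value
    return joined


feature_rows = [
    (1001, datetime(2024, 1, 1), 0.40),
    (1001, datetime(2024, 1, 5), 0.90),
]
entity_rows = [
    (1001, datetime(2024, 1, 3)),      # sees the Jan 1 value, not the future Jan 5 one
    (1001, datetime(2024, 1, 4, 12)),  # Jan 1 value is older than ttl
    (1001, datetime(2024, 1, 6)),      # sees the Jan 5 value
    (1002, datetime(2024, 1, 3)),      # no history for this entity
]
result = point_in_time_join(entity_rows, feature_rows, ttl=timedelta(days=3))
```

The key property is that a training row dated January 3 never sees the value written on January 5, which is the leakage a naive "latest value" join would introduce.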

Comparison with Similar Tools

  • Tecton — managed feature platform built by Feast creators; Feast is the open-source self-hosted option
  • Hopsworks — full ML platform with built-in feature store; Feast is more lightweight and modular
  • Databricks Feature Store — tightly integrated with Databricks; Feast is cloud-agnostic
  • SageMaker Feature Store — AWS-native; Feast works across any cloud or on-premises
  • Featureform — virtual feature store with provider abstraction; Feast materializes data directly

FAQ

Q: Does Feast transform raw data into features? A: Feast supports on-demand transformations for simple logic. For complex transformations, pre-compute features in your data pipeline and register the output with Feast.

Q: Can Feast handle real-time streaming features? A: Yes. Feast can ingest from streaming sources like Kafka and push features directly to the online store.

Q: What online stores does Feast support? A: Redis, DynamoDB, PostgreSQL, SQLite, Datastore, Bigtable, and more via community plugins.

Q: Is Feast production-ready? A: Yes. Feast is used in production at companies running low-latency inference serving millions of requests.
