Protocol Buffers — Language-Neutral Data Serialization by Google

Introduction

Protocol Buffers (protobuf) is a data serialization format created by Google for inter-service communication and data storage. Schemas are defined in .proto files, from which strongly typed code is generated for over a dozen languages. The resulting binary payloads are typically smaller and faster to parse than equivalent JSON or XML.

What Protocol Buffers Does

Defines structured data schemas in a language-agnostic .proto IDL
Generates serialization and deserialization code for C++, Java, Python, Go, Rust, and more
Produces compact binary encoding that reduces payload size compared to text formats
Supports schema evolution with backward and forward compatibility via field numbering
Serves as the default wire format for gRPC remote procedure calls

Architecture Overview

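A .proto file declares message types whose fields each have a name, a scalar or composite type, and a unique number. The protoc compiler reads these definitions and, through built-in generators and language-specific plugins, emits source code containing accessor, builder (in languages that use them), and encode/decode methods. At runtime the generated code serializes objects into a binary stream of tagged records (tag-length-value for strings, bytes, and nested messages; tag-value for numeric types) and parses them back. A parser that meets a field number it does not recognize skips over it, and most runtimes retain the bytes as an unknown field, which is what gives older code forward compatibility. Reflection and descriptor APIs allow dynamic inspection of schemas at runtime.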
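The encoding described above can be sketched in pure Python, without the protobuf runtime. This is a minimal illustration, not the generated code: the Person schema, the field names, and every function name are assumptions made for the example. The two rules it relies on come from the protobuf wire format: integers are written as base-128 varints, and each record starts with a tag equal to (field_number << 3) | wire_type.

```python
# Hypothetical schema this sketch mirrors (not taken from the text above):
#   message Person { int32 id = 1; string name = 2; }

WIRE_VARINT = 0  # wire type used by int32, int64, bool, enum
WIRE_LEN = 2     # wire type used by string, bytes, nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """The tag packs the field number and the wire type into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Tag followed directly by the value (no length)."""
    return encode_tag(field_number, WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Tag, then byte length, then the UTF-8 payload."""
    payload = text.encode("utf-8")
    return encode_tag(field_number, WIRE_LEN) + encode_varint(len(payload)) + payload


def encode_person(person_id: int, name: str) -> bytes:
    return encode_int_field(1, person_id) + encode_string_field(2, name)


if __name__ == "__main__":
    # prints: 08 96 01 12 07 74 65 73 74 69 6e 67
    print(encode_person(150, "testing").hex(" "))
```

Field names never appear in the output, only field numbers, which is why renaming a field is safe on the wire while renumbering it is not.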

Self-Hosting & Configuration

Install protoc from the official GitHub releases or via system package managers
Place .proto files in a shared repository or use Buf Schema Registry for team workflows
Use protoc's built-in generators (such as --python_out, --java_out, or --cpp_out) or separately installed plugins (such as protoc-gen-go) for target language output
Integrate protoc into build pipelines via Bazel rules, Gradle plugins, or Makefiles
Adopt Buf CLI for linting, breaking-change detection, and code generation management
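As a rough sketch of the build-pipeline step, the helper below assembles a protoc invocation that generates Python code and runs it only if protoc is installed. The function names and the proto/ and gen/ directory layout are hypothetical; --proto_path and --python_out are standard protoc flags.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_protoc_command(proto_dir: str, out_dir: str,
                         proto_files: Sequence[str]) -> List[str]:
    """Build (but do not run) a protoc invocation that emits Python code."""
    # Input files are passed as paths under proto_dir so protoc can map them
    # back to the import root given by --proto_path.
    sources = [(Path(proto_dir) / name).as_posix() for name in proto_files]
    return ["protoc", f"--proto_path={proto_dir}", f"--python_out={out_dir}", *sources]


def generate(proto_dir: str, out_dir: str, proto_files: Sequence[str]) -> bool:
    """Run protoc if it is on PATH; return False when it is not installed."""
    if shutil.which("protoc") is None:
        return False
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_protoc_command(proto_dir, out_dir, proto_files), check=True)
    return True


if __name__ == "__main__":
    # prints: protoc --proto_path=proto --python_out=gen proto/person.proto
    print(" ".join(build_protoc_command("proto", "gen", ["person.proto"])))
```

Keeping command construction separate from execution makes the step easy to call from a Makefile target or a CI script, and easy to check without protoc present.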

Key Features

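Compact binary encoding: Google's early documentation cited messages 3 to 10 times smaller and 20 to 100 times faster to parse than XML, though actual gains depend on the data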
Strongly typed code generation catches schema mismatches at compile time
Field numbering enables adding or removing fields without breaking existing clients, provided the numbers of removed fields are never reused
First-class support in gRPC for building high-performance RPC services
Mature ecosystem with editions, proto2, and proto3 syntax variants
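To make the size claim concrete, the sketch below encodes one small record three ways and compares byte counts. It is an illustration on a single hand-picked record, not a benchmark. The Reading schema and its field names are assumptions, and the protobuf bytes are produced by a hand-written encoder rather than generated code.

```python
import json
import xml.etree.ElementTree as ET


def varint(n: int) -> bytes:
    """Base-128 varint for a non-negative integer."""
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_reading(sensor_id: int, label: str) -> bytes:
    # Hypothetical schema: message Reading { int32 sensor_id = 1; string label = 2; }
    raw = label.encode("utf-8")
    return (varint((1 << 3) | 0) + varint(sensor_id)          # field 1, varint
            + varint((2 << 3) | 2) + varint(len(raw)) + raw)  # field 2, length-delimited


record = {"sensor_id": 150, "label": "testing"}

as_proto = encode_reading(record["sensor_id"], record["label"])
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

root = ET.Element("reading")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root)

if __name__ == "__main__":
    for name, blob in (("protobuf", as_proto), ("json", as_json), ("xml", as_xml)):
        print(f"{name:9s}{len(blob):4d} bytes")
```

Most of the saving here comes from replacing field names with one-byte tags, so the ratio grows with the number of fields and shrinks when payloads are dominated by long strings.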

Comparison with Similar Tools

JSON — human-readable but larger payloads and no schema enforcement
MessagePack — binary JSON, compact but lacks schema and code generation
Apache Avro — schema-embedded format popular in Hadoop, uses JSON schemas
FlatBuffers — zero-copy access for game engines but more complex API
Cap'n Proto — zero-copy with RPC support but smaller ecosystem than protobuf

FAQ

Q: Can I use protobuf without gRPC? A: Yes. Protobuf is a standalone serialization library. gRPC uses it as the default codec, but you can serialize protobuf messages to files, queues, or any transport.
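One practical detail when writing messages to a file or queue: an encoded protobuf message does not record its own length, so several messages written back to back cannot be split apart again. A common convention is to prefix each message with its size as a varint. The sketch below shows that framing in pure Python. The function names are hypothetical, and the payloads are treated as opaque bytes that would normally come from a generated serializer.

```python
import io
from typing import BinaryIO, Iterator, Optional


def write_varint(stream: BinaryIO, n: int) -> None:
    """Write a non-negative integer as a base-128 varint."""
    while n > 0x7F:
        stream.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    stream.write(bytes([n]))


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint; return None on a clean end of stream."""
    result, shift = 0, 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a length prefix")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write the payload size, then the payload itself."""
    write_varint(stream, len(payload))
    stream.write(payload)


def read_messages(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed payload until the stream is exhausted."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a message")
        yield payload


if __name__ == "__main__":
    buffer = io.BytesIO()
    for message in (b"\x08\x96\x01", b"\x08\x01"):
        write_delimited(buffer, message)
    buffer.seek(0)
    print([m.hex() for m in read_messages(buffer)])
```

The same framing works for a file on disk opened in binary mode; a message broker that already delivers discrete messages does not need it.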

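Q: Is protobuf human-readable? A: The binary format is not human-readable. For debugging, use protoc --decode (which needs the message type and its .proto file), protoc --decode_raw when no schema is at hand, or the text-format representation. When output must be human-readable, the canonical JSON mapping is supported.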

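Q: How does schema evolution work? A: Each field has a unique number, and only that number (not the field name) appears on the wire. New fields can be added and old ones removed without breaking existing code, as long as field numbers are not reused and the types of existing fields are not changed incompatibly. Marking removed numbers as reserved in the .proto file prevents accidental reuse.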
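The sketch below illustrates why an added field does not break an older reader. It is a hand-written decoder, not generated code: the Person schema, the email field, and the function names are assumptions for the example. The reader knows only fields 1 and 2 and uses each record's wire type to step over anything else.

```python
from typing import Dict, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def skip_field(buf: bytes, pos: int, wire_type: int) -> int:
    """Advance past a field the reader does not know, based on its wire type."""
    if wire_type == 0:            # varint
        _, pos = read_varint(buf, pos)
    elif wire_type == 1:          # fixed 64-bit
        pos += 8
    elif wire_type == 2:          # length-delimited
        length, pos = read_varint(buf, pos)
        pos += length
    elif wire_type == 5:          # fixed 32-bit
        pos += 4
    else:
        raise ValueError(f"wire type {wire_type} is not handled in this sketch")
    return pos


def decode_person_v1(buf: bytes) -> Dict[str, Union[int, str]]:
    """Older reader: knows only `int32 id = 1` and `string name = 2`."""
    person: Dict[str, Union[int, str]] = {"id": 0, "name": ""}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if field_number == 1 and wire_type == 0:
            person["id"], pos = read_varint(buf, pos)
        elif field_number == 2 and wire_type == 2:
            length, pos = read_varint(buf, pos)
            person["name"] = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            pos = skip_field(buf, pos, wire_type)  # unknown to this reader
    return person


# Bytes from a hypothetical newer writer that added `string email = 3`.
V2_PAYLOAD = bytes.fromhex("0896011207" + "74657374696e67" + "1a03614062")

if __name__ == "__main__":
    print(decode_person_v1(V2_PAYLOAD))  # {'id': 150, 'name': 'testing'}
```

If a later schema reused number 3 for a field of a different type, this reader would still skip it, but any reader that knew the old meaning of field 3 would misinterpret the bytes, which is the reason for never reusing numbers.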

Q: Which languages are supported? A: Official support includes C++, Java, Python, Go, C#, Ruby, Objective-C, PHP, Dart, and Kotlin. Community plugins cover Rust, Swift, TypeScript, and others.
