Scripts · April 30, 2026 · 1 min read

Protocol Buffers — Language-Neutral Data Serialization by Google

Protocol Buffers (protobuf) is Google's language-neutral, platform-neutral mechanism for serializing structured data, used extensively in gRPC and microservice communication.

Introduction

Protocol Buffers (protobuf) is a data serialization format created by Google for inter-service communication. It defines a schema in .proto files and generates strongly typed code for over a dozen languages, producing compact binary payloads that are smaller and faster to parse than JSON or XML.

What Protocol Buffers Does

  • Defines structured data schemas in a language-agnostic .proto IDL
  • Generates serialization and deserialization code for C++, Java, Python, Go, Rust, and more
  • Produces compact binary encoding that reduces payload size compared to text formats
  • Supports schema evolution with backward and forward compatibility via field numbering
  • Serves as the default wire format for gRPC remote procedure calls
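As a minimal illustration of the IDL, a hypothetical `user.proto` schema might look like this (the package, message, and field names are invented for this example):

```proto
syntax = "proto3";

package example;

// A hypothetical message; each field carries a unique number
// that identifies it on the wire.
message User {
  int64 id = 1;              // varint-encoded integer
  string email = 2;          // length-delimited UTF-8 string
  repeated string tags = 3;  // repeated fields map to lists/arrays
}
```

Running `protoc` over this file produces a `User` class (or struct) in each target language, with typed accessors for `id`, `email`, and `tags`.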

Architecture Overview

A .proto file declares message types with numbered fields of scalar or composite types. The protoc compiler reads these definitions and, through built-in generators or language-specific plugins, emits source code containing builder, accessor, and codec methods. At runtime the generated code serializes objects into a compact tag/value binary format, where each field is prefixed with a tag encoding its field number and wire type (length-delimited fields such as strings and nested messages also carry a length prefix), and deserializes them back, skipping unknown fields for forward compatibility. Reflection and descriptor APIs allow dynamic inspection of schemas at runtime.
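The tag and varint rules are simple enough to sketch in a few lines of pure Python. The following is an illustrative re-implementation for a single varint field, not the official library API:

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint:
    7 bits per byte, least-significant group first,
    high bit set on every byte except the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_field(field_number: int, value: int) -> bytes:
    """Encode one varint field. The tag packs the field number and
    wire type together: tag = (field_number << 3) | wire_type,
    where wire type 0 means varint."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)

# The classic example from the encoding documentation:
# field 1 set to 150 serializes to the three bytes 08 96 01.
payload = encode_field(1, 150)
```

Three bytes for a field name plus an integer value is what makes protobuf payloads so much smaller than their JSON equivalents.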

Self-Hosting & Configuration

  • Install protoc from the official GitHub releases or via system package managers
  • Place .proto files in a shared repository or use Buf Schema Registry for team workflows
  • Use the built-in generators (--python_out, --java_out, --cpp_out) or plugins such as protoc-gen-go for target language output
  • Integrate protoc into build pipelines via Bazel rules, Gradle plugins, or Makefiles
  • Adopt Buf CLI for linting, breaking-change detection, and code generation management
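A typical invocation, assuming a `proto/` source directory, a `gen/` output directory, and a hypothetical `user.proto` file (the paths are placeholders):

```shell
# Generate Python bindings (the Python generator is built into protoc)
protoc -I=proto --python_out=gen proto/user.proto

# Generate Go bindings via the protoc-gen-go plugin
protoc -I=proto --go_out=gen proto/user.proto
```

The `-I` flag sets the import search path that `import` statements inside .proto files are resolved against.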

Key Features

  • Compact binary encoding is 3-10x smaller and 20-100x faster than XML
  • Strongly typed code generation catches schema mismatches at compile time
  • Field numbering enables adding or removing fields without breaking existing clients
  • First-class support in gRPC for building high-performance RPC services
  • Mature ecosystem with editions, proto2, and proto3 syntax variants
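The forward-compatibility claim falls out of the wire format: a decoder reads each field's tag, and any field number it does not recognize can still be skipped by wire type alone. A minimal pure-Python sketch of that behavior (not the official API) for the two most common wire types:

```python
def read_varint(buf: bytes, i: int):
    """Read one varint starting at index i; return (value, next_index)."""
    shift = result = 0
    while True:
        b = buf[i]
        i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def decode(buf: bytes, known: dict):
    """Decode a message, keeping only fields whose numbers appear in
    `known` (field number -> name) and silently skipping the rest."""
    fields, i = {}, 0
    while i < len(buf):
        tag, i = read_varint(buf, i)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, i = read_varint(buf, i)
        elif wire_type == 2:  # length-delimited: strings, bytes, messages
            length, i = read_varint(buf, i)
            value, i = buf[i:i + length], i + length
        else:
            raise NotImplementedError("sketch handles wire types 0 and 2 only")
        if number in known:   # unknown field numbers are skipped, not errors
            fields[known[number]] = value
    return fields

# A "new" writer adds field 2; an "old" reader that only knows
# field 1 still decodes cleanly and ignores the extra field.
new_message = b"\x08\x96\x01\x12\x02hi"  # field 1 = 150, field 2 = "hi"
old_view = decode(new_message, {1: "id"})
```

This is why reusing a retired field number is the one forbidden move: an old reader would misinterpret the new data as the old field.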

Comparison with Similar Tools

  • JSON — human-readable but larger payloads and no schema enforcement
  • MessagePack — binary JSON, compact but lacks schema and code generation
  • Apache Avro — schema-embedded format popular in Hadoop, uses JSON schemas
  • FlatBuffers — zero-copy access for game engines but more complex API
  • Cap'n Proto — zero-copy with RPC support but smaller ecosystem than protobuf
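To make the size difference against JSON concrete, compare a hand-encoded protobuf payload with its JSON equivalent. The two-field message layout is made up for this example; the bytes follow the varint and length-delimited rules described above:

```python
import json

# Hypothetical message: id = 150 (field 1, varint) and
# email = "a@b.co" (field 2, length-delimited), hand-encoded:
#   08 96 01         -> field 1, varint 150
#   12 06 "a@b.co"   -> field 2, 6-byte string
protobuf_payload = b"\x08\x96\x01\x12\x06a@b.co"

json_payload = json.dumps({"id": 150, "email": "a@b.co"}).encode()

# The binary payload omits field names and quoting, so it is a
# fraction of the JSON size even for this tiny message.
print(len(protobuf_payload), len(json_payload))
```

The gap widens as field names get longer, since protobuf transmits only field numbers while JSON repeats every key as a string.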

FAQ

Q: Can I use protobuf without gRPC? A: Yes. Protobuf is a standalone serialization library. gRPC uses it as the default codec, but you can serialize protobuf messages to files, queues, or any transport.

Q: Is protobuf human-readable? A: The binary format is not human-readable. Use protoc --decode or the text-format representation for debugging. For human-readable needs, JSON mapping is supported.

Q: How does schema evolution work? A: Each field has a unique number. New fields can be added and old ones removed without breaking existing code, as long as field numbers are not reused.

Q: Which languages are supported? A: Official support includes C++, Java, Python, Go, C#, Ruby, Objective-C, PHP, Dart, and Kotlin. Community plugins cover Rust, Swift, TypeScript, and others.
