Configs · Apr 29, 2026 · 3 min read

Protocol Buffers — Language-Neutral Data Serialization by Google

Protocol Buffers (protobuf) is Google's language-neutral, platform-neutral mechanism for serializing structured data. Its binary encoding is typically more compact and faster to parse than XML or JSON, which makes it a common choice for inter-service communication and data storage.

Introduction

Protocol Buffers (protobuf) is a data serialization format developed by Google for internal RPC systems. It uses a schema definition language (.proto files) to describe data structures, then generates efficient serialization code for C++, Java, Python, Go, C#, and many other languages. Protobuf is the default wire format for gRPC.

What Protocol Buffers Does

  • Defines data structures in .proto schema files with strong typing
  • Generates serialization and deserialization code for 10+ languages
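  • Encodes data into a compact binary format that is usually much smaller than the equivalent JSON or XML (the protobuf documentation has cited 3-10x relative to XML; actual ratios depend on the data)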
  • Supports schema evolution with backward and forward compatibility
  • Powers gRPC as the default serialization layer for RPC communication

Architecture Overview

Protobuf uses a two-phase workflow. First, developers define message types in .proto files using a compact IDL. The protoc compiler then generates language-specific classes with serialization methods. At runtime, each field is written as a tag (the field number combined with a wire type) followed by its value; length-delimited types such as strings, bytes, and nested messages also carry a length prefix. Because fields are identified by number rather than by name, parsers can skip fields they do not recognize, which is what allows schemas to evolve without breaking existing consumers.
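To make the wire format concrete, here is a minimal pure-Python sketch of how a tag and a value end up as bytes. It is hand-written for illustration and is not the code protoc generates; the message layout (field 1 as an integer id, field 2 as a string name) and all function names are assumptions, and only non-negative integers and strings are handled.

```python
WIRE_VARINT = 0  # used for int32, int64, uint32, uint64, bool, enum
WIRE_LEN = 2     # used for string, bytes, embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A tag is the field number shifted left by 3, OR-ed with the wire type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_uint_field(field_number: int, value: int) -> bytes:
    return encode_tag(field_number, WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    # Length-delimited fields carry a varint length before the payload.
    return encode_tag(field_number, WIRE_LEN) + encode_varint(len(payload)) + payload


# Hypothetical message: field 1 = id (150), field 2 = name ("Ada").
encoded = encode_uint_field(1, 150) + encode_string_field(2, "Ada")
print(encoded.hex(" "))  # 08 96 01 12 03 41 64 61
```

The eight bytes contain no field names at all, only numbers, wire types, and values, which is where most of the size advantage over text formats comes from.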

Self-Hosting & Configuration

  • Install protoc from GitHub releases or via package managers
  • Write .proto files in proto3 syntax for modern projects
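  • Generate code with protoc: --java_out, --cpp_out, and --python_out are built in, while other targets need a plugin on the PATH (for example, --go_out requires protoc-gen-go): protoc --java_out=. --go_out=. schema.proto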
  • Use buf (bufbuild/buf) for linting, breaking change detection, and dependency management
  • Integrate with build systems via Bazel rules, Gradle plugins, or CMake modules
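The steps above can be sketched as a small script that writes a proto3 schema and runs protoc on it. This is only an illustration under stated assumptions: the schema, the file name user.proto, and the helper names are hypothetical, and --python_out is used here because the example itself is in Python. The script only invokes protoc if it is found on the PATH; otherwise it prints the command it would have run.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical schema, used only for illustration.
SCHEMA = '''syntax = "proto3";

package example;

message User {
  uint64 id = 1;
  string name = 2;
}
'''


def build_protoc_command(proto_file: Path, out_dir: Path) -> list:
    """Build the protoc command line for generating Python code."""
    return [
        "protoc",
        f"--proto_path={proto_file.parent}",  # root for resolving the .proto file
        f"--python_out={out_dir}",            # built-in Python generator
        str(proto_file),
    ]


def generate(work_dir: Path) -> bool:
    """Write the schema and run protoc; return True if code was generated."""
    proto_file = work_dir / "user.proto"
    proto_file.write_text(SCHEMA, encoding="utf-8")
    command = build_protoc_command(proto_file, work_dir)
    if shutil.which("protoc") is None:
        print("protoc not found; would run:", " ".join(command))
        return False
    subprocess.run(command, check=True)
    # protoc names the Python output <basename>_pb2.py.
    return (work_dir / "user_pb2.py").exists()


with tempfile.TemporaryDirectory() as tmp:
    generate(Path(tmp))
```

In a real project this step usually lives in the build system or in a buf configuration rather than in an ad hoc script.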

Key Features

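  • Binary encoding is compact and fast to parse; the 3-10x size and 20-100x parsing-speed figures often quoted come from comparisons with XML, so measure against JSON on your own data
  • Schema evolution lets you add new fields or retire old ones without breaking existing clients, provided field numbers are never reused or changed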
  • Code generation eliminates manual serialization and reduces bugs
  • First-class support in gRPC for high-performance RPC across languages
  • Well-Known Types provide standard definitions for timestamps, durations, and wrappers
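The schema-evolution property can be illustrated with a small hand-written decoder. This is a sketch under assumptions, not how generated code is structured: the function names, the field layout, and the sample bytes are hypothetical. It reads tags one at a time and skips any field number the reader does not know, which is the behaviour that lets an old client consume data written with a newer schema.

```python
def read_varint(buf: bytes, pos: int):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # no continuation bit: this was the last byte
            return result, pos
        shift += 7


def decode_known_fields(buf: bytes, known: dict) -> dict:
    """Decode the fields listed in `known` ({number: name}); skip the rest."""
    message = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        if field_number in known:
            message[known[field_number]] = value
        # Unknown field numbers are skipped, not treated as errors.
    return message


# Bytes from a hypothetical newer writer: id = 150, name = "Ada", plus a
# field 3 (email) that the older reader has never heard of.
email = b"ada@example.com"
new_bytes = bytes.fromhex("08 96 01 12 03 41 64 61") + bytes([0x1A, len(email)]) + email

old_reader = decode_known_fields(new_bytes, {1: "id", 2: "name"})
print(old_reader)  # {'id': 150, 'name': b'Ada'}
```

Skipping works because the wire type alone tells the parser how many bytes a field occupies, so it never needs the schema for fields it ignores.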

Comparison with Similar Tools

  • FlatBuffers — Zero-copy access without parsing; better for latency-critical paths like games
  • Apache Thrift — Similar IDL-based approach with built-in RPC; broader transport options
  • MessagePack — Schema-less binary format; simpler but no code generation or type safety
  • Cap'n Proto — Zero-copy like FlatBuffers with an RPC system; smaller community
  • JSON — Human-readable and universal; significantly larger and slower for high-throughput systems

FAQ

Q: Should I use proto2 or proto3? A: Use proto3 for new projects. It has a simpler syntax, drops required fields, and is the syntax used throughout the gRPC documentation.

Q: Can I convert between protobuf and JSON? A: Yes. Most protobuf libraries include JSON serialization. The canonical mapping is defined in the protobuf spec.

Q: How do I handle schema changes safely? A: Never reuse field numbers. Add new fields with new numbers. Use reserved to prevent accidental reuse of removed fields.
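The rules in this answer can be expressed as a tiny check between two versions of a message. This is a hypothetical helper written for illustration, not part of protoc or buf (buf's breaking-change detection is the tool to use in practice). It assumes each schema version is given as a dictionary mapping field number to a (name, type) pair, and it only catches type changes, use of reserved numbers, and removals that were not reserved.

```python
def check_evolution(old_fields: dict, new_fields: dict, reserved: set) -> list:
    """Return human-readable problems found between two schema versions.

    Fields are given as {number: (name, type)}; `reserved` holds the numbers
    the new schema has marked as reserved.
    """
    problems = []
    for number, (name, type_) in sorted(new_fields.items()):
        if number in reserved:
            problems.append(f"field {number} ({name}) uses a reserved number")
        elif number in old_fields and old_fields[number][1] != type_:
            old_name, old_type = old_fields[number]
            problems.append(
                f"field {number} changed type from {old_type} to {type_} "
                f"({old_name} -> {name})"
            )
    for number, (name, _) in sorted(old_fields.items()):
        if number not in new_fields and number not in reserved:
            problems.append(f"field {number} ({name}) was removed but not reserved")
    return problems


old = {1: ("id", "uint64"), 2: ("name", "string"), 3: ("email", "string")}

# Bad: field 3 is reused for a different field with a different type.
reused = {1: ("id", "uint64"), 2: ("name", "string"), 3: ("age", "int32")}
print(check_evolution(old, reused, reserved=set()))

# Good: email is retired, its number is reserved, and age gets a new number.
# In the .proto file this corresponds to writing `reserved 3;` in the message.
safe = {1: ("id", "uint64"), 2: ("name", "string"), 4: ("age", "int32")}
print(check_evolution(old, safe, reserved={3}))
```

A reuse that keeps the same type but changes the meaning of a field is not detected by this sketch, which is one more reason to reserve every retired number.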

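Q: Is protobuf suitable for long-term storage? A: Yes, as long as you manage schema evolution carefully. The wire format is stable, but it is not self-describing: the bytes carry field numbers and wire types, not field names or full types. Storing a FileDescriptorSet alongside the data lets future readers decode it without the original .proto files.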
