Protocol Buffers


Protocol Buffers is a method of serializing structured data. It is useful for developing programs that communicate with each other over a wire or that store data. The method involves an interface description language that describes the structure of some data and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.

Overview

Google developed Protocol Buffers for internal use and provided a code generator for multiple languages under an open source license.
The design goals for Protocol Buffers emphasized simplicity and performance. In particular, it was designed to be smaller and faster than XML.
Protocol Buffers are widely used at Google for storing and interchanging all kinds of structured information. The method serves as a basis for a custom remote procedure call system that is used for nearly all inter-machine communication at Google.
Protocol Buffers is similar to Apache Thrift and Microsoft Bond, and it also offers a concrete RPC protocol stack, called gRPC, to use for defined services.
Data structures and services are described in a proto definition file and compiled with protoc. This compilation generates code that can be invoked by a sender or recipient of these data structures. For example, example.pb.cc and example.pb.h are generated from example.proto. They define C++ classes for each message and service in example.proto.
Canonically, messages are serialized into a binary wire format which is compact and forward- and backward-compatible, but not self-describing: the names, meaning, and full data types of fields cannot be determined without an external specification. There is no defined way to include or refer to such an external specification within a Protocol Buffers file. The officially supported implementation includes an ASCII serialization format, but this format, though self-describing, loses the forward- and backward-compatibility behavior and is thus not a good choice for applications other than debugging.
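To make the compactness of the binary format concrete, the following C++ sketch hand-encodes a single integer field the way the wire format lays it out: a key that combines the field number with a wire type, followed by the value as a base-128 varint. The helper names (appendVarint, encodeVarintField) are illustrative and are not part of the Protocol Buffers library; a real program would use the classes generated by protoc instead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Two of the wire types defined by the binary format.
enum WireType : std::uint32_t { kVarint = 0, kLengthDelimited = 2 };

// Appends `value` as a base-128 varint: 7 payload bits per byte,
// least-significant group first, high bit set on every byte but the last.
void appendVarint(std::uint64_t value, std::vector<std::uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Encodes one varint-typed field: key = (field_number << 3) | wire_type,
// followed by the value. The field name never appears in the output.
std::vector<std::uint8_t> encodeVarintField(std::uint32_t fieldNumber,
                                            std::uint64_t value) {
  assert(fieldNumber >= 1);  // field numbers start at 1
  std::vector<std::uint8_t> out;
  appendVarint((static_cast<std::uint64_t>(fieldNumber) << 3) | kVarint, out);
  appendVarint(value, out);
  return out;
}
```

Under these assumptions, encodeVarintField(1, 150) yields the three bytes 08 96 01: one byte for the key and two for the value, with no field name anywhere in the output.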
Though the primary purpose of Protocol Buffers is to facilitate network communication, its simplicity and speed make Protocol Buffers an alternative to data-centric C++ classes and structs, especially where interoperability with other languages or systems might be needed in the future.

Example

A schema for a particular use of protocol buffers associates data types with field names, using integers to identify each field.

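// polyline.proto
syntax = "proto2";

// Scalar types and the field names in Line and Polyline are assumed for illustration.
message Point { required int32 x = 1; required int32 y = 2; optional string label = 3; }
message Line { required Point start = 1; required Point end = 2; optional string label = 3; }
message Polyline { repeated Point point = 1; optional string label = 2; }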

The "Point" message defines two mandatory data items, x and y, and one optional data item, label. Each data item has a tag, which is defined after the equals sign. For example, x has the tag 1.
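Because only the tag travels with each value, a reader needs the schema to attach names to the numbers it finds. The sketch below is a hand-written illustration rather than the generated parser: it walks a byte buffer and collects integer fields by tag. The names readVarint and parseVarintFields are hypothetical, and only wire types 0 (varint) and 2 (length-delimited) are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// Reads one base-128 varint starting at `pos` and advances `pos` past it.
std::uint64_t readVarint(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
  std::uint64_t result = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= buf.size()) throw std::runtime_error("truncated varint");
    const std::uint8_t byte = buf[pos++];
    result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return result;  // high bit clear: last byte
  }
  throw std::runtime_error("varint too long");
}

// Collects every varint field keyed by its tag and skips length-delimited
// fields; other wire types are rejected to keep the sketch short.
std::map<std::uint32_t, std::uint64_t> parseVarintFields(
    const std::vector<std::uint8_t>& buf) {
  std::map<std::uint32_t, std::uint64_t> fields;
  std::size_t pos = 0;
  while (pos < buf.size()) {
    const std::uint64_t key = readVarint(buf, pos);
    const std::uint32_t tag = static_cast<std::uint32_t>(key >> 3);
    const std::uint32_t wireType = static_cast<std::uint32_t>(key & 0x7);
    if (wireType == 0) {
      fields[tag] = readVarint(buf, pos);  // a later occurrence overwrites an earlier one
    } else if (wireType == 2) {
      const std::uint64_t length = readVarint(buf, pos);
      if (length > buf.size() - pos) throw std::runtime_error("truncated field");
      pos += static_cast<std::size_t>(length);  // skip the payload unread
    } else {
      throw std::runtime_error("wire type not handled in this sketch");
    }
    assert(pos <= buf.size());  // invariant: never advance past the end
  }
  return fields;
}
```

Fed the bytes 08 0A 10 14 1A 02 68 69, the function returns tag 1 with value 10 and tag 2 with value 20, and steps over the length-delimited field with tag 3. Mapping those tags back to field names requires the .proto file.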
The "Line" and "Polyline" messages, which both use Point, demonstrate how composition works in Protocol Buffers. Polyline has a repeated field, which behaves like a vector.
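On the wire, composition and repetition are both simple: a nested message is written as a length-delimited field, and a repeated message field is the same key written once per element. The following sketch assumes, for illustration only, that Point stores x and y under tags 1 and 2 and that Polyline stores its repeated Point under tag 1. The Point struct and the encode functions are hand-written stand-ins, not the classes protoc generates.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends `value` as a base-128 varint, least-significant 7 bits first.
void appendVarint(std::uint64_t value, Bytes& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical plain struct standing in for the generated Point class.
struct Point {
  std::int32_t x;
  std::int32_t y;
};

// Point body: x under tag 1 and y under tag 2, both as varints (assumed tags).
Bytes encodePoint(const Point& p) {
  assert(p.x >= 0 && p.y >= 0);  // negative values are left out of this sketch
  Bytes out;
  appendVarint((1u << 3) | 0u, out);  // key: tag 1, wire type 0
  appendVarint(static_cast<std::uint64_t>(p.x), out);
  appendVarint((2u << 3) | 0u, out);  // key: tag 2, wire type 0
  appendVarint(static_cast<std::uint64_t>(p.y), out);
  return out;
}

// Polyline body: every Point becomes a length-delimited field (wire type 2)
// under the same tag, 1 (assumed); repetition is the key appearing again.
Bytes encodePolyline(const std::vector<Point>& points) {
  Bytes out;
  for (const Point& p : points) {
    const Bytes body = encodePoint(p);
    appendVarint((1u << 3) | 2u, out);  // key: tag 1, wire type 2
    appendVarint(body.size(), out);     // byte length of the nested message
    out.insert(out.end(), body.begin(), body.end());
  }
  return out;
}
```

With these assumed tags, a polyline through (10, 20) and (30, 40) encodes as 0A 04 08 0A 10 14 0A 04 08 1E 10 28: the key 0A and a length prefix appear once per point, which is why a repeated field can be read back as a vector.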
This schema can subsequently be compiled for use by one or more programming languages. Google provides a compiler called protoc which can produce output for C++, Java or Python. Other schema compilers are available from other sources to create language-dependent output for over 20 other languages.
For example, after a C++ version of the protocol buffer schema above is produced, a C++ source code file, polyline.cpp, can use the message objects as follows:

// polyline.cpp
#include "polyline.pb.h"  // generated by calling "protoc --cpp_out=. polyline.proto"
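Line* createNewLine(/* ... */);          // builds and returns a Line message
Polyline* createNewPolyline(/* ... */);  // builds and returns a Polyline message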

Language support

proto2 provides a code generator for C++, Java, C#, and Python.
Third-party implementations are also available for JavaScript.
proto3 provides a code generator for C++, Java, Python, Go, Ruby, Objective-C and C#. JavaScript has been supported since 3.0.0 Beta 2.
Third-party implementations are also available for C, Dart, Haskell, Perl, PHP, R, Rust, Scala, Swift and Julia.