gRPC and Protocol Buffers as an alternative to JSON REST APIs

Introduction

gRPC is an open-source remote procedure call (RPC) framework, and Protocol Buffers is a mechanism for serializing structured data. Both were developed by Google and are used in its internal and public APIs. Other big players such as Cisco and Netflix already rely on this technology for mission-critical applications.

In this post, we will learn the core features of gRPC and Protobuf and compare them to JSON REST APIs.

gRPC

A remote procedure call framework is used when applications running in different processes need to exchange data. It shouldn't matter whether they are running on the same machine or on different ones, or whether one is running in the cloud and the other in a desktop/mobile client. The gRPC framework handles all of these cases with great performance, and it's available in many languages, such as Java, C++, Python, Ruby, Go, and more.

gRPC uses HTTP/2 as its transport protocol. It benefits from the multiplexing feature of HTTP/2 to execute requests and responses in parallel over a single TCP connection, which reduces hardware resource usage on both client and server compared to HTTP/1.1.
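To make the multiplexing idea more concrete, here is a toy Python sketch. It is not HTTP/2 itself: real HTTP/2 frames also carry headers, flow control and more. The function names and the chunk size are made up for this post. It only shows how chunks tagged with a stream id can share one ordered channel and be put back together on the other side.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, payload, chunk_size=4):
    """Cut one message into (stream_id, chunk) frames."""
    return [
        (stream_id, payload[i:i + chunk_size])
        for i in range(0, len(payload), chunk_size)
    ]


def interleave(*framed_streams):
    """Round-robin frames from several streams onto one 'connection'."""
    connection = []
    for group in zip_longest(*framed_streams):
        connection.extend(frame for frame in group if frame is not None)
    return connection


def demultiplex(connection):
    """Reassemble each stream's payload from the shared frame sequence."""
    streams = defaultdict(bytes)
    for stream_id, chunk in connection:
        streams[stream_id] += chunk
    return dict(streams)


# Two independent requests share a single connection.
request_a = split_into_frames(1, b"GET /posts")
request_b = split_into_frames(3, b"POST /posts")
wire = interleave(request_a, request_b)
received = demultiplex(wire)
```

Neither request has to wait for the other to finish, yet each one arrives intact because every frame says which stream it belongs to.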

It also supports bi-directional streaming, which makes it capable of handling long-lived communication patterns such as chat and file download/upload split into chunks. Outside gRPC, these use cases are often implemented with WebSockets, Server-Sent Events, and chunked transfer encoding.
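In the Python gRPC library, a bi-directional streaming handler receives an iterator of incoming messages and yields outgoing ones. The sketch below imitates that shape with plain generators and no network at all. The chat function and the message texts are hypothetical, so treat it as an illustration of the pattern rather than real gRPC code.

```python
def chat(incoming_messages):
    """Consume a stream of messages and yield replies as they arrive."""
    for number, text in enumerate(incoming_messages, start=1):
        # A reply can be produced before the client has finished sending.
        yield f"ack {number}: {text}"


def client_messages():
    """Stand-in for a client that sends messages over time."""
    yield "hello"
    yield "how are you?"
    yield "bye"


replies = list(chat(client_messages()))
```

Because both sides are lazy iterators, neither needs the full conversation in memory, which is what makes the pattern suitable for chats and chunked transfers.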

Protocol Buffers

Protobuf, for short, is a language-independent mechanism for defining structured data. You can think of it as a format for defining message structures, similar to XML Schema Definition (XSD) and JSON Schema.

Protocol Buffers is composed of multiple pieces:

  • Message structure language: this is how message types and attributes are defined, which is similar in purpose to XML Schema Definition (XSD) and JSON Schema.

  • Compiler: a tool that translates the message structures that you defined into code of your target programming language(s).

  • Library: a runtime dependency that you must include in your application to perform marshaling and unmarshaling of Protobuf messages.

Although it's possible to use gRPC with other formats, Google has chosen Protobuf as the preferred format for several reasons - let's check out some of them:

  • It is strongly-typed, which reduces cryptic errors caused by sending/receiving data wrapped in unexpected data types, e.g. integer number in a string.

  • Marshaling and unmarshaling data into a binary wire format is faster than with XML and JSON, which improves application response time and reduces hardware usage.

  • The binary wire format is very compact, reducing bandwidth and memory usage. Because the binary doesn't carry attribute names, only short numeric field identifiers, it is easy to see how much more efficient it is than text-based serialization formats.

  • It's forward- and backward-compatible, which means that applications using different versions of a Protobuf message can marshal and unmarshal the binary format to a newer or older Protobuf message structure.
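The compatibility point is easier to see with a small decoder. The following is a simplified Python sketch of the wire format, not the official runtime: it handles only the four non-deprecated wire types and silently drops fields it doesn't know, while real libraries do more. The field names and the sample payload are made up for illustration.

```python
def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_known_fields(data, known_fields):
    """Parse a message, keeping known fields and skipping the rest.

    known_fields maps field number -> attribute name.
    """
    message = {}
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:    # length-delimited (strings, bytes, ...)
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        elif wire_type == 1:    # fixed 64-bit
            value = data[pos:pos + 8]
            pos += 8
        elif wire_type == 5:    # fixed 32-bit
            value = data[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if field_number in known_fields:
            message[known_fields[field_number]] = value
        # Unknown field numbers are skipped instead of causing an error.
    return message


# Bytes from a newer sender: field 1 = 7, field 2 = "a@b.c", field 3 = "new".
payload = b"\x08\x07\x12\x05a@b.c\x1a\x03new"

# An older reader only knows fields 1 and 2.
old_reader = decode_known_fields(payload, {1: "id", 2: "email"})
# A newer reader knows all three.
new_reader = decode_known_fields(payload, {1: "id", 2: "email", 3: "nickname"})
```

The older reader still gets its two fields because every field announces its own number and wire type, so a parser always knows how many bytes to skip.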

Hello World in Protocol Buffers

Let's see a quick example of defining a message type and an RPC service:

// File: api.proto
syntax = "proto3";

message User {
  int32 id = 1;
  string email = 2;
  string password = 3;
}

service UserService {
  rpc SignUp (User) returns (User);
}

The User message type specifies the attributes that represent a user in this API. UserService defines the interface through which a client interacts with the server; in this case it specifies a SignUp endpoint that can be used to register a user. A few more things to notice:

  • Each attribute must have a type - in this example, we used a 32-bit integer and strings.

  • The numbers at the end of each attribute of the User message are numbered tags, which identify each field in the protobuf binary format. Because the binary format carries these tags instead of attribute names, protobuf can marshal and unmarshal messages from different versions of the API.

  • The SignUp endpoint has two references to User - one is the input parameter received from the client, and the other is the return from the endpoint. For simplicity, I used the same message, which means the server will just echo the User message with the generated id filled in.
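To see what the numbered tags look like on the wire, here is a hand-rolled Python sketch that encodes the User message. It is a simplification under a few assumptions: it handles only non-negative integers and strings, it always writes every field (real proto3 encoders omit default values), and the sample values are made up. In practice you would use the code generated by the compiler instead.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_key(field_number, wire_type):
    """The key combines the numbered tag with the wire type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number, value):
    """Wire type 0: the value is written as a varint."""
    return encode_key(field_number, 0) + encode_varint(value)


def encode_string_field(field_number, text):
    """Wire type 2: a length prefix followed by the UTF-8 bytes."""
    raw = text.encode("utf-8")
    return encode_key(field_number, 2) + encode_varint(len(raw)) + raw


def encode_user(user_id, email, password):
    """Hand-encode the User message: id = 1, email = 2, password = 3."""
    return (
        encode_int_field(1, user_id)
        + encode_string_field(2, email)
        + encode_string_field(3, password)
    )


binary = encode_user(150, "a@b.c", "secret")
as_json = json.dumps({"id": 150, "email": "a@b.c", "password": "secret"})
print(binary.hex())
print(len(binary), "bytes as protobuf,", len(as_json.encode("utf-8")), "bytes as JSON")
```

The attribute names id, email and password never appear in the binary output. Each field is identified by a single key byte derived from its tag, which is where most of the size difference to JSON comes from in this example.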

Protocol Buffers vs JSON

As we mentioned before, JSON and protobuf are data representation formats. Here are the main differences between them:

  • A protobuf message is translated to a compact binary format before transmission, while JSON is typically transmitted encoded as UTF-8. The main advantage of protobuf here is the message size, which is typically smaller than the equivalent JSON.

  • Protobuf enforces that messages follow a schema definition before they are transmitted. There are two advantages here: 1) minimal data validation occurs automatically in the client when it builds a protobuf message, and 2) the .proto file can be distributed to clients as a reference for all message structures used in the API, which reduces the effort of maintaining API documentation.

  • Protobuf supports more attribute types, such as enums, bytes, duration, timestamp and user-defined types.
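The automatic validation mentioned above can be imitated in a few lines of Python. The class below is a hypothetical stand-in for the code that the compiler would generate for the User message. It is not the real generated class, and the try_build helper exists only for this example. The point is simply that a value of the wrong type is rejected when the message is built, before anything is sent.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for a class generated from the User message."""
    id: int = 0
    email: str = ""
    password: str = ""

    def __post_init__(self):
        # Reject values whose type does not match the declared field type.
        expected_types = (("id", int), ("email", str), ("password", str))
        for name, expected in expected_types:
            value = getattr(self, name)
            if not isinstance(value, expected) or isinstance(value, bool):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )


def try_build(**fields):
    """Return the built User, or the error message if the types are wrong."""
    try:
        return User(**fields)
    except TypeError as error:
        return str(error)


ok = try_build(id=1, email="a@b.c", password="secret")
rejected = try_build(id="1", email="a@b.c", password="secret")
```

With plain JSON, the string "1" in the id attribute would travel to the server unnoticed unless someone wrote an explicit check for it.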

You might have noticed that our protobuf message example from the previous section can be easily translated to JSON:

{
  "type": "user",
  "attributes": {
    "id": <some integer>,
    "email": "<some string>",
    "password": "<some string>"
  }
}

We're accustomed to communicating JSON message types by providing examples like the one above. JSON Schema is intended to solve that problem, but because it is optional and requires more work to use as a validator in both server and client, not many projects end up using it.

gRPC vs REST API

In gRPC, services and methods are the terms used when defining the interface of an API. A service is essentially a collection of methods that can be invoked remotely. Those concepts are very similar to how we define endpoints related to one resource in a REST API. Consider the following HTTP endpoints:

 GET /posts
POST /posts
 PUT /posts/:id

The gRPC equivalent to this API is:

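// Void is assumed to be an empty message defined elsewhere in the same file;
// google.protobuf.Empty is a commonly used alternative.
service PostService {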
  rpc List   (Void) returns (stream Post);
  rpc Create (Post) returns (Post);
  rpc Update (Post) returns (Post);
}

The service definition is used to generate stubs, which are pieces of code used by the client and the server to build and validate requests/responses.
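On the server side, you then provide the actual behavior for each method. The plain-Python sketch below imitates what such an implementation could look like for PostService. It uses no gRPC library, and the Post fields, the in-memory storage and the class name are all assumptions made for this post. The method names mirror the service definition, and List is a generator because the definition declares a stream of Post.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class Post:
    """Hypothetical stand-in for the generated Post message class."""
    id: int = 0
    title: str = ""


class InMemoryPostService:
    """Plain-Python imitation of a PostService implementation."""

    def __init__(self):
        self._posts = {}
        self._ids = count(1)

    def List(self, request=None):
        # 'returns (stream Post)': yield posts one at a time.
        yield from self._posts.values()

    def Create(self, post):
        # The server assigns the id and returns the stored post.
        created = replace(post, id=next(self._ids))
        self._posts[created.id] = created
        return created

    def Update(self, post):
        if post.id not in self._posts:
            raise KeyError(f"post {post.id} not found")
        self._posts[post.id] = post
        return post


service = InMemoryPostService()
first = service.Create(Post(title="Hello"))
service.Update(replace(first, title="Hello gRPC"))
titles = [post.title for post in service.List()]
```

Compared to the REST version, there are no URLs or HTTP verbs to parse: each endpoint is just a method that takes a typed message and returns one.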

When building REST APIs, the client and server are fully responsible for validating payloads. With gRPC, some validation comes for free because payloads need to comply with pre-defined message types before the request/response is sent. This is beneficial because it enforces a basic quality level in all endpoints of an API and reduces the amount of boilerplate code.

What's next?

I hope you now have a good idea of how gRPC can be a great tool for building your next API. Big players have already adopted this technology, which has led to numerous contributions to its ecosystem, such as libraries that improve the experience in the supported languages.

Check out the official website https://grpc.io for more examples and reference docs!