Solution for serialization and transmission of messages on limited-bandwidth channels
In this article we present a solution that we have implemented for serializing messages over a limited-bandwidth channel. This mechanism is based on the Protocol Buffers tool, developed by Google.
The Problem
Suppose we have a system in which highly important data packets must be exchanged between the field and a centralized application over limited-bandwidth communication channels, where only a limited number of bytes can be sent in each packet.
Furthermore, suppose that the data packets exchanged are of different types, and that some types of messages have variable size fields, such as text strings.
In this scenario, it is likely that some messages have a length that exceeds the maximum size that can be sent on the communication channel.
To be able to send every message that the field or the centralized application can create, we need a serialization method that provides a certain degree of message compression and that can also handle the case in which the serialized message is larger than the transmission limit.
The Solution
To solve the problem described in the previous point, we chose to use Protocol Buffers, a binary serialization tool developed by Google which is capable of serializing typed and structured data, obtaining a simple array of bytes as output.
The choice fell on this tool for a series of reasons.
First of all, it is an open-source tool developed by Google, modern and currently widely used in various fields and applications due to its high flexibility and compact output size, compared to other serialization mechanisms such as JSON.
Protocol Buffers also offers serialization and deserialization methods that are typically faster than those of text-based formats such as JSON.
For example, the accompanying image compares the serialization and deserialization of a message using JSON and Protocol Buffers, showing the time taken to perform a single operation.
Among the other advantages of this technology, we point out that Protocol Buffers is platform-independent and supports a range of languages, including C++, Java, C#, Kotlin and Python.
Finally, this mechanism works on typed and structured data, using automatically generated classes based on description files that define the structure of each type.
These description files have the extension .proto and contain the formal declaration of each message type. Based on the .proto files, Protocol Buffers provides a compiler that generates the respective classes in the supported languages.
Example
Example .proto file definition
Suppose, for example, that we need to serialize with Protocol Buffers a class Person that contains some information about a person:
- Numeric identifier
- Name
- Last name
- Address
- Email address
- Telephone number
The first step is to create a file Person.proto that describes this data structure:
// Protocol Buffer declaration
syntax = "proto3";
package tutorial;

// Java namespace declaration
option java_package = "com.example.tutorial.protos";

message Person {
  uint32 ID = 1;
  string NAME = 2;
  string SURNAME = 3;
  string ADDRESS = 4;
  string EMAIL = 5;
  string TEL = 6;
}
As we can see, the structure of the object is described by giving each field a type, a name and a unique number; the number identifies the field in the binary encoding. The supported types, as well as the syntax, are described in the official documentation.
In particular, the documentation details the data types made available by Protocol Buffers and the corresponding type in each of the supported languages.
Optional fields, lists (repeated fields), enumerations, generic types, and nesting of types within other message types are also supported.
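To illustrate how the field numbers end up in the serialized output, and why that output tends to be more compact than JSON, the following minimal sketch encodes a Person by hand in Python, following the publicly documented Protocol Buffers wire format (varint keys, length-prefixed strings). It is not the generated class and not the library's API: the helper names and the sample data are hypothetical, and in a real project the classes generated by the compiler should be used instead.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_uint32(field_number: int, value: int) -> bytes:
    """Key (field number, wire type 0) followed by the value as a varint."""
    if value == 0:
        return b""  # proto3 omits fields that hold their default value
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string(field_number: int, text: str) -> bytes:
    """Key (field number, wire type 2), byte length, then the UTF-8 bytes."""
    if not text:
        return b""  # proto3 omits empty strings
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def encode_person(person: dict) -> bytes:
    """Hand-rolled encoding of the Person message, fields in numeric order."""
    return (
        encode_uint32(1, person["ID"])
        + encode_string(2, person["NAME"])
        + encode_string(3, person["SURNAME"])
        + encode_string(4, person["ADDRESS"])
        + encode_string(5, person["EMAIL"])
        + encode_string(6, person["TEL"])
    )


# Hypothetical sample data, used only to compare the two encodings.
sample = {
    "ID": 150,
    "NAME": "Ada",
    "SURNAME": "Lovelace",
    "ADDRESS": "12 St James Square",
    "EMAIL": "ada@example.com",
    "TEL": "+44 20 0000 0000",
}

binary = encode_person(sample)
as_json = json.dumps(sample).encode("utf-8")
print(len(binary), "bytes as Protocol Buffers,", len(as_json), "bytes as JSON")
```

The field names never appear in the binary output: each field is identified only by its number, which is the main reason the message is shorter than its JSON equivalent.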
Generating Protocol Buffers Classes
After writing a .proto file, the next step is to generate the class in the required language so that it can be used within a project. To do so:
- Download the Protocol Buffers compiler (protoc) from the Protocol Buffers repository
- Launch the compiler, indicating the .proto file as input, the destination of the output file and the target language
For example, to obtain a Java class linked to the file Person.proto presented above, simply launch the command:
protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/Person.proto
where -I=$SRC_DIR indicates the base folder of the project, in which the compiler looks for .proto files, and --java_out=$DST_DIR tells the compiler to generate a Java class and save it under the path $DST_DIR.
Serialization and reduced transmission size
However, returning to the problem illustrated at the beginning of the article, the use of Protocol Buffers (as well as any other serialization process) does not solve the constraint relating to the maximum size that can be sent for each message.
For this reason we needed a mechanism that not only used Protocol Buffers for serialization, but also fragmented a single message into multiple blocks that respect the transmission limits imposed by the technology, and reconstructed it on the receiving side.
We therefore devised the following solution: a single message is broken into fragments based on the maximum size that can be sent on the channel, taking into account the header bytes added to each packet.
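The splitting step can be sketched as follows in Python. This is a minimal illustration rather than our production code: the function name and its parameters are hypothetical, and the header itself is not built here, only its size is reserved in each packet.

```python
from typing import List


def split_payload(payload: bytes, max_packet_size: int, header_size: int) -> List[bytes]:
    """Split a serialized message into bodies that fit next to a header.

    Each body is at most max_packet_size - header_size bytes long, so that
    header + body never exceeds the channel limit.
    """
    body_size = max_packet_size - header_size
    if body_size <= 0:
        raise ValueError("the header leaves no room for a fragment body")
    if not payload:
        return [b""]  # an empty message still travels as one fragment
    return [payload[i:i + body_size] for i in range(0, len(payload), body_size)]


# Usage: a 600-byte message, 255-byte packets, an assumed 7-byte header.
bodies = split_payload(bytes(600), max_packet_size=255, header_size=7)
print([len(body) for body in bodies])
```

Only the last fragment can be shorter than the others, which is why each header needs to carry the length of its own body.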
Content of Header Bytes
These header bytes are necessary because the message, once fragmented, must be reconstructed on the other side, and this operation requires information that allows the receiver to recognize these packets and reassemble them.
The information we have provided is:
- A series of special ASCII characters as initiators of each packet, used to discriminate these packets from other communications sent
- An incremental ID that groups together all the fragments that are part of a single message
- The length of the fragment body in bytes
- The index of the individual fragment (when a message is fragmented into N packets, from 0 to N-1)
- The total number of fragments
- The reference class of the message that was serialized via Protocol Buffers
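One possible way to lay out these fields is sketched below in Python with the standard struct module. The layout is an assumption made for illustration only: the marker characters, the width of each field (one byte each) and the field order are hypothetical and do not describe the actual header we use.

```python
import struct
from typing import NamedTuple, Tuple

MAGIC = b"#F"              # hypothetical ASCII marker that opens every packet
HEADER_FORMAT = ">2sBBBBB"  # marker, message id, body length, index, total, type id
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class FragmentHeader(NamedTuple):
    message_id: int   # groups the fragments of one message
    body_length: int  # length of the fragment body in bytes
    index: int        # position of this fragment, from 0 to total - 1
    total: int        # total number of fragments
    type_id: int      # identifies the serialized message class


def pack_fragment(message_id: int, index: int, total: int, type_id: int, body: bytes) -> bytes:
    """Prefix a fragment body with its header (struct rejects values above 255)."""
    header = struct.pack(HEADER_FORMAT, MAGIC, message_id, len(body), index, total, type_id)
    return header + body


def unpack_fragment(packet: bytes) -> Tuple[FragmentHeader, bytes]:
    """Validate the marker and split a packet back into header and body."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than the header")
    magic, *fields = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    if magic != MAGIC:
        raise ValueError("not a fragment packet")
    header = FragmentHeader(*fields)
    body = packet[HEADER_SIZE:HEADER_SIZE + header.body_length]
    if len(body) != header.body_length:
        raise ValueError("truncated fragment body")
    return header, body


# Usage: first of two fragments of message 3, carrying message type 1.
packet = pack_fragment(message_id=3, index=0, total=2, type_id=1, body=b"hello")
header, body = unpack_fragment(packet)
print(header, body)
```

With this assumed layout the header costs seven bytes per packet, and the marker check lets the receiver discard traffic that does not belong to the protocol.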
One of the challenges encountered in the definition of this protocol was deciding how to use the header bytes so as to keep the number of bytes reserved for this accessory information as small as possible.
At the same time, the header must be large enough to allow messages of different types to be fragmented without imposing overly strict limits on the maximum number of fragments that can be created.
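The trade-off can be made concrete with a small calculation. The Python sketch below assumes a hypothetical header made of five fixed bytes plus two counters of equal width (fragment index and total number of fragments), and uses the 255-byte limit of the TETRA network described in the Use Cases section; none of these widths is taken from our actual protocol.

```python
def max_message_size(max_packet_size: int, fixed_header_bytes: int, counter_bytes: int) -> int:
    """Largest serialized message that can be sent under a given header layout."""
    header_size = fixed_header_bytes + 2 * counter_bytes  # index + total
    body_size = max_packet_size - header_size
    if body_size <= 0:
        raise ValueError("the header leaves no room for a fragment body")
    max_fragments = 2 ** (8 * counter_bytes) - 1  # largest value the "total" field can hold
    return max_fragments * body_size


# One-byte counters: smaller header, but at most 255 fragments per message.
ONE_BYTE_COUNTERS = max_message_size(255, fixed_header_bytes=5, counter_bytes=1)
# Two-byte counters: two more header bytes in every packet, far more fragments.
TWO_BYTE_COUNTERS = max_message_size(255, fixed_header_bytes=5, counter_bytes=2)
print(ONE_BYTE_COUNTERS, TWO_BYTE_COUNTERS)
```

Under these assumptions, one-byte counters already allow messages of about 63 KB, while two-byte counters raise the ceiling to roughly 16 MB at the cost of two extra bytes in every packet.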
Considerations
Through this procedure we were able to transmit variable length information over a communication channel with constraints on the maximum payload size.
Using Protocol Buffers we were able to execute performant and optimized serialization operations while maintaining the level of abstraction offered by the .proto description files.
Finally, through our fragmentation mechanism, we overcame the constraint on the maximum size that can be transmitted on the communication channel.
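The reconstruction on the receiving side can be sketched as follows in Python. This is a simplified illustration with hypothetical names: it assumes the header has already been parsed into its fields, tolerates fragments arriving out of order, and leaves out aspects such as timeouts for incomplete messages and the reuse of message IDs.

```python
from typing import Dict, Optional


class Reassembler:
    """Collects fragments per message ID and returns the message once complete."""

    def __init__(self) -> None:
        # message_id -> {fragment index -> fragment body}
        self._pending: Dict[int, Dict[int, bytes]] = {}

    def add(self, message_id: int, index: int, total: int, body: bytes) -> Optional[bytes]:
        """Store one fragment; return the full message when all have arrived."""
        if not 0 <= index < total:
            raise ValueError("fragment index out of range")
        parts = self._pending.setdefault(message_id, {})
        parts[index] = body  # a duplicate simply overwrites the earlier copy
        if len(parts) < total:
            return None  # still waiting for other fragments
        del self._pending[message_id]
        return b"".join(parts[i] for i in range(total))


# Usage: three fragments of message 7 arriving out of order.
receiver = Reassembler()
first = receiver.add(7, index=1, total=3, body=b"lo, ")
second = receiver.add(7, index=2, total=3, body=b"world")
message = receiver.add(7, index=0, total=3, body=b"hel")
print(message)
```

The bytes returned by the last call are the original serialized message, which can then be deserialized with the Protocol Buffers class indicated in the header.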
Use Cases
We used this system to exchange messages over a TETRA radio network between Android devices in the field and a .NET Core application in the cloud.
TETRA – TErrestrial Trunked RAdio – is a professional two-way cellular radio system standardized by ETSI.
TETRA technology has been specifically designed for use by public administrations and emergency services (police forces, firefighters, healthcare services, etc.), for public safety networks, for railway transportation and airport personnel, as well as for military services.
Since they are aimed at these types of users, TETRA networks provide a high level of reliability and a series of connection modes to cover the greatest number of situations, including a Direct Mode which allows multiple radio devices to communicate with each other in the event of infrastructure failures.
Specifically, the TETRA radio network with which we interfaced limits the size of the messages exchanged to a maximum of 255 bytes. Since the objective was to allow different types of messages with a larger size to transit, we used this serialization system, successfully verifying its performance and effectiveness.