protobuf + MQTT is yummy fast goodness

MQTT is very fast and very efficient, but payload size and serialization speed matter too.  XML is too heavy and slow for mobile.  JSON is much better.  But for the absolute smallest wire size and fastest serialization you need binary.  The most obvious and mature solution is Google Protocol Buffers aka “protobuf”.
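To make the size claim concrete, here is a rough stdlib-only sketch. The sensor reading and its field layout are made up for illustration, and struct.pack merely mimics a fixed binary layout; it is not actual protobuf encoding:

```python
# Hypothetical sensor reading: shows why a binary framing is smaller
# than the equivalent JSON text. Fields are invented for this example.
import json
import struct

reading = {"id": 42, "temp": 23.5, "ts": 1700000000}

as_json = json.dumps(reading).encode("utf-8")
# uint32 id + float64 temp + uint32 timestamp = 4 + 8 + 4 = 16 bytes
as_binary = struct.pack("<IdI", reading["id"], reading["temp"], reading["ts"])

print(len(as_json), len(as_binary))  # JSON is well over 2x larger here
```

Real protobuf does even better than this fixed layout for small numbers, thanks to varint encoding (more on that below).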

Benchmarks of JSON vs protobuf vary, but most show protobuf payloads around 2x smaller and serialization/deserialization 2.5 – 4x faster.

I have “quite large” projects going on using the protobuf+MQTT combo, in situations where the requirement is pushing events from hundreds of sensors over one mobile connection and/or optimizing for lowest latency.

JSON and protobuf each have their pros and cons, so be sure you make the right choice for your requirements.  My suggestion for a good rule of thumb: protobuf for things, JSON for people (i.e. mobile apps and HTML5), since UX dev tools all support JSON.

There are (prototype?) protobuf JavaScript implementations, for example ProtoBuf.js, if you want to use protobuf in HTML5 apps.  No warranty expressed nor implied, since I’ve not yet tested those.

Commentary and benchmarks:

Some thoughts from others:


JSON

·       human readable/editable

·       can be parsed without knowing schema in advance

·       excellent browser support

·       less verbose than XML (relatively small on the wire)


Protocol buffers

·       very dense data (very small on the wire)

·       hard to robustly decode without knowing the schema (data format is internally ambiguous, and needs schema to clarify)

·       very fast processing

·       not intended for human eyes (dense binary)

Protocol buffers

Protocol buffers are binary and quite compact. You can immediately use the data you’ve received by just pointing to the right portions of a buffer (“zero-copy”), after a very light parsing. If you pass a lot of numbers, it may be quite noticeable. Zero-copy makes most sense for C or C++ code, though; in Java, Python, etc numbers and strings will probably change representation to fit the language.


JSON is a textual format. It is easy to parse efficiently, and you can even have zero-copy strings, but everything else must be parsed. JSON may become wasteful if you pass a lot of mappings with the same keys ([{"foo":1, "bar":2}, {"foo":3, "bar":4}]), but compression before transmission (mod_gzip, etc.) may eliminate this problem.
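A quick sketch of that point: an array of mappings repeats every key on the wire, but generic compression recovers most of the overhead. The field names are invented for the example:

```python
# Repeated JSON keys are wasteful on the wire; gzip (as mod_gzip would
# apply before transmission) squeezes that redundancy back out.
import gzip
import json

rows = [{"foo": i, "bar": i * 2} for i in range(1000)]
raw = json.dumps(rows).encode("utf-8")   # "foo" and "bar" appear 1000x each
packed = gzip.compress(raw)

print(len(raw), len(packed))  # compressed form is a fraction of the raw size
```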


XML has a very wasteful representation, but it has well-established verification and transformation tools. E.g. using XML Schema, one can express and check quite complex constraints. It also has a standard transformation language, XSLT, nicely homoiconic, but its syntax is totally not intended for humans.

My take

If I were to implement two communicating services, I’d take JSON.

JSON is the simplest of the three. JSON is easy to check by eye and write by hand. This is very important during debugging (and distributed systems rarely work right the first time).

I might consider protocol buffers if at least one end of the communication was performance-critical (presumably written in Java, or Go, or some other high-performance language) and I saw a clear performance problem in JSON-related code. Unless you have one Java backend communicating with hundreds of Perl frontends, and a few microseconds of latency are important to you, I don’t think you will see a significant performance difference between JSON and protocol buffers.

Unless there were a damn good reason, I wouldn’t consider XML at all.

There are good benchmarks from the Java community on serialization/deserialization and wire size of these technologies:

In general, JSON has a slightly larger wire size and slightly slower serialization/deserialization, but wins in ubiquity and the ability to interpret it easily without the source IDL.

Protocol buffers is designed for the wire:

1.     Very small message size; one aspect is the very efficient variable-sized integer representation.

2.     Very fast decoding – it is a binary protocol.

3.     protobuf generates super-efficient C++ for encoding and decoding the messages. Hint: if you encode everything as var-integers or static-sized items, it will encode and decode at deterministic speed.

4.     It offers a VERY rich data model, efficiently encoding very complex data structures.
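The variable-sized integer representation from point 1 is worth seeing up close. Here is a minimal Python sketch of the base-128 varint scheme protobuf uses on the wire (each byte carries 7 payload bits, and the high bit flags whether more bytes follow); it is a teaching sketch, not the production implementation:

```python
# Protobuf-style base-128 varint: small numbers take 1 byte instead of
# a fixed 4 or 8, which is where much of the wire-size win comes from.

def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        byte = n & 0x7F      # low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(data: bytes) -> int:
    result = 0
    for shift, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * shift)
        if not byte & 0x80:  # continuation bit clear: last byte
            break
    return result

# 1 fits in one byte; 300 needs two; a billion needs five.
print(len(encode_varint(1)), len(encode_varint(300)), len(encode_varint(10**9)))
```

Note the trade-off: values needing more than 28 bits cost 5 bytes instead of a fixed 4, which is why protobuf also offers fixed32/fixed64 types for fields that are usually large.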

JSON is just text and it needs to be parsed. Hint: encoding a billion into it takes quite a few characters: 1,000,000,000 is 10 characters of text, but in binary it fits in a uint32_t (4 bytes). Now what about trying to encode a double? That would be FAR, FAR worse.
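Checking that arithmetic with a quick stdlib sketch (fixed-width binary via struct, not actual protobuf encoding):

```python
# One billion as text vs. as a fixed-width binary integer, and the
# same comparison for a double.
import struct

billion = 10**9
as_text = str(billion)                  # "1000000000" -> 10 bytes
as_uint32 = struct.pack("<I", billion)  # 4 bytes (uint32 max is ~4.29 billion)

as_double = struct.pack("<d", 1 / 3)    # always exactly 8 bytes
as_double_text = repr(1 / 3)            # full-precision decimal text is longer

print(len(as_text), len(as_uint32), len(as_double_text), len(as_double))
```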

Latency is a Driver Distraction issue

Latency is a Driver Distraction issue. Faster communications and faster decisioning save lives.  There are many automakers and tier-1 suppliers working with me on that. Obviously there are also the other values it brings, such as a better driver experience and a better owner experience: for example, being able from your smartphone to find your car, unlock your car, etc. with “key fob response time”, i.e. sub-second, faster than you lift your finger off the button.  Which is more impressive once you learn that the connected car systems out there today typically yield 15 to 90 second response times.  One luxury German brand, for example, when you click the smartphone app to “find car”, brings up a dialogue saying “Finding your car.  This may take a few minutes.”  And it does.

You can see the alternative, the faster connected car driver experience, in what Sprint Velocity is showing at the upcoming LA Auto Show.