Protocol Buffers is a technology made by Google for automatically serializing data to and from a compact binary format. Essentially, you define your data in an Interface Definition Language (IDL) and then generate bindings for any language in which you want to produce or consume the data. It is very similar to Thrift in philosophy and function, except that Thrift provides network transport as well as serialization.

As with all code generation frameworks, the first problem we need to tackle is how to efficiently integrate the code generation into our build pipeline. There are some important requirements for such an integration. First, the generated code should not be checked in manually, and if possible it should not be checked in at all, as this will lead to frustration later. Developers will forget to check in generated files and will overwrite each other's changes, as generated code tends to be verbose and unfamiliar to the user. Second, the generated code should be available for debugging, otherwise development around the generated code will be frustrating and slow. Third, incompatible changes to the source IDL files should break the build. Last, people who do not need to edit the IDL files should not have to generate the code locally, as this would be another piece of tech for them to maintain.

To support all of these requirements I use a Maven-based build pipeline with a Sonatype Nexus repository for storing artifacts, which can then be used by all developers in their local builds. This tutorial gives a step-by-step guide to setting up such a project. Note that this tutorial is only compatible with Linux/Unix systems, or Cygwin if you are on Windows.

Tools of the trade

Before going further you will need to install the following tools.

Java 6 SDK. The first step is to download and install the Java SDK. In this tutorial we will be generating code for Java and C++.

Maven 3. Next we download and install Apache Maven 3. The installation process is simple enough and I won’t get into the details. Please feel free to ask questions in the comments or the forum if you get stuck.

# Run this command to check that the correct version of Maven is installed and on the path
$ mvn --version
# It should echo the following line among other output.
Apache Maven 3.X.X ....


Nexus Repository. You will need to set up a Nexus repository in order to share the packaged, generated sources between developers. I have written a simple tutorial on setting up a Nexus repository: Running Nexus Sonatype over Jetty.

Protocol Buffer Compiler. You will need to install the protocol buffer compiler, which can be downloaded here. Detailed installation instructions for the protocol buffer compiler can be found here. The basic steps, however, are to untar the package, change to the extracted directory in the terminal (or Cygwin), and run the following commands.

./configure
make
sudo make install
protoc --version


Maven Protoc Plugin. In order to compile protocol buffers from Maven you will need to compile and install the Maven Protoc Plugin (full disclosure: I am a contributor to the plugin). The source is available on GitHub. Once you have the source you can run the following commands to compile the plugin and install it into your local Maven repository.

cd maven-protoc-plugin
mvn clean install


Creating maven project

We will be keeping our protocol buffer IDL files in a Maven project which will be deployed to the Nexus repository we just set up. We keep our IDL files in a little Maven project of their own in order to meet the four integration requirements that we specified at the start of this tutorial. I will highlight how we fulfill each requirement as we go. To create a simple Maven Java project, run the following command:

mvn archetype:generate \
  -DgroupId=com.flybynight.protobuff \
  -DartifactId=protocompiler \
  -DarchetypeArtifactId=java-1.6-archetype \
  -DarchetypeVersion=0.0.2 \
  -DarchetypeGroupId=net.avh4.mvn.archetype


This will create a base project for you called “protocompiler”. If you cd to the newly created directory you should see that it already contains a src folder and a pom.xml file.

cd protocompiler
ls -l
-rw-r--r--  1 usman  staff  1637  5 Sep 19:02 pom.xml
drwxr-xr-x  4 usman  staff   136  5 Sep 19:02 src


Writing your Proto Files

Change to src/main/resources, create a new file called hello.proto, and add the following text:

message HelloWorld {
  required string message = 1;
}


We can test our protocol buffer compiler installation, and check that our file is valid, by running the command shown below. It should generate a Java source file in the current directory, named Hello.java by default after the proto file.

protoc -I=./ --java_out=./ hello.proto
ls -l


Generating Source Files

Now we get to the automatic generation of the source files, which we do using the maven-protoc-plugin we installed earlier. Open the pom file at protocompiler/pom.xml, look for the “dependencies” element, and add the following dependency to pull in the protocol buffer support files.
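The dependency listing did not survive in this copy of the tutorial. As a rough sketch, the protocol buffer Java runtime is published under the coordinates com.google.protobuf:protobuf-java. The property name protobuf.version is my own placeholder, not something from the original pom, and its value should match the version of the protoc binary you installed.

```xml
<!-- Sketch only: add inside the existing <dependencies> element. -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <!-- Assumption: protobuf.version is defined in <properties> and
       matches the version of the installed protoc compiler. -->
  <version>${protobuf.version}</version>
</dependency>
```

Keeping the runtime and compiler versions aligned avoids compile errors where the generated classes call runtime methods that a mismatched library does not have.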



Now look for the “plugins” element and add the plugin definition below. We invoke the “maven-protoc-plugin” that we compiled earlier in the tutorial by specifying its groupId and artifactId. In the configuration section we specify the protocol buffer compiler binary using “protocExecutable”. Note that this assumes that protoc is on the path. If this is not the case, you can specify the fully qualified path to the binary. Using “protoSourceRoot” we specify the location of the proto files. The plugin looks in the specified directory for files with the “.proto” extension. We then specify that we wish to generate sources for Java and C++ using the “JAVA” and “CPP” constants respectively. For each language we also specify an output directory where the generated source will be placed.
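The plugin definition is also missing from this copy, so what follows is only a sketch of the shape described above, not the original listing. The parameters protocExecutable and protoSourceRoot and the JAVA and CPP constants come from the text. The groupId, the version placeholder, the execution binding, and the languageSpecifications, LanguageSpecification, language and outputDirectory element names are assumptions on my part. Please check them against the pom.xml and documentation of the plugin build you installed.

```xml
<!-- Sketch only: add inside <build><plugins>. The coordinates, the version
     and the language specification element names are assumptions. -->
<plugin>
  <groupId>com.google.protobuf.tools</groupId>
  <artifactId>maven-protoc-plugin</artifactId>
  <!-- Placeholder: use the version printed when you built the plugin. -->
  <version>PLUGIN_VERSION</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- Assumes protoc is on the path; otherwise give the full path. -->
    <protocExecutable>protoc</protocExecutable>
    <!-- Directory that is scanned for .proto files. -->
    <protoSourceRoot>${project.basedir}/src/main/resources</protoSourceRoot>
    <!-- Assumed element names: one entry per target language. -->
    <languageSpecifications>
      <LanguageSpecification>
        <language>JAVA</language>
        <outputDirectory>${project.build.directory}/generated-sources/java</outputDirectory>
      </LanguageSpecification>
      <LanguageSpecification>
        <language>CPP</language>
        <outputDirectory>${project.build.directory}/generated-sources/cpp</outputDirectory>
      </LanguageSpecification>
    </languageSpecifications>
  </configuration>
</plugin>
```

Writing the output under the target directory means mvn clean removes the generated code, which helps keep it out of version control.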



Once you have made the changes, save the pom file and compile the project using the mvn clean install command. You should see the generated sources in target/generated-sources/java and target/generated-sources/cpp. Furthermore, the jar file that is built contains the generated Java sources as compiled class files. This fulfills our first requirement, that the generated files should not have to be checked in, as they will be available in the jar file.

Attach source

To fulfill our requirement of generated code being easy to debug we will attach a source jar to our artifact. We can do this by adding the maven source plugin to our pom file as shown below.
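The plugin listing is missing from this copy. A minimal sketch of a typical Maven Source Plugin setup looks like the following. The execution id is an arbitrary label of mine, and I have left the version out, so you may want to pin one that suits your Maven installation.

```xml
<!-- Sketch only: add inside <build><plugins>. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-source-plugin</artifactId>
  <executions>
    <execution>
      <!-- Arbitrary label for this execution. -->
      <id>attach-sources</id>
      <goals>
        <!-- Builds a sources jar and attaches it to the project
             so it is installed and deployed with the main jar. -->
        <goal>jar</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

The sources jar only includes the generated Java files if the protoc plugin registers its output directory as a compile source root, so it is worth opening the resulting jar once to confirm they are there.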



Deploying to Nexus

To meet our last two requirements, breaking the build for incompatible changes but not requiring everyone to build locally, we need to deploy our code to the Nexus repository. To do this we will create a settings.xml with the data shown below and run mvn clean deploy --settings ./settings.xml.

<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="" 
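The settings listing above is cut off in this copy. As a hedged sketch, a settings.xml that only supplies deployment credentials might look like the following. The server id nexus, the user name deployment and the password are placeholders I have made up, so substitute the values from your own Nexus installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the server id and credentials are placeholders. -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <servers>
    <server>
      <!-- Must match the repository id used for deployment in the pom. -->
      <id>nexus</id>
      <username>deployment</username>
      <password>CHANGE_ME</password>
    </server>
  </servers>
</settings>
```

Maven's deploy goal also needs to know where to upload. This is usually declared in a distributionManagement section of the pom, whose repository id matches the server id above.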



Pulling from nexus

Now we can set up any Maven project to pull in the generated code by adding the dependency shown below. This fulfills our last two requirements: an incompatible change will break the build of the dependent project, without requiring people to build the proto files themselves.
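The dependency listing is missing here as well. Based on the groupId and artifactId used when we created the project, it would look roughly like this. The version is an assumption, so use whatever version your protocompiler pom declares.

```xml
<!-- Sketch only: add inside the consuming project's <dependencies>. -->
<dependency>
  <groupId>com.flybynight.protobuff</groupId>
  <artifactId>protocompiler</artifactId>
  <!-- Assumption: replace with the version in protocompiler/pom.xml. -->
  <version>1.0-SNAPSHOT</version>
</dependency>
```

The consuming project must also be able to resolve artifacts from your Nexus repository, for example through a repositories entry in its pom or a mirror in the developer's settings.xml.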