117

I have an object that I de-serialize using protobuf in Python. When I print the object it looks like a python object, however when I try to convert it to json I have all sorts of problems.

For example, if I use json.dumps() I get an error saying that the object (the generated code from protoc) does not have a __dict__ attribute.

If I use jsonpickle I get UnicodeDecodeError: 'utf8' codec can't decode byte 0x9d in position 97: invalid start byte.

Test code below is using jsonpickle with the error shown above.

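import sys

import jsonpickle

if len(sys.argv) < 2:
    print("Error: missing ser file")
    sys.exit(1)

fileLocation = sys.argv[1]

# BuildOrgObject is my own wrapper around the protoc-generated classes (not shown)
org = BuildOrgObject(fileLocation)
org = org.Deserialize()

# print(org)
jsonObj = jsonpickle.encode(org)  # this is the call that raises the UnicodeDecodeError
print(jsonObj)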
1
  • 3
    This would be way easier to figure out if you showed us the relevant parts of your .proto file and the implementation of BuildOrgObject(). If we can reproduce the behavior you're seeing, it's much easier for us to figure out what's wrong. Commented Nov 1, 2013 at 20:59

7 Answers

251

I'd recommend using the protobuf↔JSON converters from Google's protobuf library:

from google.protobuf.json_format import MessageToJson

json_obj = MessageToJson(org)

You can also serialise the protobuf to a dictionary:

from google.protobuf.json_format import MessageToDict
dict_obj = MessageToDict(org)

Refer to the protobuf package API documentation: https://developers.google.com/protocol-buffers/docs/reference/python/ (see module google.protobuf.json_format).
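
A minimal sketch of both converters, plus the preserving_proto_field_name option mentioned in the comments. It assumes only that the protobuf package is installed; FieldDescriptorProto is used purely as a stand-in because it ships with the library, and the ".demo.Org" type name is made up. In practice you would pass your own protoc-generated message.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson, Parse

# Stand-in message that ships with protobuf, so no protoc step is needed.
# The values are hypothetical.
msg = descriptor_pb2.FieldDescriptorProto(name="id", type_name=".demo.Org")

# Default behaviour: field names are converted to lowerCamelCase.
camel = MessageToDict(msg)
print(camel)

# Keep the snake_case names from the .proto file instead.
snake = MessageToDict(msg, preserving_proto_field_name=True)
print(snake)

# MessageToJson returns a str; Parse reads it back into a message.
json_text = MessageToJson(msg, preserving_proto_field_name=True)
restored = Parse(json_text, descriptor_pb2.FieldDescriptorProto())
print(restored == msg)
```

Parse accepts both the lowerCamelCase and the original field names, so the round trip works with either setting.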


18 Comments

This doesn't seem to be available in version 3.2?
I've done a quick test (clean venv, install protobuf, try to import MessageToJson) and it seems to be available. Python 3.6.
MessageToJson(org, preserving_proto_field_name=True) if you don't want your field_name converted into fieldName
It doesn't work for me. I get the error: AttributeError: 'Schema' object has no attribute 'DESCRIPTOR'
If you run this on a repeated sub-field (instead of on a regular message), it will fail with a missing DESCRIPTOR error. Use the main message, or convert each element to a dict and combine the results.
17

If you need to go straight to JSON, take a look at the protobuf-to-json library, but you'll have to install that manually.

But I would recommend that you use the protobuf-to-dict library instead for a few reasons:

  1. It is available from PyPI, so you can simply pip install protobuf-to-dict or include it in a requirements.txt
  2. A dict can be converted to JSON and might be more useful than a JSON string
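
To illustrate the second point, here is a rough sketch of turning an already-converted dict into JSON with the standard library only. The dict contents and the encode_bytes helper are hypothetical, and it assumes the converter left a bytes field as a raw bytes object (whether that happens depends on the library you use). Raw bytes are exactly what plain JSON encoders choke on, so the helper base64-encodes them.

```python
import base64
import json


def encode_bytes(value):
    """Fallback for json.dumps: base64-encode raw bytes, reject anything else."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Hypothetical result of converting a message to a dict.
org_dict = {"name": "Acme", "member_ids": [1, 2, 3], "logo": b"\x9d\x00\x01"}

# json.dumps calls encode_bytes only for values it cannot serialize itself.
json_text = json.dumps(org_dict, default=encode_bytes, sort_keys=True)
print(json_text)
```

Working with the dict first also lets you rename or drop keys before producing the final string.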

6 Comments

protobuf-to-dict feels more pythonic than google.protobuf.json_format
There is now a method included with protobuf for converting to a dict instead of JSON. It is called: google.protobuf.json_format.MessageToDict
Requires implementing MyMessage to get things going. Kind of incomplete, since that's a big part of parsing a proto?
That's a limitation of protobuf itself. Unlike something like json, a protobuf message cannot describe itself, and a schema must be created in order to either encode or decode a message. Without a schema, it is just a byte blob without enough information to parse.
@Stian - try the updated version: pip install protobuf3-to-dict
3

Here's my function to convert a proto3 object to a JSON object (i.e. Python dictionary):

def protobuf_to_dict(proto_obj):
    key_list = proto_obj.DESCRIPTOR.fields_by_name.keys()
    d = {}
    for key in key_list:
        d[key] = getattr(proto_obj, key)
    return d

Since the converters from Google's protobuf library don't seem to work in some cases with the 3.19 version, this function leverages the Descriptor class present on each Protobuf object.

Here, getattr(obj, string_attribute) returns the value given by obj.attribute

Comments

2

You can also use SerializeToString.

org.SerializeToString()

1 Comment

This does not return the JSON representation of the Protobuf message. It returns the Protobuf-encoded serialized messages as bytes (see the documentation).
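
A small sketch of the difference, assuming only that the protobuf package is installed. FileDescriptorProto is a stand-in that ships with the library; any generated message behaves the same way.

```python
import json

from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToJson

# Stand-in message with a hypothetical value.
msg = descriptor_pb2.FileDescriptorProto(name="a")

# Binary wire format: bytes, not valid JSON.
wire = msg.SerializeToString()
print(type(wire), wire)

# JSON text: a str that json.loads can parse.
text = MessageToJson(msg)
print(json.loads(text))

# The bytes are only meaningful again once parsed with the matching message type.
decoded = descriptor_pb2.FileDescriptorProto()
decoded.ParseFromString(wire)
print(decoded == msg)
```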
1

In my case I was trying to convert a Google Vision API response protobuf to a dict.

You can try:

from google.protobuf.json_format import MessageToDict


(Pdb) type(response)
<class 'google.cloud.vision_v1.types.image_annotator.BatchAnnotateFilesResponse'>

print(MessageToDict(response._pb))
{'responses': [{'responses': [...]},...]}

If you try MessageToDict with just the response you'll get a missing DESCRIPTOR error
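
If the same code has to handle both wrapped responses and plain protobuf messages, a small helper can unwrap on demand. This is only a sketch: to_dict and FakeWrapper are hypothetical names, and it assumes the wrapper exposes the underlying message as _pb, as in the pdb session above. FakeWrapper merely imitates that attribute so the example runs without the Vision client installed.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict


def to_dict(message):
    """Convert a raw protobuf message, or a wrapper exposing it as ._pb, to a dict."""
    # Fall back to the object itself when there is no _pb attribute.
    raw = getattr(message, "_pb", message)
    return MessageToDict(raw)


class FakeWrapper:
    """Hypothetical stand-in for a wrapper type that stores the raw message in _pb."""

    def __init__(self, pb):
        self._pb = pb


raw_msg = descriptor_pb2.FileDescriptorProto(name="demo.proto")
print(to_dict(raw_msg))
print(to_dict(FakeWrapper(raw_msg)))
```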

Comments

0

If you are using an older version that doesn't have the preserving_proto_field_name parameter:

import binascii
import json
import re

from google.protobuf.json_format import MessageToJson

def proto_to_json(proto_obj):
    json_obj = MessageToJson(proto_obj, including_default_value_fields=True)
    # Change the lowerCamelCase keys of Google's JSON conversion back to the snake_case of the original protobuf (top-level keys only)
    dict_obj = dict((re.sub(r'(?<!^)(?=[A-Z])', '_', k).lower(), v) for k, v in json.loads(json_obj).items())
    if hasattr(proto_obj, 'uuid'):
        # Assumes uuid is a bytes field; str.encode("hex") only exists on Python 2, hexlify works on both
        dict_obj["uuid"] = binascii.hexlify(proto_obj.uuid).decode("ascii")
    return json.dumps(dict_obj, indent=4, sort_keys=True)
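
The regular expression in the function above only touches top-level keys. Below is a rough sketch of applying the same pattern recursively, using the standard library only; to_snake and snake_keys are hypothetical helper names and the sample data is made up.

```python
import re

# Same pattern as above: an empty match before every capital letter except the first character.
_CAMEL_BOUNDARY = re.compile(r'(?<!^)(?=[A-Z])')


def to_snake(name):
    """Convert a lowerCamelCase key such as 'fieldName' to 'field_name'."""
    return _CAMEL_BOUNDARY.sub('_', name).lower()


def snake_keys(value):
    """Recursively rename dict keys in nested dicts and lists."""
    if isinstance(value, dict):
        return {to_snake(k): snake_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_keys(v) for v in value]
    return value


converted = snake_keys({"orgId": 7, "memberList": [{"firstName": "Ada"}]})
print(converted)
```

Two caveats: the pattern splits every capital letter, so runs of capitals are broken up letter by letter, and the keys of map fields are renamed along with field names, which may not be what you want.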

Comments

0
#!/usr/bin/python

from google.transit import gtfs_realtime_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson
import requests

feed = gtfs_realtime_pb2.FeedMessage()
response = requests.get('GTFS resource URL')
feed.ParseFromString(response.content)

print(MessageToJson(feed))

This returns a proper JSON representation of a remote GTFS resource.

Comments
