What is the best possible way to convert my data to DataFrame?

    data = b'{"word": "Gondwana", "date": "2019-03-27 13:07:12.404732"}'
           b'{"word": "alalus", "date": "2019-03-27 13:07:12.909517"}'
           b'{"word": "Balto-Slavonic", "date": "2019-03-27 13:07:14.911308"}'
           b'{"word": "peculatation", "date": "2019-03-27 13:07:15.421915"}'

I tried this, but it didn't work:

    d = pd.DataFrame(dict(data))

2 Answers

First decode the byte values from UTF-8 to strings, then convert them to dictionaries in a list comprehension with ast.literal_eval or json.loads:

data = [b'{"word": "Gondwana", "date": "2019-03-27 13:07:12.404732"}',
        b'{"word": "alalus", "date": "2019-03-27 13:07:12.909517"}',
        b'{"word": "Balto-Slavonic", "date": "2019-03-27 13:07:14.911308"}',
        b'{"word": "peculatation", "date": "2019-03-27 13:07:15.421915"}']

import ast

import pandas as pd

df = pd.DataFrame([ast.literal_eval(x.decode("utf-8")) for x in data])
print(df)
                         date            word
0  2019-03-27 13:07:12.404732        Gondwana
1  2019-03-27 13:07:12.909517          alalus
2  2019-03-27 13:07:14.911308  Balto-Slavonic
3  2019-03-27 13:07:15.421915    peculatation

An alternative solution, which should be faster on large data:

import json

df = pd.DataFrame([json.loads(x.decode("utf-8")) for x in data])
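For reference, here is a minimal self-contained sketch of the json.loads variant. It assumes Python 3.6 or later, where json.loads accepts bytes directly, so the explicit decode step becomes optional. The pd.to_datetime conversion is an optional extra that was not part of the question, and the names records and df are only illustrative.

```python
import json

import pandas as pd

data = [
    b'{"word": "Gondwana", "date": "2019-03-27 13:07:12.404732"}',
    b'{"word": "alalus", "date": "2019-03-27 13:07:12.909517"}',
]

# On Python 3.6+ json.loads accepts bytes (UTF-8, UTF-16 or UTF-32),
# so x.decode("utf-8") is not strictly required here.
records = [json.loads(x) for x in data]

# A list of dicts maps directly to rows, with the dict keys as columns.
df = pd.DataFrame(records)

# Optional: turn the date strings into real datetime64 values.
df["date"] = pd.to_datetime(df["date"])
```

After the conversion, the date column supports the .dt accessor, for example df["date"].dt.year.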

2 Comments

I have edited the data a bit. I guess it will still be the same, right? @jezrael
@Jazz - Are the values in a list? Then the solution is simpler.

You can't construct a dictionary directly from a byte string that is merely formatted like a Python dict. You need to parse it first.

If you know that your byte string is always going to be a valid dict, you can try

dict(eval(b'{"word": "soning", "date": "2019-03-27 13:07:13.409948"}'))

and you should be ok. If you don't know what will be in the byte string I would advise against the use of eval.

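The other answer here advises using ast.literal_eval. This is safer than eval because literal_eval only evaluates Python literals (strings, bytes, numbers, tuples, lists, dicts, sets, booleans and None) and cannot be used to evaluate arbitrary expressions. See: https://docs.python.org/3.5/library/ast.html#ast.literal_eval

You can import literal_eval from the ast module. Unlike eval, it does not accept bytes, so decode the byte string first: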


from ast import literal_eval
# literal_eval raises ValueError when given bytes, so decode to str first
literal_eval(b'{"word": "soning", "date": "2019-03-27 13:07:13.409948"}'.decode("utf-8"))
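To make the safety difference concrete, here is a small sketch. The malicious string and the extra "seen" and "note" keys are made-up inputs for illustration only. It shows literal_eval rejecting a function call that eval would execute. It also shows a limitation: JSON tokens such as true and null are not Python literals, so json.loads is the better fit when the bytes are really JSON.

```python
import json
from ast import literal_eval

raw = b'{"word": "soning", "date": "2019-03-27 13:07:13.409948"}'

# Decode first, then parse the text as a Python literal.
parsed = literal_eval(raw.decode("utf-8"))

# Hypothetical hostile input: eval() would run this call,
# literal_eval() refuses it with a ValueError.
malicious = "__import__('os').getcwd()"
try:
    literal_eval(malicious)
    rejected = False
except ValueError:
    rejected = True

# Hypothetical JSON input using true/null, which are not Python literals.
json_text = '{"word": "soning", "seen": true, "note": null}'
try:
    literal_eval(json_text)
    literal_ok = True
except ValueError:
    literal_ok = False

# json.loads maps true -> True and null -> None.
from_json = json.loads(json_text)
```

If the input is known to be JSON, json.loads avoids both problems, since it never evaluates Python code and understands true, false and null.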
