How to convert bytes data into a python pandas dataframe?

Question

I would like to convert 'bytes' data into a Pandas dataframe.

The data looks like this (first few lines):

    (b'#Settlement Date,Settlement Period,CCGT,OIL,COAL,NUCLEAR,WIND,PS,NPSHYD,OCGT'
     b',OTHER,INTFR,INTIRL,INTNED,INTEW,BIOMASS\n2017-01-01,1,7727,0,3815,7404,3'
     b'923,0,944,0,2123,948,296,856,238,\n2017-01-01,2,8338,0,3815,7403,3658,16,'
     b'909,0,2124,998,298,874,288,\n2017-01-01,3,7927,0,3801,7408,3925,0,864,0,2'
     b'122,998,298,816,286,\n2017-01-01,4,6996,0,3803,7407,4393,0,863,0,2122,998'

The column headers appear at the top. Each subsequent line is a timestamp followed by numbers.

Is there a straightforward way to do this?

Thank you very much

@Paula Livingstone:

This seems to work:

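    import pandas as pd

    s = str(bytes_data, 'utf-8')

    # The with-block closes (and flushes) the file before pandas reads it back.
    with open("data.txt", "w") as file:
        file.write(s)

    df = pd.read_csv('data.txt')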

Maybe this can be done without using a file in between.

KenHBS · Accepted Answer · 2020-11-27 17:08:35Z

64

You can also use BytesIO directly:

    from io import BytesIO

    import pandas as pd

    df = pd.read_csv(BytesIO(bytes_data))

This saves you the step of converting bytes_data to a string.
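As a minimal sketch of the BytesIO route, here it is applied to the first two complete rows from the question. The choice to parse the date column and to rename it without the leading `#` is my own assumption about what is wanted, not something the question asks for.

```python
from io import BytesIO

import pandas as pd

# First two complete rows from the question, written as adjacent bytes literals.
bytes_data = (
    b"#Settlement Date,Settlement Period,CCGT,OIL,COAL,NUCLEAR,WIND,PS,NPSHYD,OCGT"
    b",OTHER,INTFR,INTIRL,INTNED,INTEW,BIOMASS\n"
    b"2017-01-01,1,7727,0,3815,7404,3923,0,944,0,2123,948,296,856,238,\n"
    b"2017-01-01,2,8338,0,3815,7403,3658,16,909,0,2124,998,298,874,288,\n"
)

# read_csv accepts a binary file-like object, so no intermediate file is needed.
df = pd.read_csv(BytesIO(bytes_data), parse_dates=["#Settlement Date"])

# Assumption: drop the leading '#' from the first header for easier access.
df = df.rename(columns={"#Settlement Date": "Settlement Date"})

print(df.dtypes)
```

Each data row ends with a trailing comma, so the last field is empty and the BIOMASS column comes out as missing values for these rows.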

answered Nov 27, 2020 at 17:08

Tim · Accepted Answer · 2018-04-06 09:34:38Z

62

I had the same issue and found the StringIO library (https://docs.python.org/2/library/stringio.html) through the answer to "How to create a Pandas DataFrame from a string". In Python 3, StringIO lives in the io module.

Try something like:

    from io import StringIO

    import pandas as pd

    s = str(bytes_data, 'utf-8')  # decode the bytes to text
    data = StringIO(s)            # wrap the text in a file-like object
    df = pd.read_csv(data)
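Decoding yourself is mostly useful when the bytes are not UTF-8. A small sketch, using made-up Latin-1 data (the column names and values are hypothetical), comparing the StringIO route with passing `encoding` to `read_csv` on a BytesIO object:

```python
from io import BytesIO, StringIO

import pandas as pd

# Hypothetical sample: CSV text encoded as Latin-1 rather than UTF-8.
bytes_data = "station,reading\nOrléans,12\nNîmes,7\n".encode("latin-1")

# Route 1: decode the bytes yourself, then wrap the text in StringIO.
text = bytes_data.decode("latin-1")
df_text = pd.read_csv(StringIO(text))

# Route 2: hand pandas the raw bytes and name the encoding.
df_bytes = pd.read_csv(BytesIO(bytes_data), encoding="latin-1")

print(df_text.equals(df_bytes))
```

Either way the encoding has to be known; decoding Latin-1 bytes as UTF-8 would raise a UnicodeDecodeError here.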

answered Apr 6, 2018 at 9:34

1 Comment

abhiieor Over a year ago

If you are getting the bytes from the subprocess module, then the following could work better:

    s = subprocess.check_output(['docker', 'images'])
    s1 = str(s, 'utf-8')
    data = pd.read_fwf(StringIO(s1))
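A runnable sketch of the subprocess case, assuming the command prints a whitespace-aligned table. Since `docker` may not be installed, a child Python process printing a small made-up table stands in for it; the table contents are hypothetical.

```python
import subprocess
import sys
from io import StringIO

import pandas as pd

# Hypothetical stand-in for a command such as `docker images`: a child Python
# process that prints a small whitespace-aligned table.
child_code = (
    "print('NAME    SIZE');"
    "print('alpha     10');"
    "print('beta     200')"
)
raw = subprocess.check_output([sys.executable, "-c", child_code])  # bytes

# Decode the bytes, then let read_fwf infer the fixed-width columns.
text = raw.decode("utf-8")
df = pd.read_fwf(StringIO(text))

print(df)
```

read_fwf infers column boundaries from the whitespace gaps, which suits aligned command-line output better than read_csv does.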

Paula Livingstone · Accepted Answer · 2017-11-19 21:17:03Z

1

OK, cool. Your input formatting is quite awkward, but the following works:

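    import pandas as pd

    # Read the file, which holds the printed bytes literal, as one string.
    with open('file.txt', 'r') as myfile:
        data = myfile.read().replace('\n', '')

    # Strip the b'...' wrappers, rejoin the pieces without a separator (adjacent
    # literals are contiguous), split rows on the literal \n, then split on commas.
    df = pd.Series("".join(data.strip(' b\'').strip('\'').split('\' b\'')).split('\\n')).str.split(',', expand=True)

    print(df)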

This produces the following:

                 0                  1     2    3     4        5      6   7   \
0  #Settlement Date  Settlement Period  CCGT  OIL  COAL  NUCLEAR   WIND  PS   
1        2017-01-01                  1  7727    0  3815     7404   3923   0   
2        2017-01-01                  2  8338    0  3815     7403   3658  16   
3        2017-01-01                  3  7927    0  3801     7408   3925   0   

       8      9      10     11      12      13     14       15  
0  NPSHYD  OCGT   OTHER  INTFR  INTIRL  INTNED  INTEW  BIOMASS  
1     944      0   2123    948     296     856    238           
2     909      0   2124    998     298     874    288           
3     864      0   2122    998     298     816    286     None

In order for this to work you will need to ensure that your input file contains only a collection of complete rows. For this reason I removed the partial row for the purposes of the test.

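Since you have said that the data source is an HTTP GET request, the initial read could take place using pandas.read_html if the response is an HTML page containing a table. Note that read_html parses HTML tables only; for a CSV response body such as the one in the question, pandas.read_csv is the matching reader.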

More detail on this can be found here. Note specifically the section on io (io : str or file-like).

edited Nov 19, 2017 at 21:17

answered Nov 19, 2017 at 19:25

2 Comments

user7188934 Over a year ago

Thank you. My input is not from a file though. I created the file as an intermediate step but I would like to avoid using a file at all.

user7188934 Over a year ago

The data is queried via an API over an HTTP request, and I get it in the bytes format shown in the question.
