Get header:row_data from CSV file python

Question

I have a CSV file like the one below:

h1,h2,h3,h4
a,b,,d
1,2,3,4
a1,,h5,jj

I'd like to get header:value pairs for a given row. For example, for 'a', I need h1:a,h2:b,h4:d. I can get the headers and the row data separately, but I'm unable to combine them in the desired way. Also, I don't want blank cells to be printed as 'nan'.

@reptilicus Yes. But it should print for only one element of h1, i.e., a or 1 or a1. I tried {rows[0]:rows[1] for rows in reader} for the complete dictionary but the output looked horrible.
– abn, Commented Dec 9, 2014 at 21:31
@reptilicus It gives 'nan' for blank cells. I'd like them to be ignored completely.
– abn, Commented Dec 9, 2014 at 21:37

reptilicus · Accepted Answer · 2014-12-09 22:09:48Z


Something like this might work:

import pandas as pd

df = pd.read_csv('some_file')
for row in df.to_dict('records'):
    # pd.notna() is a more reliable blank-cell test than "v is not np.nan"
    print({k: v for k, v in row.items() if pd.notna(v)})


rmarques · Accepted Answer · 2014-12-09 22:21:25Z

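You can easily do this with the csv module and a dict comprehension:

import csv

# newline='' lets the csv module handle line endings itself
with open('test.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # first row holds the column names
    result = []
    for row in reader:
        # pair each header with its cell, skipping blank cells
        result.append({k: v for k, v in zip(header, row) if v != ''})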
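
The question also asks for the pairs of one particular row, looked up by its h1 value. A minimal sketch of that, assuming the h1 values are unique and using csv.DictReader from the standard library; the helper name rows_by_key and the in-memory sample (in place of test.csv) are illustrative only:

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for a real file.
SAMPLE = "h1,h2,h3,h4\na,b,,d\n1,2,3,4\na1,,h5,jj\n"


def rows_by_key(f, key="h1"):
    """Map each row's `key` value to that row's non-blank header:value pairs."""
    reader = csv.DictReader(f)  # uses the first row as the field names
    return {
        row[key]: {k: v for k, v in row.items() if v != ""}
        for row in reader
    }


lookup = rows_by_key(io.StringIO(SAMPLE))
# Format one row the way the question asks for it.
print(",".join(f"{k}:{v}" for k, v in lookup["a"].items()))  # h1:a,h2:b,h4:d
```

If two rows share the same h1 value, the later one silently overwrites the earlier one in this sketch; collecting rows into a list per key would avoid that.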

chfw · Accepted Answer · 2014-12-10 14:52:24Z


You can also use pyexcel, my wrapper library over the csv module, to do this:

>>> import pyexcel as pe
>>> s=pe.load("example.csv", name_columns_by_row=0)
>>> records = s.to_records()
>>> records
[{'h2': u'b', 'h3': u'', 'h1': u'a', 'h4': u'd'}, {'h2': u'2', 'h3': u'3', 'h1': u'1', 'h4': u'4'}, {'h2': u'', 'h3': u'h5', 'h1': u'a1', 'h4': u'jj'}]
>>> s.column['h1']
[u'a', u'1', u'a1']
>>> zip(s.column['h1'], records)
[(u'a', {'h2': u'b', 'h3': u'', 'h1': u'a', 'h4': u'd'}), (u'1', {'h2': u'2', 'h3': u'3', 'h1': u'1', 'h4': u'4'}), (u'a1', {'h2': u'', 'h3': u'h5', 'h1': u'a1', 'h4': u'jj'})]
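
The records in this session still carry empty strings for blank cells, which the question wanted left out. A small sketch of filtering them in plain Python, assuming records shaped like the output above (retyped here as ordinary strings, so pyexcel is not needed to run it); drop_blanks and by_h1 are hypothetical names:

```python
# Records in the shape shown in the session above, written as plain strings.
records = [
    {"h1": "a", "h2": "b", "h3": "", "h4": "d"},
    {"h1": "1", "h2": "2", "h3": "3", "h4": "4"},
    {"h1": "a1", "h2": "", "h3": "h5", "h4": "jj"},
]


def drop_blanks(record):
    """Return a copy of the record without empty-string values."""
    return {k: v for k, v in record.items() if v != ""}


# Key each cleaned record by its h1 value for direct lookup.
by_h1 = {rec["h1"]: drop_blanks(rec) for rec in records}
print(by_h1["a1"])  # {'h1': 'a1', 'h3': 'h5', 'h4': 'jj'}
```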
