I have a CSV file like the one below:

h1,h2,h3,h4
a,b,,d
1,2,3,4
a1,,h5,jj

I'd like to get a list like this: for the row where h1 is 'a', I need h1:a,h2:b,h4:d. I can get the headers and the row data separately, but I'm unable to combine them in the desired way. Also, I don't want blanks to be printed as 'nan'.

  • You mean a list of dictionaries? Commented Dec 9, 2014 at 21:24
  • @reptilicus Yes. But it should print for only one element of h1, i.e., a or 1 or a1. I tried {rows[0]:rows[1] for rows in reader} for the complete dictionary but the output looked horrible. Commented Dec 9, 2014 at 21:31
  • df.to_dict('records') might work? Commented Dec 9, 2014 at 21:34
  • It gives 'nan' for blank cells. I'd like them to be ignored completely. @reptilicus Commented Dec 9, 2014 at 21:37

3 Answers

Something like this might work:

import pandas

df = pandas.read_csv('some_file')
for row in df.to_dict('records'):
    # pandas reads blank cells as NaN; keep only the cells that have a value.
    print({k: v for k, v in row.items() if pandas.notnull(v)})

You can easily do this with the csv module and dict comprehensions:

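import csv

with open('test.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # the first row holds the column names
    result = []
    for row in reader:
        # Pair each header with its cell and skip blank cells.
        result.append({k: v for k, v in zip(header, row) if v != ''})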
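
The question also asks for the row belonging to a single h1 value, printed as h1:a,h2:b,h4:d. A minimal sketch of one way to get there, assuming Python 3 and that the h1 values are unique; the helper names `rows_by_key` and `format_row` are made up for illustration, and the sample data is inlined with `io.StringIO` so it runs without a file:

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without a file.
SAMPLE = """h1,h2,h3,h4
a,b,,d
1,2,3,4
a1,,h5,jj
"""


def rows_by_key(f, key="h1"):
    """Map each row's `key` value to a dict of that row's non-blank cells.

    Hypothetical helper; assumes the values in the `key` column are unique.
    """
    reader = csv.DictReader(f)
    return {
        row[key]: {k: v for k, v in row.items() if v not in ("", None)}
        for row in reader
    }


def format_row(row):
    """Render a row dict in the 'header:value,header:value' form."""
    return ",".join(f"{k}:{v}" for k, v in row.items())


lookup = rows_by_key(io.StringIO(SAMPLE))
print(format_row(lookup["a"]))  # h1:a,h2:b,h4:d
```

With a real file, `rows_by_key(open('test.csv', newline=''))` would take the place of the `io.StringIO` call. If h1 values can repeat, later rows would overwrite earlier ones in this sketch.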

You can also use pyexcel, my wrapper library over the csv module, to do it:

>>> import pyexcel as pe
>>> s=pe.load("example.csv", name_columns_by_row=0)
>>> records = s.to_records()
>>> records
[{'h2': u'b', 'h3': u'', 'h1': u'a', 'h4': u'd'}, {'h2': u'2', 'h3': u'3', 'h1': u'1', 'h4': u'4'}, {'h2': u'', 'h3': u'h5', 'h1': u'a1', 'h4': u'jj'}]
>>> s.column['h1']
[u'a', u'1', u'a1']
>>> zip(s.column['h1'], records)
[(u'a', {'h2': u'b', 'h3': u'', 'h1': u'a', 'h4': u'd'}), (u'1', {'h2': u'2', 'h3': u'3', 'h1': u'1', 'h4': u'4'}), (u'a1', {'h2': u'', 'h3': u'h5', 'h1': u'a1', 'h4': u'jj'})]

More documentation can be found here
