1

I have a variable that contains a string of:

fruit_wanted = 'banana,apple'

I also have a csv file

fruit,'orange','grape','banana','mango','apple','strawberry'
number,1,2,3,4,5,6
value,3,2,2,4,2,1
price,3,2,1,2,3,4

Now how do I delete the columns whose 'fruit' is not listed in the 'fruit_wanted' variable?

So that the output file would look like

fruit,'banana','apple'
number,3,5
value,2,2
price,1,3

Thank you.

3
  • You should have either googled or searched Stack Overflow before posting. Other question with proper answer Commented Nov 28, 2012 at 21:43
  • Your CSV file is sideways. This would be trivial if your CSV had the headers on the first line (fruit,number,value,price) and each following line represented one fruit. Commented Nov 28, 2012 at 21:43
  • @StevenRumbalski: He may not have any control over that. It's useful to know how to deal with sideways CSV files (without having to read the whole thing in so you can zip it to transpose). Commented Nov 28, 2012 at 21:45

2 Answers

8

Read the csv file using the DictReader() class, and ignore the columns you don't want:

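import csv

# Python 2 file modes; on Python 3 open with 'r'/'w' and newline='' instead.
# Keep the label column plus the wanted fruits, quoted as they appear in the file.
fruit_wanted = ['fruit'] + ["'%s'" % f for f in fruit_wanted.split(',')]
outfile = csv.DictWriter(open(outputfile, 'wb'), fieldnames=fruit_wanted)
# DictReader consumes the fruit row as its header, so write it back out.
outfile.writeheader()
fruit_wanted = set(fruit_wanted)

for row in csv.DictReader(open(inputfile, 'rb')):
    # Drop every key that is not a wanted column.
    row = {k: row[k] for k in row if k in fruit_wanted}
    outfile.writerow(row)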
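
For anyone on Python 3, here is a minimal self-contained sketch of the same DictReader/DictWriter idea. The function name keep_wanted_columns and the in-memory sample are my own illustrative assumptions, not part of the original answer; it relies on extrasaction='ignore' so that DictWriter drops the columns that are not listed in fieldnames.

```python
import csv
import io


def keep_wanted_columns(src, dst, fruit_wanted):
    """Copy CSV rows from src to dst, keeping the label column and the wanted fruits.

    Hypothetical helper for illustration; src and dst are open text file objects.
    """
    reader = csv.DictReader(src)
    label = reader.fieldnames[0]  # 'fruit' in the question's file
    # The file wraps fruit names in single quotes, so quote the wanted names the same way.
    fieldnames = [label] + ["'%s'" % f for f in fruit_wanted.split(',')]
    # extrasaction='ignore' makes DictWriter silently skip keys not in fieldnames.
    writer = csv.DictWriter(dst, fieldnames=fieldnames,
                            extrasaction='ignore', lineterminator='\n')
    writer.writeheader()  # the fruit row was consumed as the header, so emit it again
    for row in reader:
        writer.writerow(row)


# Sample data copied from the question, held in memory instead of on disk.
sample = (
    "fruit,'orange','grape','banana','mango','apple','strawberry'\n"
    "number,1,2,3,4,5,6\n"
    "value,3,2,2,4,2,1\n"
    "price,3,2,1,2,3,4\n"
)
out = io.StringIO()
keep_wanted_columns(io.StringIO(sample), out, 'banana,apple')
result = out.getvalue()
print(result)
```

With real files you would pass objects opened with newline='' as the csv module documentation recommends. The columns come out in the order given in fruit_wanted, not in the order they appear in the input.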

4 Comments

+1, except that you probably want to use a csv.DictWriter or csv.writer rather than print row, or your output will be a dict's str representation rather than a comma-separated list in the right order…
Actually, the author asked just for the 'outfile'.
@alexvassel: 'outfile', I see now. So I updated the answer (and included a correction, the first fruit column is also needed).
This only worked for me after I replaced fields=fruit_wanted with fieldnames=fruit_wanted
0

Here's some pseudocode:

open the original CSV for input, and the new one for output
read the first row of the original CSV and figure out which columns you want to delete
write the modified first row to the output CSV
for each row in the input CSV:
    delete the columns you figured out before
    write the modified row to the output CSV
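
The steps above could be sketched in Python 3 roughly as follows. The function name drop_unwanted_columns is a hypothetical helper of mine, and the sketch assumes the input has at least one row and that fruit names in the first row are wrapped in single quotes, as in the question. It works on column indices with csv.reader and csv.writer, so the file is streamed row by row without transposing it.

```python
import csv
import io


def drop_unwanted_columns(src, dst, fruit_wanted):
    """Stream rows from src to dst, keeping column 0 and the wanted fruit columns.

    Hypothetical helper for illustration; src and dst are open text file objects.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator='\n')
    wanted = set(fruit_wanted.split(','))

    # Read the first row and figure out which column indices to keep.
    header = next(reader)
    keep = [0] + [i for i, name in enumerate(header)
                  if i > 0 and name.strip("'") in wanted]

    # Write the trimmed first row, then trim every remaining row the same way.
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])


# Sample data copied from the question, held in memory instead of on disk.
sample = (
    "fruit,'orange','grape','banana','mango','apple','strawberry'\n"
    "number,1,2,3,4,5,6\n"
    "value,3,2,2,4,2,1\n"
    "price,3,2,1,2,3,4\n"
)
out = io.StringIO()
drop_unwanted_columns(io.StringIO(sample), out, 'banana,apple')
result = out.getvalue()
print(result)
```

Unlike a fieldnames-based approach, this keeps the surviving columns in the order they appear in the input file, regardless of the order in fruit_wanted.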
