Printing columns from a CSV file into an excel file with python

Question

I am trying to write a script that reads all CSV files larger than 62 bits, writes two columns from each into a separate Excel file, and creates a list.

The following is one of the csv files:

FileUUID        Table   RowInJSON       JSONVariable    Error   Notes   SQLExecuted
ff3ca629-2e9c-45f7-85f1-a3dfc637dd81    lng02_rpt_b_calvedets   1               Duplicate entry 'ETH0007805440544' for key 'nosameanimalid'             INSERT INTO lng02_rpt_b_calvedets(farmermobile,hh_id,rpt_b_calvedets_rowid,damidyesno,damid,calfdam_id,damtagid,calvdatealv,calvtype,calvtypeoth,easecalv,easecalvoth,birthtyp,sex,siretype,aiprov,othaiprov,strawidyesno,strawid)  VALUES ('0974502779','1','1','0','ETH0007805440544','ETH0007805470547',NULL,'2017-09-16','1',NULL,'1',NULL,'1','2','1',NULL,NULL,NULL,NULL,NULL,'0',NULL,NULL,NULL,NULL,NULL,NULL,'0',NULL,'Tv',NULL,NULL,'Et','23',NULL,'5',NULL,NULL,NULL,'0','0')

This is my attempt at solving this problem:

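import csv
import glob
import os

path = 'csvs/'
for infile in glob.glob(os.path.join(path, '*.csv')):
    output = infile + '.out'
    # newline='' lets the csv module handle line endings itself
    with open(infile, 'r', newline='') as source:
        readr = csv.reader(source)
        with open(output, 'w', newline='') as result:
            writr = csv.writer(result)
            for r in readr:
                # Index 4 is the Error column, index 2 is RowInJSON
                writr.writerow((r[4], r[2]))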

Please point me in the right direction; any alternative solution is welcome.

You can use openpyxl to write spreadsheets; see the openpyxl docs and the openpyxl tutorial.
– Konstantinos K, Commented Oct 28, 2019 at 10:55

Peter · Accepted Answer · 2019-10-28 11:29:20Z

1

pandas does a lot of what you are trying to achieve:

import pandas as pd

# Read a csv file to a dataframe
df = pd.read_csv("<path-to-csv>")

# Filter two columns
columns = ["FileUUID", "Table"]
df = df[columns]

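# Combine multiple dataframes (df1, df2, df3, ... are placeholders for dataframes read as above)
df_combined = pd.concat([df1, df2, df3, ...])

# Output dataframe to excel file (writing .xlsx needs an engine such as openpyxl installed)
df_combined.to_excel("<output-path>", index=False)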

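To loop through all CSV files above the size threshold, you can use glob.glob() and os.stat(). Note that st_size is measured in bytes, so the check is for files larger than 62 bytes: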

import glob
import os

import pandas as pd

dataframes = []

for csvfile in glob.glob("<csv-folder-path>/*.csv"):
    # st_size is the file size in bytes
    if os.stat(csvfile).st_size > 62:
        dataframes.append(pd.read_csv(csvfile))
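
Putting the pieces above together, a minimal sketch might look like the following. The function name `collect_columns`, its parameters, and the default column pair (`Error` and `RowInJSON`, which are the columns at indexes 4 and 2 in the question's attempt) are illustrative assumptions. So is the `sep` argument: the sample file in the question looks tab-delimited, in which case `sep="\t"` would be needed.

```python
import glob
import os

import pandas as pd


def collect_columns(folder, columns=("Error", "RowInJSON"), min_size=62, sep=","):
    """Read every CSV in `folder` larger than `min_size` bytes and keep `columns`.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    wanted = list(columns)
    frames = []
    for csvfile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # st_size is the file size in bytes
        if os.stat(csvfile).st_size > min_size:
            frames.append(pd.read_csv(csvfile, sep=sep, usecols=wanted))
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame(columns=wanted)
    # ignore_index gives the combined frame a fresh 0..n-1 index;
    # indexing with `wanted` puts the columns in the requested order
    return pd.concat(frames, ignore_index=True)[wanted]
```

The returned dataframe can then be written with `to_excel("<output-path>", index=False)` as shown earlier, and `.values.tolist()` on it gives a plain Python list of rows if a list is also wanted.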


shrewmouse · Accepted Answer · 2019-10-28 11:48:19Z

1

Use the standard csv module. Don't re-invent the wheel.

https://docs.python.org/3/library/csv.html
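
As a rough sketch of that suggestion, assuming each file starts with the header row shown in the question, `csv.DictReader` can pick columns by name and build a list. The function name `read_columns`, the `delimiter` default, and the column choice are assumptions for illustration. The sample in the question looks tab-delimited, so `delimiter="\t"` may be what is needed there.

```python
import csv


def read_columns(path, columns=("Error", "RowInJSON"), delimiter=","):
    """Return a list of tuples holding the requested columns from one CSV file.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    rows = []
    # newline="" is what the csv module documentation recommends for file objects
    with open(path, newline="") as source:
        # DictReader uses the first row as field names
        for record in csv.DictReader(source, delimiter=delimiter):
            rows.append(tuple(record[name] for name in columns))
    return rows
```

The csv module only reads and writes delimited text, so producing a real Excel workbook from this list would still need a library such as openpyxl, as the comment on the question suggests.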

