I am trying to write a script that reads all CSV files larger than 62 bytes, writes two columns from each into a separate Excel file, and builds a list.

The following is one of the csv files:

FileUUID        Table   RowInJSON       JSONVariable    Error   Notes   SQLExecuted
ff3ca629-2e9c-45f7-85f1-a3dfc637dd81    lng02_rpt_b_calvedets   1               Duplicate entry 'ETH0007805440544' for key 'nosameanimalid'             INSERT INTO lng02_rpt_b_calvedets(farmermobile,hh_id,rpt_b_calvedets_rowid,damidyesno,damid,calfdam_id,damtagid,calvdatealv,calvtype,calvtypeoth,easecalv,easecalvoth,birthtyp,sex,siretype,aiprov,othaiprov,strawidyesno,strawid)  VALUES ('0974502779','1','1','0','ETH0007805440544','ETH0007805470547',NULL,'2017-09-16','1',NULL,'1',NULL,'1','2','1',NULL,NULL,NULL,NULL,NULL,'0',NULL,NULL,NULL,NULL,NULL,NULL,'0',NULL,'Tv',NULL,NULL,'Et','23',NULL,'5',NULL,NULL,NULL,'0','0')

This is my attempt at solving this problem:

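import csv
import glob
import os

path = 'csvs/'
for infile in glob.glob(os.path.join(path, '*.csv')):
    output = infile + '.out'
    with open(infile, 'r', newline='') as source:
        # Assuming the files are tab-separated, as the sample above suggests
        readr = csv.reader(source, delimiter='\t')
        with open(output, 'w', newline='') as result:
            writr = csv.writer(result)
            for r in readr:
                # r[4] is the Error column, r[2] is RowInJSON
                writr.writerow((r[4], r[2]))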

Please point me in the right direction, or suggest an alternative solution.

2 Answers

pandas does a lot of what you are trying to achieve:

import pandas as pd

# Read a csv file to a dataframe
df = pd.read_csv("<path-to-csv>")

# Filter two columns
columns = ["FileUUID", "Table"]
df = df[columns]

# Combine multiple dataframes (df1, df2, df3 are placeholders for frames read as above)
df_combined = pd.concat([df1, df2, df3, ...])

# Output dataframe to excel file
df_combined.to_excel("<output-path>", index=False)

To loop over all CSV files larger than 62 bytes, you can use glob.glob() and os.stat(); st_size reports the file size in bytes:

import os
import glob

import pandas as pd

dataframes = []

for csvfile in glob.glob("<csv-folder-path>/*.csv"):
    # st_size is the file size in bytes
    if os.stat(csvfile).st_size > 62:
        dataframes.append(pd.read_csv(csvfile))
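
Putting the snippets above together, a minimal sketch might look like the following. The function name combine_csvs and its parameters are hypothetical, and sep="\t" is an assumption based on the sample file in the question looking tab-separated; drop it if your files really are comma-separated.

```python
import glob
import os

import pandas as pd


def combine_csvs(folder, columns, min_size=62, sep="\t"):
    """Concatenate `columns` from every CSV in `folder` larger than `min_size` bytes."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # st_size is in bytes; smaller files are skipped entirely.
        if os.stat(path).st_size > min_size:
            # sep="\t" is an assumption about the input files.
            frames.append(pd.read_csv(path, sep=sep, usecols=columns))
    # pd.concat raises ValueError on an empty list, so return an empty frame instead.
    if not frames:
        return pd.DataFrame(columns=columns)
    # usecols keeps the file's column order, so reorder to the requested order.
    return pd.concat(frames, ignore_index=True)[columns]
```

Calling combine_csvs("csvs", ["Error", "RowInJSON"]) would return a single dataframe, which can then be written with to_excel("<output-path>", index=False) as shown earlier. Writing .xlsx files from pandas needs an Excel engine such as openpyxl to be installed.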


Use the standard csv module. Don't re-invent the wheel.

https://docs.python.org/3/library/csv.html
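
As a rough sketch of what that could look like for this question, using only the standard library: the helper names collect_columns and write_rows are hypothetical, and the tab delimiter is an assumption based on the sample file. With tab separators the header line alone comes to 62 bytes, which may be why that threshold was chosen.

```python
import csv
import glob
import os


def collect_columns(folder, columns, min_size=62, delimiter="\t"):
    """Return rows (as lists) holding `columns` from every large-enough CSV in `folder`."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # Skip files of min_size bytes or fewer (assumed to hold only a header).
        if os.path.getsize(path) <= min_size:
            continue
        with open(path, newline="") as source:
            # DictReader uses the first line as column names.
            for record in csv.DictReader(source, delimiter=delimiter):
                rows.append([record[name] for name in columns])
    return rows


def write_rows(rows, columns, out_path):
    """Write the collected rows, preceded by a header, to a single CSV file."""
    with open(out_path, "w", newline="") as result:
        writer = csv.writer(result)
        writer.writerow(columns)
        writer.writerows(rows)
```

For example, rows = collect_columns("csvs", ["Error", "RowInJSON"]) gives the list, and write_rows(rows, ["Error", "RowInJSON"], "combined.csv") produces a comma-separated file that Excel can open directly. Producing a native .xlsx file would need a third-party package, which is outside what the csv module covers.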
