2

I'm learning Python and have managed to write a script that reads a csv into a data frame, changes the columns, and then writes a csv in the desired style.

What I need to do now is output multiple csvs based on the contents of my second column (the first is the index, which I remove for output).

I have set up a variable holding the unique values of that column, and then a for loop that builds a filename and output path for each unique value.

But when I write the csv (data.to_csv), all 4 files are the same and unfiltered.

Here is my code:

unique_import_codes = data.import_code.unique()
for importcode in unique_import_codes:     
    #print("%s" % importcode)             
    filename = importcode.replace(".","") + ".csv"   
    #print("%s" % filename)                
    path = r"C:/myrequiredpath/"     
    #print("%s" % path)                    
    data.to_csv(path+filename, index=False)

My data frame is called data, and import_code is my second column (not an index).

Any ideas welcome!

4 Answers

2

I'd do it this way:

filename =  r"C:/myrequiredpath/{}.csv"

data.groupby('import_code') \
    .apply(lambda g: g.to_csv(filename.format(g.name), index=False))
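The same idea can also be written as an explicit loop over the groups, which may read more directly since apply is used above only for its side effect. This is a minimal sketch, not the answer's own method: write_groups, out_dir and column are hypothetical names, and the dot-stripping is carried over from the question's loop.

```python
from pathlib import Path

import pandas as pd


def write_groups(data: pd.DataFrame, out_dir: str, column: str = "import_code") -> list:
    """Write one CSV per distinct value in `column`; return the paths written.

    Hypothetical helper for illustration only.
    """
    out_path = Path(out_dir)
    written = []
    # Iterating a groupby yields (key, sub-frame) pairs; each sub-frame
    # holds only the rows whose `column` equals that key.
    for code, group in data.groupby(column):
        # Strip dots from the code, as the question's loop does.
        target = out_path / (str(code).replace(".", "") + ".csv")
        group.to_csv(target, index=False)
        written.append(target)
    return written
```

Calling write_groups(data, r"C:/myrequiredpath/") would then produce one file per import code, assuming the directory already exists.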

0

You can use loc to filter data:

unique_import_codes = data.import_code.unique()
for importcode in unique_import_codes:
    filename = importcode.replace(".","") + ".csv"
    path = r"C:/myrequiredpath/"
    data.loc[data.import_code==importcode].to_csv(path+filename, index=False)

0

Nowhere in your loop do you do anything to select a subset of data. The last line

data.to_csv(path+filename, index=False)

just writes out the unchanged data frame under a different filename each time.
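A small in-memory check can make this visible without touching the disk. The sketch below uses a made-up three-row frame (the amount column and the sample codes are assumptions, not the asker's data) and relies on to_csv returning the CSV text when no path is given.

```python
import pandas as pd

# Made-up sample data standing in for the asker's frame.
data = pd.DataFrame(
    {"import_code": ["A.1", "B.2", "A.1"], "amount": [10, 20, 30]}
)

unfiltered = {}
filtered = {}
for code in data["import_code"].unique():
    # With no path argument, to_csv returns the CSV text instead of writing a file.
    unfiltered[code] = data.to_csv(index=False)
    # Selecting the matching rows first gives a different result per code.
    filtered[code] = data[data["import_code"] == code].to_csv(index=False)

# Number of distinct outputs when the whole frame is written each time.
print(len(set(unfiltered.values())))
# Number of distinct outputs when the frame is filtered per code.
print(len(set(filtered.values())))
```

The first count collapses to a single distinct output because the loop variable never influences what is written; only the filtered version varies per code.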

0

If your goal is just to export one file per unique value, each a copy of the original dataframe named after that value, then I would go this route:

unique_values = set(data['column_of_interest'])

for value in unique_values:
    filename = value + ".csv"         
    path = r"C:/myrequiredpath/"                        
    data.to_csv(path+filename, index=False)

If you want each file to contain only the matching subset of the data, filter inside the loop and write the filtered frame instead of data:

data[data['column_of_interest'] == value].to_csv(path+filename, index=False)
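One detail worth guarding against in the loop above: value + ".csv" only works when the column holds strings. A minimal sketch of a filename helper follows; csv_name is a hypothetical name, and stripping dots is borrowed from the question's code rather than required.

```python
def csv_name(value) -> str:
    """Build a CSV file name from a column value (hypothetical helper)."""
    # str() lets non-string values such as integers work too,
    # whereas value + ".csv" would fail for them.
    return str(value).replace(".", "") + ".csv"
```

Inside the loop this would replace the filename line with filename = csv_name(value).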
