Create empty csv file with pandas

Question

I am iterating through a number of CSV files and want to append the mean temperatures to a blank CSV file. How do you create an empty CSV file with pandas?

for EachMonth in MonthsInAnalysis:
    TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
    MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
    with open('my_csv.csv', 'a') as f:
        MeanDailyTemperaturesForCurrentMonth.to_csv(f, header=False)

So, in the above code, how do I create my_csv.csv before the for loop?

Just a note: I know you can create a data frame and then save it to CSV, but I am interested in whether this step can be skipped.

In terms of context I have the following csv files:

Each of them has the following structure:

The Day column reads up to 30 days for each file.

I would like to output a csv file that looks like this:

But obviously includes all the days for all the months.

My issue is that I don't know which months are included in each analysis, so I wanted a for loop that uses a list containing that information to access the relevant CSVs, calculate the mean temperature, and save it all into one CSV.

Input as text:

    Unnamed: 0  AirTemperature  AirHumidity SoilTemperature SoilMoisture    LightIntensity  WindSpeed   Year    Month   Day Hour    Minute  Second  TimeStamp   MonthCategorical    TimeOfDay
6   6   18  84  17  41  40  4   2016    1   1   6   1   1   10106   January Day
7   7   20  88  22  92  31  0   2016    1   1   7   1   1   10107   January Day
8   8   23  1   22  59  3   0   2016    1   1   8   1   1   10108   January Day
9   9   23  3   22  72  41  4   2016    1   1   9   1   1   10109   January Day
10  10  24  63  23  83  85  0   2016    1   1   10  1   1   10110   January Day
11  11  29  73  27  50  1   4   2016    1   1   11  1   1   10111   January Day

Why do you need to create it first? Surely creating the file from scratch at save time is equivalent to appending to an already existing, empty CSV?
– Chris, Commented Mar 10, 2016 at 12:34
Because I don't know which CSVs are present before the grouping occurs, so I figured it is easier to create the file first and fill it with whatever is present. How would you approach this?
– PaulBarr, Commented Mar 10, 2016 at 12:38
So you want to overwrite the 'my_csv.csv' file len(MonthsInAnalysis) times - is that what you want? ;-)
– MaxU - stand with Ukraine, Commented Mar 10, 2016 at 12:54
Well, not overwrite: the for loop will run len(MonthsInAnalysis) times, and each time I get a new groupby result I want to append it to the CSV. I thought that's what the with open part achieved.
– PaulBarr, Commented Mar 10, 2016 at 12:56
@PaulBarr, I guess it would be easier to help you if you explained a bit more: what is your source data and what do you want to achieve (i.e. what the output should look like)? There might be a more elegant solution where you won't need any loops...
– MaxU - stand with Ukraine, Commented Mar 10, 2016 at 13:02

Stop harming Monica · Accepted Answer · 2016-03-10 13:17:22Z

5

Just open the file in write mode to create it.

with open('my_csv.csv', 'w'):
    pass
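
A quick way to check this, as a minimal sketch: the snippet below creates the file inside a temporary directory and confirms that it exists and holds zero bytes. The temporary directory and the `os.path` checks are illustration choices, not part of the answer.

```python
import os
import tempfile

# Work in a throwaway directory so the example does not touch real data.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "my_csv.csv")

# Opening in write mode creates the file (or truncates it if it already exists).
with open(csv_path, "w"):
    pass

file_exists = os.path.exists(csv_path)
file_size = os.path.getsize(csv_path)
print(file_exists, file_size)  # True 0
```

Note that write mode truncates, so running this against a file that already has data would empty it.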

Anyway, I do not think you should open and close the file so many times. It is better to open the file once and write to it several times.

with open('my_csv.csv', 'w') as f:
    for EachMonth in MonthsInAnalysis:
        TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
        MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
        MeanDailyTemperaturesForCurrentMonth.to_csv(f, header=False)
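
As a hedged sketch of the same open-once pattern, the example below writes the header only for the first month and drops the row index. The in-memory frames, the added `Month` column, and the `io.StringIO` buffer (standing in for a handle from `open('my_csv.csv', 'w', newline='')`) are assumptions made so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the monthly CSV files in the question.
monthly_frames = {
    1: pd.DataFrame({"Day": [1, 1, 2, 2], "AirTemperature": [18, 20, 23, 25]}),
    2: pd.DataFrame({"Day": [1, 1, 2, 2], "AirTemperature": [10, 12, 14, 16]}),
}

# io.StringIO plays the role of the single open file handle.
buffer = io.StringIO()
for position, (month, frame) in enumerate(monthly_frames.items()):
    means = (
        frame.groupby("Day")["AirTemperature"]
        .mean()
        .reset_index(name="MeanDailyAirTemperature")
    )
    means.insert(0, "Month", month)  # record which month each row came from
    # Write the header only once, and leave out the meaningless row index.
    means.to_csv(buffer, header=(position == 0), index=False)

csv_text = buffer.getvalue()
print(csv_text)
```

Every `to_csv` call continues from where the previous one stopped, so the output holds one header line followed by the rows of all months.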

edited Mar 10, 2016 at 13:17

answered Mar 10, 2016 at 13:10

7 Comments

PaulBarr Over a year ago

Thank you, this makes a lot more sense than what I was doing. I will accept in a few minutes.

MaxU - stand with Ukraine Over a year ago

this will overwrite CSV file len(MonthsInAnalysis) times

Stop harming Monica Over a year ago

@MaxU no it won't.

MaxU - stand with Ukraine Over a year ago

@Goyo, OK run the following test: [pd.DataFrame(np.random.randn(4, 4)).to_csv('out.csv') for i in range(5)] and tell us how many rows do you have in the out.csv at the end! Following your logic there must be 5*4 = 20 rows in the CSV file. Please test

Stop harming Monica Over a year ago

@MaxU That has nothing to do with my suggestion. It's more like [pd.DataFrame(np.random.randn(4, 4)).to_csv(f) for i in range(5)] where f is a writable file object, not a file name.
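
The difference discussed in these comments can be checked with a small experiment. This is only a sketch: the file names `by_name.csv` and `by_handle.csv` and the temporary directory are made up for the test.

```python
import os
import tempfile

import numpy as np
import pandas as pd

workdir = tempfile.mkdtemp()
by_name = os.path.join(workdir, "by_name.csv")
by_handle = os.path.join(workdir, "by_handle.csv")

frames = [pd.DataFrame(np.random.randn(4, 4)) for _ in range(5)]

# Passing a file name: every call reopens the file in write mode,
# so only the last frame survives.
for frame in frames:
    frame.to_csv(by_name, header=False)

# Passing one open handle: every call writes at the current position,
# so the rows accumulate.
with open(by_handle, "w", newline="") as f:
    for frame in frames:
        frame.to_csv(f, header=False)

with open(by_name) as f:
    rows_by_name = sum(1 for _ in f)
with open(by_handle) as f:
    rows_by_handle = sum(1 for _ in f)

print(rows_by_name, rows_by_handle)  # 4 20
```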

Shinto Joseph · Accepted Answer · 2020-04-12 05:08:46Z

3

Creating a blank CSV file is as simple as this:

import pandas as pd

pd.DataFrame({}).to_csv("filename.csv")

answered Apr 12, 2020 at 5:08

MaxU - stand with Ukraine · Accepted Answer · 2016-03-10 13:40:04Z

1

I would do it this way: first read all your CSV files (but only the columns that you really need) into one DF, then call groupby(['Year','Month','Day']).mean() and save the resulting DF to a CSV file:

import glob
import pandas as pd

fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Year','Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Year','Month','Day']).mean().to_csv('my_csv.csv')

And if you want to ignore the year:

import glob
import pandas as pd

fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Month','Day']).mean().to_csv('my_csv.csv')

Some details:

(pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob('*.csv'))

is a generator expression that lazily yields one data frame per CSV file

pd.concat(...)

will concatenate them into resulting single DF

df.groupby(['Year','Month','Day']).mean()

will produce the desired report as a data frame, which can be saved to a new CSV file:

.to_csv('my_csv.csv')
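
Here is a self-contained sketch of this read-all-then-group approach. It assumes two tiny stand-in files written to a temporary directory (the real ones live in MonthlyDataSplit/Day/), and the output column name `MeanDailyAirTemperature` is borrowed from the question rather than from this answer.

```python
import glob
import os
import tempfile

import pandas as pd

# Build two small stand-in files with made-up values.
workdir = tempfile.mkdtemp()
samples = {
    "Day1.csv": pd.DataFrame({"Year": 2016, "Month": 1, "Day": [1, 1, 2],
                              "AirTemperature": [18, 20, 23], "WindSpeed": [4, 0, 0]}),
    "Day2.csv": pd.DataFrame({"Year": 2016, "Month": 2, "Day": [1, 2, 2],
                              "AirTemperature": [10, 14, 16], "WindSpeed": [1, 2, 3]}),
}
for name, frame in samples.items():
    frame.to_csv(os.path.join(workdir, name), index=False)

fmask = os.path.join(workdir, "Day*.csv")
# sorted() makes the read order deterministic; glob.glob does not promise one.
frames = [
    pd.read_csv(path, usecols=["Year", "Month", "Day", "AirTemperature"])
    for path in sorted(glob.glob(fmask))
]
combined = pd.concat(frames, ignore_index=True)

daily_means = (
    combined.groupby(["Year", "Month", "Day"])["AirTemperature"]
    .mean()
    .reset_index(name="MeanDailyAirTemperature")
)
daily_means.to_csv(os.path.join(workdir, "my_csv.csv"), index=False)
print(daily_means)
```

Because the file list comes from the glob pattern, the code does not need to know in advance which months are present.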

edited Mar 10, 2016 at 13:40

answered Mar 10, 2016 at 13:26

3 Comments

PaulBarr Over a year ago

The CSVs are in a subdirectory, MonthlyDataSplit/Day, and I don't quite understand how I would point this example at it. Would I use glob.glob('MonthlyDataSplit/Day/*.csv')?

PaulBarr Over a year ago

Thank you I think this approach is very clean and also more flexible. I appreciate your help

MaxU - stand with Ukraine Over a year ago

I'm happy to help. Next time you ask a pandas question, please post sample input and desired output (as text): it helps others understand what the OP wants and also helps in developing a solution. :)

Chris · Accepted Answer · 2016-03-10 13:12:46Z

0

The problem is a little unclear, but assuming you have to iterate month by month and apply the groupby as stated, just use:

 # Before the loop
 dflist = []

Then in each loop do something like:

 dflist.append(MeanDailyTemperaturesForCurrentMonth)

Then at the end:

 final_df = pd.concat(dflist, ignore_index=True)  # pass the list itself; stacks the monthly frames as rows

and this will join everything into one dataframe.
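
A minimal sketch of this collect-then-concatenate pattern, assuming the per-month results should be stacked as rows (which is what the question's desired output suggests). The two small frames are made-up stand-ins for the groupby results.

```python
import pandas as pd

dflist = []
# Hypothetical per-month means; in the real loop these come from the groupby.
for month, temps in {1: [19.0, 23.0], 2: [10.0, 15.0]}.items():
    month_means = pd.DataFrame(
        {"Month": month, "Day": [1, 2], "MeanDailyAirTemperature": temps}
    )
    dflist.append(month_means)

# pd.concat takes the list of frames directly and is called once, after the loop.
final_df = pd.concat(dflist, ignore_index=True)
print(final_df.shape)  # (4, 3)
```

A single `final_df.to_csv(...)` call afterwards then replaces the repeated appends to the file.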

Look at:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html

http://pandas.pydata.org/pandas-docs/stable/merging.html

edited Mar 10, 2016 at 13:12

answered Mar 10, 2016 at 13:06

1 Comment

MaxU - stand with Ukraine Over a year ago

IMO doing pd.concat() in a loop is not the best idea - you may want to collect the data frames into a list and concatenate them in one shot, provided of course that they are not huge.

JazzyJ · Accepted Answer · 2022-12-06 23:05:46Z

0

You could do this to create a CSV file that contains only a header row with the column names, and no index column.

import pandas as pd
pd.DataFrame(columns=["Col1", "Col2", "Col3"]).to_csv("filename.csv", index=False)
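
Building on that, one possible sketch writes the header-only file first and then appends rows with `mode="a"`. The column names, the sample values, and the temporary path are assumptions chosen to match the question, not part of this answer.

```python
import os
import tempfile

import pandas as pd

workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "my_csv.csv")
columns = ["Day", "MeanDailyAirTemperature"]

# Step 1: write a file that contains only the header row.
pd.DataFrame(columns=columns).to_csv(csv_path, index=False)

# Step 2: append data in chunks, skipping the header each time.
for temps in ([19.0, 23.0], [10.0, 15.0]):
    chunk = pd.DataFrame({"Day": [1, 2], "MeanDailyAirTemperature": temps})
    chunk.to_csv(csv_path, mode="a", header=False, index=False)

result = pd.read_csv(csv_path)
print(result.shape)  # (4, 2)
```

The appended frames need their columns in the same order as the header, since nothing checks that for you in append mode.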

answered Dec 6, 2022 at 23:05
