Error converting NetCDF to CSV in Python

I am trying to convert a NetCDF file to .csv, much like this post. My netCDF file has similar variables: 'time', 'lat', 'lon', 'total'.

I've reproduced the top answer's code:

import netCDF4
import pandas as pd

file = 'file_path'
nc = netCDF4.Dataset(file, mode='r')

nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
total = nc.variables['total'][:]

total_ts = pd.Series(total, index=dtime) 
total_ts.to_csv('total.csv',index=True, header=True)

However, I am getting one warning and one error:

UserWarning: WARNING: valid_range not used since it cannot be safely cast to variable data type
dtime = netCDF4.num2date(time_var[:],time_var.units)

and

total_ts = pd.Series(total,index=dtime)
Exception: Data must be 1-dimensional

I am not sure what went wrong since the code is exactly the same and the netCDF file is very similar.

edited Oct 6, 2018 at 15:25

asked Oct 6, 2018 at 14:47

plummms

Can you give us the output of ncdump -h file_path?

– msi_gerva, commented Oct 8, 2018 at 7:36

Hi, the full output is too long for a comment, but I can post the important details: dimensions: time = 1 ; lat = 29 ; lon = 18 ; variables: float total(time, lat, lon), double lat(lat), double lon(lon), int time(time). Let me know if you need more info @msi_gerva, thanks!

– plummms, commented Oct 13, 2018 at 17:24

I was most interested in the time variable and its units. In any case, it seems strange to me to use an integer type for time; I would expect a double here. The second error is also clear now: pandas expects a 1D array for total_ts, but you are giving it a 3D array with dimensions (time, lat, lon). You could get rid of the second error with total = total.flatten(), provided that total is a NumPy array.

– msi_gerva, commented Oct 13, 2018 at 17:55

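
The dimensionality problem described in the comment above can be reproduced without the original file. This is a minimal sketch, assuming total is a plain NumPy array; the synthetic array below is a made-up stand-in for nc.variables['total'][:], and only its shape (1, 29, 18) comes from the ncdump details posted earlier.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for nc.variables['total'][:]; the shape follows the
# dimensions reported in the comments: (time, lat, lon) = (1, 29, 18).
total = np.arange(1 * 29 * 18, dtype="float32").reshape(1, 29, 18)

# pandas refuses to build a Series from data with more than one dimension.
try:
    pd.Series(total)
    series_error = None
except Exception as exc:  # the exact exception class varies across pandas versions
    series_error = exc

# Flattening yields a 1-D array holding time * lat * lon values.
flat = total.flatten()
flat_series = pd.Series(flat)  # works, but has no time index yet

print(total.shape, "->", flat.shape)
print(type(series_error).__name__, series_error)
```

Flattening removes the exception but also discards the association between each value and its (time, lat, lon) coordinates, so it is only a first step.
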
As requested, along with other details: time:units = "minutes since 2016-01-01 00:30:00" ; time:time_increment = 60000 ; time:begin_date = 20160101 ; time:begin_time = 3000 ; Thanks for explaining! You were right that the exception went away when I flattened it. Unfortunately, it was replaced with ValueError: Length of passed values is 522, index implies 1

– plummms, commented Oct 13, 2018 at 18:42

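
To illustrate how a units string like the one above maps integer time values to dates, here is a rough standard-library sketch. It is not how netCDF4.num2date is implemented and it ignores calendars; it only assumes the "<unit> since <origin>" pattern shown in the comment. The stored time value is not given in the thread, so the 0 below is hypothetical.

```python
from datetime import datetime, timedelta

# Units string exactly as reported in the comment above.
units = "minutes since 2016-01-01 00:30:00"

# Split into the step unit ("minutes") and the reference timestamp.
unit_name, _, origin_text = units.partition(" since ")
origin = datetime.strptime(origin_text, "%Y-%m-%d %H:%M:%S")

# Hypothetical content of the time variable (length 1, matching time = 1).
time_values = [0]

# timedelta accepts keyword names such as minutes=, hours=, days=.
decoded = [origin + timedelta(**{unit_name: int(v)}) for v in time_values]

print(decoded)
```

Because the time dimension has length 1, the decoded index also has a single entry, whatever the stored value is.
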
I guess the last error occurs because, once you flatten the data, the lengths of the time data and the total data no longer match: one has as many values as the time dimension (1), and the other has the product of the time, lat and lon dimensions (1 * 29 * 18 = 522). Anyhow, I am not sure what your aim with the data is. To me it does not make sense to convert (time, lat, lon) data to one pandas table, and I would rather work with the NumPy array with (time, lat, lon) dimensions. If you only need a single time series, then it makes sense to use a pandas table for it.

– msi_gerva, commented Oct 14, 2018 at 17:09
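
If a CSV of the full grid is still wanted, one possible approach (not proposed by anyone in the thread) is a long-format table with one row per (time, lat, lon) combination, so that the index length matches the 522 flattened values. The sketch below uses synthetic data: the coordinate values, the names table, csv_text and point_ts, and the choice of grid point are all assumptions; only the dimension sizes come from the comments. Arrays read with netCDF4 may be masked arrays, which this sketch does not handle.

```python
import numpy as np
import pandas as pd

# Hypothetical coordinates with the sizes reported in the comments.
times = pd.to_datetime(["2016-01-01 00:30:00"])  # 1 time step
lat = np.linspace(10.0, 24.0, 29)                 # 29 latitudes
lon = np.linspace(100.0, 108.5, 18)               # 18 longitudes
total = np.arange(len(times) * len(lat) * len(lon), dtype="float64")
total = total.reshape(len(times), len(lat), len(lon))

# One index entry per (time, lat, lon) combination; from_product varies the
# last level fastest, matching the C-order layout of total.ravel().
index = pd.MultiIndex.from_product([times, lat, lon], names=["time", "lat", "lon"])
table = pd.DataFrame({"total": total.ravel()}, index=index).reset_index()

# Without a path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = table.to_csv(index=False)

# Alternative: a single grid point gives a genuine 1-D time series whose
# length equals the length of the time index.
point_ts = pd.Series(total[:, 0, 0], index=times, name="total")

print(table.head())
print(len(point_ts))
```

The long table has 522 rows and four columns (time, lat, lon, total), while the single-point series has one entry per time step, which is the case where pd.Series(values, index=dtime) from the question would work.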
