I am trying to convert a NetCDF file to .csv, much like in this post. My netCDF file has similar variables: 'time', 'lat', 'lon', 'total'.

I've reproduced the top answer's code:

import netCDF4
import pandas as pd

file = 'file_path'
nc = netCDF4.Dataset(file, mode='r')

nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
total = nc.variables['total'][:]

total_ts = pd.Series(total, index=dtime) 
total_ts.to_csv('total.csv',index=True, header=True)

However, I am getting a warning and an error:

UserWarning: WARNING: valid_range not used since it cannot be safely cast to variable data type
dtime = netCDF4.num2date(time_var[:],time_var.units)

and

total_ts = pd.Series(total,index=dtime)
Exception: Data must be 1-dimensional

I am not sure what went wrong since the code is exactly the same and the netCDF file is very similar.

  • Can you give us the output of ncdump -h file_path? Commented Oct 8, 2018 at 7:36
  • Hi, the full output is too long for a comment, but I can post the important details: dimensions: time = 1 ; lat = 29 ; lon = 18 ; variables: float total(time, lat, lon), double lat(lat), double lon(lon), int time(time). Let me know if you need more info @msi_gerva, thanks! Commented Oct 13, 2018 at 17:24
  • I was mostly interested in the time variable and its units. In any case, it seems strange to me to use an integer type for time; I would expect a double variable here. The second error is also clear now: pandas expects a 1D array for total_ts, but you are giving it a 3D array with dimensions (time, lat, lon). You could get rid of the second error with total = total.flatten(), provided that total is a NumPy array. Commented Oct 13, 2018 at 17:55
  • As requested, along with other details: time:units = "minutes since 2016-01-01 00:30:00" ; time:time_increment = 60000 ; time:begin_date = 20160101 ; time:begin_time = 3000 ; Thanks for explaining! You were right that the exception went away when I flattened it. Unfortunately, it got replaced with a ValueError: Length of passed values is 522, index implies 1. Commented Oct 13, 2018 at 18:42
  • I guess the last error occurs because, once you flatten the data, the lengths of the time data and the total data no longer match: one has as many values as the time dimension (1), the other as many as the product of the time, lat and lon dimensions (1 × 29 × 18 = 522). Anyhow, I am not sure what your aim with the data is. To me it does not make sense to convert (time, lat, lon) data to one pandas table; I would rather work with the NumPy array with (time, lat, lon) dimensions. If you are going to use just one time series, then it makes sense to use a pandas table for it. Commented Oct 14, 2018 at 17:09
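
If a single CSV with one row per (time, lat, lon) cell is still wanted, the length mismatch described in the comments can be avoided by giving the flattened data an index of the same length. The following is only a minimal sketch, not a tested fix for the original file: the arrays are synthetic stand-ins whose shapes follow the ncdump output quoted above (time = 1, lat = 29, lon = 18), and the coordinate values are made up. With real data, lat, lon, dtime and total would come from the netCDF file as in the question; if netCDF4 returns a masked array, filling it first (for example with np.ma.filled(total, np.nan)) may be needed.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins (assumption): same shapes as in the ncdump output,
# but invented coordinate and data values.
dtime = pd.to_datetime(["2016-01-01 00:30:00"])
lat = np.linspace(-10.0, 4.0, 29)
lon = np.linspace(100.0, 108.5, 18)
total = np.arange(1 * 29 * 18, dtype="float32").reshape(1, 29, 18)

# Build one index entry per (time, lat, lon) combination. from_product
# varies the last level fastest, which matches the row-major order of
# ravel(), so index and values line up and both have length 522.
index = pd.MultiIndex.from_product(
    [dtime, lat, lon], names=["time", "lat", "lon"]
)
total_ts = pd.Series(total.ravel(), index=index, name="total")

# With no path given, to_csv returns the CSV text; pass a file name
# such as 'total.csv' instead to write it to disk.
csv_text = total_ts.to_csv(header=True)
```

Each CSV row then carries its own time, lat and lon values next to the total value, which is one way to keep the grid information that a plain flatten() discards.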
