I'm learning how to use the Python xarray package, but I'm having trouble with multi-dimensional data. Specifically, how do I add and use additional coordinates?

Here's an example.

import xarray as xr
import pandas as pd
import numpy as np

site_id = ['brw','sum','mlo']
dss = []
for site in site_id:
    df = pd.DataFrame(np.random.randn(20,2),columns=['a','b'],index=pd.date_range('20160101',periods=20,freq='MS'))
    ds = df.to_xarray()
    dss.append(ds)

ds = xr.concat(dss, dim=pd.Index(site_id, name='site'))
ds.coords['latitude'] = [71.323, 72.58, 19.5362]
ds.coords['longitude'] = [156.6114, 38.48, 155.5763]

My xarray data set looks like:

>>> ds
<xarray.Dataset>
Dimensions:    (index: 20, latitude: 3, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
  * latitude   (latitude) float64 71.32 72.58 19.54
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index) float64 -0.1403 -0.2225 -1.199 -0.8916 0.1149 ...
    b          (site, index) float64 -1.506 0.9106 -0.7359 2.123 -0.1987 ...

I can select a series by using the sel method based on a site code. For example:

>>> ds.sel(site='mlo')

But how do I select data based on the other coordinates (i.e. latitude or longitude)?

>>> ds.sel(latitude>50)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'latitude' is not defined
2 Answers

Thanks for the easy-to-reproduce example!

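.sel takes keyword arguments, so it can only express equality, as in .sel(x=y). Python's syntax does not allow a comparison such as latitude>50 as a keyword argument; it is evaluated as an ordinary expression, and since no variable named latitude exists, you get the NameError. Here is an example using .isel with latitude (.sel is harder here because the coordinate is a float, so exact label matching is fragile):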

In [7]: ds.isel(latitude=0)
Out[7]:
<xarray.Dataset>
Dimensions:    (index: 20, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
    latitude   float64 71.32
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index) float64 0.6493 -0.9105 -0.9963 -0.6206 0.6856 ...
    b          (site, index) float64 -0.03405 -1.49 0.2646 -0.3073 0.6326 ...
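
If you do want label-based selection on a float coordinate, one option is to let .sel pick the closest label with method="nearest". The following is only a minimal sketch on a small stand-in DataArray (the name da and its values are made up for illustration, and the latitudes are sorted here to keep the lookup unambiguous):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data: one value per latitude, latitudes sorted ascending.
da = xr.DataArray(
    np.arange(3.0),
    dims="latitude",
    coords={"latitude": [19.5362, 71.323, 72.58]},
    name="a",
)

# 71.3 is not an exact label, so ask for the nearest one instead.
nearest = da.sel(latitude=71.3, method="nearest")
print(float(nearest.latitude), float(nearest))
```

This avoids having to type the float label exactly, at the cost of silently returning a neighbouring point when the requested value is far from any label.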

To use conditions such as >, you can use .where:

In [9]: ds.where(ds.latitude>50, drop=True)
Out[9]:
<xarray.Dataset>
Dimensions:    (index: 20, latitude: 2, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
  * latitude   (latitude) float64 71.32 72.58
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index, latitude) float64 0.6493 0.6493 -0.9105 -0.9105 ...
    b          (site, index, latitude) float64 -0.03405 -0.03405 -1.49 -1.49 ...
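
Note that in the output above, a and b have been broadcast along a separate latitude dimension. That happens because assigning a plain list to ds.coords['latitude'] creates a new dimension rather than attaching one latitude to each site. If the intent is one latitude and longitude per site, a possible approach is to declare them as coordinates on the site dimension. This is a sketch that rebuilds a dataset shaped like the one in the question (random values, same names), not necessarily how the original data should be organised:

```python
import numpy as np
import pandas as pd
import xarray as xr

site_id = ["brw", "sum", "mlo"]
times = pd.date_range("2016-01-01", periods=20, freq="MS")

ds = xr.Dataset(
    {
        "a": (("site", "index"), np.random.randn(3, 20)),
        "b": (("site", "index"), np.random.randn(3, 20)),
    },
    coords={
        "site": site_id,
        "index": times,
        # Non-index coordinates that live on the existing "site" dimension.
        "latitude": ("site", [71.323, 72.58, 19.5362]),
        "longitude": ("site", [156.6114, 38.48, 155.5763]),
    },
)

# Keep only the sites north of 50 degrees; the other sites are dropped.
north = ds.where(ds.latitude > 50, drop=True)
print(north.site.values)
```

With this layout the condition filters sites directly, and a and b keep their (site, index) shape instead of gaining an extra latitude dimension.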

Another way to select data with the sel method is to pass a Python slice object.

To select data from an xarray object whose latitude is greater than a given value (e.g. 50 degrees north), one could write the following:

   ds.sel(dict(latitude=slice(50,None)))
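
One caveat: label-based slicing generally expects the coordinate to be sorted, and the latitudes in the question (71.323, 72.58, 19.5362) are not, so the call may raise a KeyError on that data. A minimal sketch of one way around this, assuming latitude is a dimension coordinate as in the question and using a small made-up dataset, is to sort first with sortby:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the question's dataset: latitude is its own dimension.
ds = xr.Dataset(
    {"a": (("latitude",), np.arange(3.0))},
    coords={"latitude": [71.323, 72.58, 19.5362]},
)

# Sort ascending so the slice bounds are well defined, then slice by label.
sorted_ds = ds.sortby("latitude")
subset = sorted_ds.sel(latitude=slice(50, None))
print(subset.latitude.values)
```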

I hope it helps.

Sincerely,
