I'm looking for a pythonic way to plot an array where the valid x-axis values differ from row to row. All of my data is read in from an HDF5 (.h5) file (several rows and columns) such that:

data = h5py.File("file",'r')
dset = data['/DATA/DATA/'][:]

I now have an array with shape (25, 432): 432 data points in 23 separate data entries. The data are spectral reflectance measurements collected across 432 spectral bands, taken at 23 points.

I create an x-axis for the data from the band numbers using:

xvals = np.arange(1, 433, 1)

I can plot all of this data by plotting each row separately. It's not very elegant, but it does the job:

plt.plot(xvals, dset[0])
plt.plot(xvals, dset[1])
plt.plot(xvals, dset[2])

However, the data contains some erroneous values. I can omit them by slicing the arrays, such as:

values = dset[np.logical_and(dset<1, dset>0)]

But this changes the length of each array so that each one may be a unique length. I can slice the x-axis in a similar way but this won't be applicable to all of the arrays.
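To make the length problem concrete, here is a minimal sketch using the toy array from this question rather than the real `dset`. The names `valid`, `flat`, `rows` and `xs` are only illustrative. It shows that boolean indexing on a 2-D array returns one flattened 1-D array, and that filtering row by row keeps the rows apart but gives each row its own set of x values.

```python
import numpy as np

a = np.array([[1, 2, 3, -1, 5, 6],
              [1, -1, 3, 4, 5, 6],
              [1, 2, 3, 4, -1, 6]])
x = np.array([1, 2, 3, 4, 5, 6])

# Boolean mask with the same shape as a: True where the value is usable
valid = a > 0

# Indexing a 2-D array with a 2-D boolean mask flattens the result,
# so the row structure is lost
flat = a[valid]
print(flat.shape)

# Filtering one row at a time keeps the rows separate, but each row
# now needs its own matching x values
rows = [row[keep] for row, keep in zip(a, valid)]
xs = [x[keep] for keep in valid]
print([r.tolist() for r in rows])
print([xi.tolist() for xi in xs])
```

Each `(xs[i], rows[i])` pair could be passed to `plt.plot` in a loop, but the line would then be drawn straight across the removed point instead of leaving a gap.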

Essentially I have:

a = [[1,2,3,-1,5,6],
     [1,-1,3,4,5,6],
     [1,2,3,4,-1,6]]

x = [1,2,3,4,5,6]

If I remove the -1 values, x is too long and I cannot plot them as they "do not have the same first dimension". However, x will be different for each array within a.

Is there a pythonic way of plotting the data all in one plot, whereby I can omit erroneous data (outside the range 0-1) from the 'dset' and adjust the number of data points in the x-axis accordingly?

  • What happens if you simply replace the erroneous values with np.nan? Also, it seems that Pandas might make your life easier here. If you put all the data in a single DataFrame, you can use a single call to applymap() to replace erroneous values with NaN. And I think a single call to plot() might be able to plot your data as desired. Commented Feb 18, 2016 at 4:53
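A minimal sketch of the NaN suggestion in this comment, using plain NumPy and the toy array from the question. The name `cleaned` is illustrative. It assumes the data can be held as floats, since NaN has no integer representation.

```python
import numpy as np

# NaN only exists for floating-point dtypes, so the toy data is stored as float
a = np.array([[1, 2, 3, -1, 5, 6],
              [1, -1, 3, 4, 5, 6],
              [1, 2, 3, 4, -1, 6]], dtype=float)
x = np.array([1, 2, 3, 4, 5, 6])

# Keep valid entries and replace everything else with NaN; the shape is unchanged,
# so the same x still fits every row
cleaned = np.where(a > 0, a, np.nan)
print(cleaned)
```

Because the shape is preserved, `plt.plot(x, cleaned.T)` can draw all rows in one call, and matplotlib leaves a gap in a line wherever it meets a NaN. For the real data the condition would be something like `(dset > 0) & (dset < 1)`.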

1 Answer

This sounds like an excellent case for numpy's masked arrays!

In your case, you could use it like this:

import matplotlib.pyplot as plt
import numpy as np
plt.ion()
a = np.array([[1, 2, 3, -1, 5, 6],
              [1, -1, 3, 4, 5, 6],
              [1, 2, 3, 4, -1, 6]])
x = np.array([1, 2, 3, 4, 5, 6])

# Mask the erroneous (negative) entries; the shape of the array is preserved
A = np.ma.array(a, mask=a < 0)

plt.plot(x, A.T)

Just for visualisation purposes, I've offset the lines vertically in the plot below, which makes the effect of the masked values easier to see. Note that the lines will be discontinuous wherever values are masked. Moreover, if a single data point is surrounded by masked values, it won't be drawn at all with a plain line, because a line segment needs at least two points. Use markers in that case, as shown here:

plt.plot(x, A.T + np.array([[0,1,2]]), marker="o")

Results in:

[Image: plotting masked arrays]
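For the reflectance data in the question, the same idea might look like the sketch below. The array here is synthetic, a random stand-in for `dset` with a couple of bad values injected by hand, because the HDF5 file is not available. The 23 x 432 shape and the valid range 0 to 1 (exclusive) are taken from the question's description and are assumptions about the real data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for dset (assumption: 23 spectra x 432 bands, valid range 0..1)
rng = np.random.default_rng(0)
dset = rng.uniform(0.05, 0.95, size=(23, 432))
dset[0, 10] = -9999.0   # hypothetical fill value
dset[5, 100] = 1.5      # hypothetical out-of-range reading

xvals = np.arange(1, 433)

# Mask everything outside the open interval (0, 1); the shape stays (23, 432)
masked = np.ma.masked_where((dset <= 0) | (dset >= 1), dset)

# One call plots every spectrum; transpose so that each column is one line
fig, ax = plt.subplots()
lines = ax.plot(xvals, masked.T, marker=".", markersize=2)
ax.set_xlabel("band number")
ax.set_ylabel("reflectance")
```

The masked array keeps a single shared `xvals` for all rows, so no per-row x-axis is needed, and each masked entry simply shows up as a gap in its line.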

1 Comment

worked brilliantly. I didn't even know of masked arrays - something new to read up on!
