3

I have a list of numpy arrays, each potentially having a different number of elements, such as:

[array([55]),
 array([54]),
 array([], dtype=float64),
 array([48, 55]),]

I would like to plot this, where each array is assigned an abscissa (x value), such as [1,2,3,4], so that the plot shows the following points: [[1,55], [2, 54], [4, 48], [4, 55]]. Is there a way to do that with matplotlib? Or how can I transform the data with numpy or pandas first so that it can be plotted?

5 Answers

5

What you want to do is chain the original arrays together and generate a new array of abscissas. There are many ways to concatenate; one of the most efficient is itertools.chain.

import itertools
from numpy import array

x = [array([55]), array([54]), array([]), array([48, 55])]

ys = list(itertools.chain(*x))
# this will be [55, 54, 48, 55]

# generate abscissas
xs = list(itertools.chain(*[[i+1]*len(x1) for i, x1 in enumerate(x)])) 

Now you can plot the result with matplotlib:

import matplotlib.pyplot as plt
plt.plot(xs, ys)
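
For a very large list, the same two sequences can probably be built without a Python-level chain. This is only a sketch of an alternative, not the approach above: it assumes every element is a 1-D numpy array, and the names arrays and lengths are mine.

```python
import numpy as np

arrays = [np.array([55]), np.array([54]), np.array([]), np.array([48, 55])]

# All y values in one flat array; empty arrays contribute nothing.
ys = np.concatenate(arrays)

# Repeat each abscissa (1, 2, 3, ...) once per element of its array.
lengths = [len(a) for a in arrays]
xs = np.repeat(np.arange(1, len(arrays) + 1), lengths)
```

Here xs and ys have the same length, so a single call such as plt.plot(xs, ys, 'o') should draw them as unconnected markers.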

3 Comments

Not really. I need the last two elements, 48 and 55, to be associated with the same abscissa (the same x), in this case 4.
If you want to plot an individual line, I've changed the code above. Not quite sure what you're trying to plot, but the above will do.
This works, but to get xs = [1, 2, 4, 4] as requested, you need to use [[i+1]*len(x1) for i, x1 in enumerate(x)]. (currently, it gives [0, 1, 3, 3])
1

If you want to have different markers for different groups of data (the colours are automatically cycled by matplotlib):

import numpy as np
import matplotlib.pyplot as plt

markers = ['o', #'circle',
           'v', #'triangle_down',
           '^', #'triangle_up',
           '<', #'triangle_left',
           '>', #'triangle_right',
           '1', #'tri_down',
           '2', #'tri_up',
           '3', #'tri_left',
           '4', #'tri_right',
           '8', #'octagon',
           's', #'square',
           'p', #'pentagon',
           'h', #'hexagon1',
           'H', #'hexagon2',
           'D', #'diamond',
           'd', #'thin_diamond'
           ]

n_markers = len(markers)

a = [10.*np.random.random(int(np.random.random()*10)) for i in range(n_markers)]  # range replaces Python 2's xrange

fig = plt.figure()
ax = fig.add_subplot(111)
for i, data in enumerate(a):
    xs = data.shape[0]*[i,]  # makes the abscissas list
    marker = markers[i % n_markers] # picks a valid marker
    ax.plot(xs, data, marker, label='data %d, %s'%(i, marker))

ax.set_xlim(-1, 1.4*len(a))
ax.set_ylim(0, 10)
ax.legend(loc=None)
fig.tight_layout()

[Figure: graph with different markers]

Notice that the y-axis limits are hard-coded; change them accordingly. The 1.4*len(a) is meant to leave room on the right side of the graph for the legend.

The example above has no points at x=0 (they would be dark blue circles) because the randomly assigned size of that data set was zero, but you can easily add +1 if you don't want to use x=0.


1

Use pandas to create a numpy array with NaNs inserted wherever an array is empty or shorter than the longest array in the list:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

arr_list = [np.array([55]),
            np.array([54]),
            np.array([], dtype='float64'),
            np.array([48, 55]),]

df = pd.DataFrame(arr_list)
list_len = len(df)
repeats = len(list(df))
vals = df.values.flatten()
xax = np.repeat(np.arange(list_len) + 1, repeats)
df_plot = pd.DataFrame({'xax': xax, 'vals': vals})
plt.scatter(df_plot.xax, df_plot.vals);
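
To make the padding idea concrete, here is a rough numpy-only sketch of what the padded layout looks like and how the NaN placeholders can be filtered out again. It does not use pandas and is not a copy of what pandas does internally; padded, width and keep are names I made up for illustration.

```python
import numpy as np

arr_list = [np.array([55]),
            np.array([54]),
            np.array([], dtype='float64'),
            np.array([48, 55])]

# Pad every array with NaN up to the length of the longest one.
width = max(len(a) for a in arr_list)
padded = np.full((len(arr_list), width), np.nan)
for row, a in enumerate(arr_list):
    padded[row, :len(a)] = a

# Flatten row by row and repeat each abscissa once per column.
vals = padded.flatten()
xax = np.repeat(np.arange(1, len(arr_list) + 1), width)

# Keep only the real data points.
keep = ~np.isnan(vals)
xax_clean, vals_clean = xax[keep], vals[keep]
```

After filtering, only the four real points remain, each paired with the abscissa of the array it came from.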


0

With x being your list:

for i, arr in enumerate(x, start=1):  # start=1 gives the abscissas 1, 2, 3, 4 from the question
    plt.plot(np.repeat(i, len(arr)), arr, '.')
plt.show()

3 Comments

It's bad practice to call a list list.
You're right, it was just for the sake of the example :)
Thanks, but my actual list is very large, and I don't think it will be efficient to call plot so many times.
0

@Alessandro Mariani's answer based on itertools made me think of another way to generate an array containing the data I needed. In some cases it may be more compact. It is also based on itertools.chain:

import itertools
from numpy import array

y = [array([55]), array([54]), array([]), array([48, 55])]
x = array([1,2,3,4])
d = array(list(itertools.chain(*[itertools.product([t], n) for t, n in zip(x,y)])))

d is now the following array:

array([[ 1, 55],
       [ 2, 54],
       [ 4, 48],
       [ 4, 55]])
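
To plot d, its two columns can be split back into x and y values. A small sketch, assuming at least one of the input arrays is non-empty (otherwise d would not be two-dimensional); I used chain.from_iterable here, which should be equivalent to unpacking with *.

```python
import itertools
import numpy as np

y = [np.array([55]), np.array([54]), np.array([]), np.array([48, 55])]
x = np.array([1, 2, 3, 4])

# Pair every value with the abscissa of the array it belongs to.
pairs = itertools.chain.from_iterable(
    itertools.product([t], n) for t, n in zip(x, y)
)
d = np.array(list(pairs))

# Column 0 holds the abscissas, column 1 the values.
xs, ys = d[:, 0], d[:, 1]
```

xs and ys can then be passed to a single matplotlib call, for example plt.scatter(xs, ys).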

