How to plot many lines from stacked dataframe column in one plot? [python]

Question

I have a dataframe that looks like this:

                timestamp       Value       Color
--------------------------------------------------
 0    2018-03-04 07:11:08          34         Red
 1    2018-03-04 07:11:09          34         Red
 2    2018-03-04 07:11:10          35         Red
 3    2018-03-04 07:11:12          36         Red
 4    2018-03-04 07:11:14          24         Red
 5    2018-03-04 07:11:15          34         Red
... 
55    2018-03-04 07:12:17          34        Blue
56    2018-03-04 07:12:18          35        Blue
57    2018-03-04 07:12:19          36        Blue
58    2018-03-04 07:12:20          37        Blue
59    2018-03-04 07:12:21          35        Blue
60    2018-03-04 07:12:22          32        Blue

Over the course of 60 seconds, a value is recorded for each timestamp, but the rows are split between two colors, Red and Blue. The dataframe therefore holds two time series that occur one after the other and do not overlap. I want to plot the Red curve and the Blue curve on the same chart while ignoring the timestamps: each color should be treated as an ordered array of values that starts at the same point, ignoring time skips and assuming equally spaced intervals. How can this be done in Python? I am simply trying

plt.plot(Blue, Red)

Though I am not sure how to account for the x-axis, which I simply want to be seconds.

Joran Beasley · Accepted Answer · 2022-02-09 06:59:07Z


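import numpy
import pandas
from matplotlib import pyplot

# Two stacked series of 10 samples each, 15 minutes apart
df = pandas.DataFrame({
    'times': list(pandas.date_range('2020-01-01', periods=10, freq='15min')) +
             list(pandas.date_range('2020-01-01', periods=10, freq='15min')),
    'colors': ['red'] * 10 + ['blue'] * 10,
    'value': numpy.random.randint(0, 255, 20)
})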

gives us something like your dataframe

                 times colors  value
0  2020-01-01 00:00:00    red    224
1  2020-01-01 00:15:00    red     47
2  2020-01-01 00:30:00    red     25
3  2020-01-01 00:45:00    red    211
4  2020-01-01 01:00:00    red     18
5  2020-01-01 01:15:00    red    119
6  2020-01-01 01:30:00    red     52
7  2020-01-01 01:45:00    red    246
8  2020-01-01 02:00:00    red     54
9  2020-01-01 02:15:00    red    156
10 2020-01-01 00:00:00   blue     42
11 2020-01-01 00:15:00   blue     55
12 2020-01-01 00:30:00   blue    151
13 2020-01-01 00:45:00   blue    236
14 2020-01-01 01:00:00   blue    207
15 2020-01-01 01:15:00   blue    165
16 2020-01-01 01:30:00   blue    131
17 2020-01-01 01:45:00   blue    199
18 2020-01-01 02:00:00   blue    247
19 2020-01-01 02:15:00   blue     61

we can pivot this using

 df2 = df.pivot(index='times',columns=['colors'],values=['value'])

which gives us (the numbers differ from the table above because the random values were regenerated)

                        value     
colors               blue  red
times                         
2020-01-01 00:00:00    70  225
2020-01-01 00:15:00   162   78
2020-01-01 00:30:00   188   37
2020-01-01 00:45:00   134  234
2020-01-01 01:00:00    46   73
2020-01-01 01:15:00    76   60
2020-01-01 01:30:00   143   61
2020-01-01 01:45:00   150  198
2020-01-01 02:00:00    82  159
2020-01-01 02:15:00   127   94

now we can easily just plot it...

df2.plot()
pyplot.show()

you can drop the value part of the column name with

df2 = df2.droplevel(0,axis=1)
df2.plot()
pyplot.show()

The other option is to just call it individually

BLUE = df[df['colors'] == 'blue']
RED = df[df['colors'] == 'red']
pyplot.plot(BLUE['times'],BLUE['value'])
pyplot.plot(RED['times'],RED['value'])
pyplot.show()

you could use pandas groupby also (dont do this one probably :P )

def plot_it(group):
    # apply() passes each color's sub-dataframe as the single argument
    pyplot.plot(group['times'], group['value'])
df.groupby(['colors']).apply(plot_it)
pyplot.show()

but really the "right" way to handle it is probably the first option (to pivot it to the shape you want)

---- Edit (based on comments) ----

if you don't want the timestamps and just want to treat each color as a list of y values, use range as your x

BLUE = df[df['colors'] == 'blue']
RED = df[df['colors'] == 'red']
pyplot.plot(range(len(BLUE)),BLUE['value'])
pyplot.plot(range(len(RED)),RED['value'])
pyplot.show()
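
The same idea can be combined with the pivot approach. This is a minimal sketch, not part of the original answer: it assumes the column names used above, uses a small made-up dataframe in which blue starts after red ends, and introduces a hypothetical `step` column and `plot_aligned` helper. `groupby(...).cumcount()` numbers the rows within each color, and pivoting on that counter instead of the timestamp puts both curves on a shared x-axis that starts at 0.

```python
import pandas

# Hypothetical stand-in for the question's data: red is recorded first,
# blue afterwards, with uneven gaps and different lengths.
df = pandas.DataFrame({
    'times': pandas.to_datetime([
        '2018-03-04 07:11:08', '2018-03-04 07:11:09', '2018-03-04 07:11:10',
        '2018-03-04 07:11:12',
        '2018-03-04 07:12:17', '2018-03-04 07:12:18', '2018-03-04 07:12:19',
    ]),
    'colors': ['red'] * 4 + ['blue'] * 3,
    'value': [34, 34, 35, 36, 34, 35, 36],
})

# Number the rows within each color: 0, 1, 2, ... in order of appearance.
df['step'] = df.groupby('colors').cumcount()

# Pivot on that counter instead of the timestamp, so both curves start at 0.
# The shorter series is padded with NaN at the end.
wide = df.pivot(index='step', columns='colors', values='value')


def plot_aligned(frame):
    """Plot one line per color against the shared step counter."""
    # Imported here so the reshaping above runs without matplotlib installed.
    from matplotlib import pyplot
    ax = frame.plot()
    ax.set_xlabel('step (one per sample)')
    return ax
```

Calling `plot_aligned(wide)` followed by `pyplot.show()` should draw both lines from step 0. If the samples really are one second apart, the step counter can be read as seconds.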

edited Feb 9, 2022 at 6:59

answered Feb 9, 2022 at 5:10

Joran Beasley


7 Comments

LostinSpatialAnalysis Over a year ago

This is very helpful, thank you. I tried your first option, and I got the plot, though the curves are not beginning at the same timestamp. This is how the data actually is, since the Blue curve starts after the Red curve ends, but I had wanted to remove the timestamps, so that while the values are still in order, it is as if Red and Blue start at the 00:00:00 timestamp, so just plotting them as arrays. Sorry if my post was not clear on that. I am trying to have the red curve and blue curve start at the same point on the x-axis, regardless of whether one curve goes longer, which it will.

Joran Beasley Over a year ago

dont do the index='times' in the pivot call and i think it will do what you want

LostinSpatialAnalysis Over a year ago

I tried that, and it is actually still recognizing that the curves start at different points, so getting the same plot as before where Blue starts after Red. This figures since when I remove index='times' and remove the timestamp column completely, and then pivot, I see all NaN value for the rows in the Blue column, where the Red column has values, showing me that it is recognizing that Red gets values before Blue does, when I want them to get their values starting at the very first row. Thanks for the suggestion though!

Joran Beasley Over a year ago

ahh yeah i guess that doesnt quite work ... :/ sorry im sure you can adjust this to make it work... but for that it might be easier to use one of the other solutions

LostinSpatialAnalysis Over a year ago

I tried the other options, and still the Blue curve is shown after the Red curve, rather than both starting at 0 on this x-axis. Hmmm, this might be trickier than I had anticipated.
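
The staggering described in these comments can be reproduced on a small example. This is a sketch with made-up data and hypothetical variable names (`staggered`, `aligned`), not code from the thread: when `pivot` is called without `index=`, it falls back to the dataframe's existing row labels, so each color keeps its original row positions. Resetting each color's index before combining is one way to make the curves share a starting row.

```python
import pandas

# Hypothetical stacked data: red occupies rows 0-2, blue rows 3-4.
df = pandas.DataFrame({
    'colors': ['red'] * 3 + ['blue'] * 2,
    'value': [34, 35, 36, 37, 35],
})

# Pivoting without index= uses the existing row labels (0..4),
# so blue is NaN wherever red has values, and vice versa.
staggered = df.pivot(columns='colors', values='value')

# Resetting each color's index to 0, 1, 2, ... before combining makes the
# curves start on the same row; the shorter one is padded with NaN at the end.
aligned = pandas.DataFrame({
    color: group['value'].reset_index(drop=True)
    for color, group in df.groupby('colors')
})
```

Plotting `aligned` (for example with `aligned.plot()`) should then draw both curves from x = 0, whereas `staggered.plot()` would still show Blue after Red.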
