I have data that looks like this:

ID    value_y   date_x      end_cutoff
1      75     2020-7-1      2021-01-17
1      73     2020-7-2      2021-01-17
1      74     2020-7-1      2021-06-05
1      71     2020-7-2      2021-06-05
2      111    2020-7-1      2021-01-17
2      112    2020-7-2      2021-01-17
2      113    2020-7-1      2021-06-05
2      115    2020-7-2      2021-06-05
   

I want to plot this data so that the following conditions are met:

  1. Each ID has 1 graph
  2. Each graph has n lines plotted (2 in this example; 1 for each end_cutoff)

So, ideally in this example I would have two separate plots both with two lines.

Here is my current code. It plots every line on the same plot instead of creating a new plot for each ID.

 grouped = df_fit.groupby(['ID','end_cutoff'])
 fig, ax = plt.subplots()
 for (ID, end_cutoff), df_fit in grouped:
     ax.plot(df_fit['date_x'], df_fit['value_y'], label=ID+' '+str(end_cutoff.date()))
 plt.show()

1 Answer

  • This solution adds the missing pieces to your existing code:
    1. Convert the date columns to a datetime dtype, and extract only the date component.
    2. Create one subplot for each unique 'ID' value.
    3. Get the index of ID within uid, and use that index to select the correct ax to plot to.
  • This option uses pandas.DataFrame.plot.
  • The x-axis format is '%m-%d %H' because the time between points is small. The x-axis auto-formats depending on the date range.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# dataframe
data = {'ID': [1, 1, 1, 1, 2, 2, 2, 2], 'value_y': [75, 73, 74, 71, 111, 112, 113, 115], 'date_x': ['2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2'], 'end_cutoff': ['2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05', '2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05']}
df = pd.DataFrame(data)

# set date columns to a datetime dtype and extract only the date component since time isn't relevant
df['end_cutoff'] =  pd.to_datetime(df['end_cutoff']).dt.date
df['date_x'] =  pd.to_datetime(df['date_x']).dt.date

# create grouped
grouped = df.groupby(['ID','end_cutoff'])

# create subplots based on the number of unique ID values
uid = df.ID.unique()
fig, ax = plt.subplots(nrows=len(uid), figsize=(7, 4))

for (ID, end_cutoff), df_fit in grouped:
    
    # get the index of the current ID, and use it to index ax
    axi = np.argwhere(uid==ID)[0][0]

    # plot to the correct ax based on the index of the ID
    df_fit.plot(x='date_x', y='value_y', ax=ax[axi], label=f'{ID} {end_cutoff}',
                xlabel='Date', ylabel='Value', title=f'ID: {ID}', marker='.', rot=30)

    # place the legend outside the plot
    ax[axi].legend(title='Cutoff', bbox_to_anchor=(1.05, 1), loc='upper left')

plt.tight_layout()
plt.show()
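
As a variation on the code above, the index lookup can be avoided by grouping on 'ID' first and pairing each group with its own Axes, then grouping on 'end_cutoff' inside the loop. This is only a sketch under a few assumptions: plot_per_id is a hypothetical helper name, the sample dates are written zero-padded, and the date columns are kept as datetime64 rather than reduced to the date component. Passing squeeze=False is a deliberate choice so the function also works when the data contains a single ID.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_per_id(df):
    """Hypothetical helper: one subplot per ID, one line per end_cutoff."""
    ids = df['ID'].unique()
    # squeeze=False always returns a 2-D array of Axes, even for a single ID
    fig, axes = plt.subplots(nrows=len(ids), squeeze=False,
                             figsize=(7, 3 * len(ids)))
    axes = axes[:, 0]

    # sort=False keeps the IDs in order of first appearance, matching unique()
    for ax, (id_, df_id) in zip(axes, df.groupby('ID', sort=False)):
        for end_cutoff, df_line in df_id.groupby('end_cutoff'):
            ax.plot(df_line['date_x'], df_line['value_y'],
                    marker='.', label=str(end_cutoff.date()))
        ax.set(title=f'ID: {id_}', xlabel='Date', ylabel='Value')
        ax.tick_params(axis='x', rotation=30)
        # place the legend outside the plot
        ax.legend(title='Cutoff', bbox_to_anchor=(1.05, 1), loc='upper left')

    fig.tight_layout()
    return fig, axes


# same values as the sample data in the question (dates zero-padded here)
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 2, 2, 2, 2],
    'value_y': [75, 73, 74, 71, 111, 112, 113, 115],
    'date_x': ['2020-07-01', '2020-07-02'] * 4,
    'end_cutoff': ['2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05'] * 2,
})
df['date_x'] = pd.to_datetime(df['date_x'])
df['end_cutoff'] = pd.to_datetime(df['end_cutoff'])

fig, axes = plot_per_id(df)
```

Call plt.show() afterwards to display the figure. Each Axes in axes corresponds to one ID, in the order the IDs first appear in the dataframe.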
