I have a data frame like this:

import pandas as pd

sample_df = pd.DataFrame({'ID': [25,25,25,18,18,18],
                          'AGE': [11,11,12,11,12,13],
                          'RECORD':[1,2,2,1,1,2]})
ID AGE RECORD
25 11 1
25 11 2
25 12 2
18 11 1
18 12 1
18 13 2

I would like to plot the number of profiles against age for this data frame. I expect the plot to show a count for each age: for example, age 11 should have 3 profiles and age 12 should have 2 profiles. I tried using df.query, but I ended up confused. Could you help me?

The expected output should look like the image below. A legend for each ID is not necessary.

[image: expected output]

2 Answers

You can use seaborn, but transform your dataframe first:

import seaborn as sns
import matplotlib.pyplot as plt

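# Count the rows for each (ID, AGE) pair; casting to str makes seaborn
# treat every column as categorical.
df1 = (sample_df.value_counts(['ID', 'AGE']).to_frame('PROFILE')
                .reset_index().astype(str))

# Option 1: scatter plot of the count against age, colored by ID
sns.scatterplot(data=df1, x='AGE', y='PROFILE', hue='ID')
# Option 2: categorical plot on the data sorted by PROFILE
sns.catplot(data=df1.sort_values('PROFILE', ascending=True), x='AGE', y='PROFILE', hue='ID')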

plt.show()
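
To check the numbers before plotting, the intermediate table can be printed on its own. This is only a minimal sketch, assuming the sample_df from the question; the names counts and per_age are introduced here for illustration and are not part of the answer above.

```python
import pandas as pd

sample_df = pd.DataFrame({'ID': [25, 25, 25, 18, 18, 18],
                          'AGE': [11, 11, 12, 11, 12, 13],
                          'RECORD': [1, 2, 2, 1, 1, 2]})

# Number of rows for each (ID, AGE) pair; 'PROFILE' holds that count.
# Unlike the answer's df1, the columns are left numeric here.
counts = (sample_df.value_counts(['ID', 'AGE'])
                   .to_frame('PROFILE')
                   .reset_index())

# Total number of rows per age across all IDs
# (hypothetical helper series, matching the "3 profiles at age 11" reading).
per_age = sample_df.groupby('AGE').size()

print(counts)
print(per_age)
```

If the per-age totals are what should be plotted, per_age could be passed to a plotting call instead of df1; which of the two readings the question intends is an assumption.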

3 Comments

I have edited my question to show my expected output. I hope it is clearer now. Thanks for your effort.
Isn't that what you have on the first graph?
No, because you plotted the AGE column against the RECORD column. What I want is age vs. the number of records for each ID.

You can specify the column whose values will be used to color the marker points according to a colormap:

sample_df.groupby(['AGE', 'ID']).count().reset_index()\
    .plot.scatter(x='AGE', y='RECORD', c='ID', colormap='viridis')
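
Note that count() replaces the RECORD values with the number of rows in each group, so the y axis above is a count, not the original RECORD column. A rough sketch of the same aggregation with an explicit column name follows; it assumes the sample_df from the question, and N_RECORDS, n_records and wide are hypothetical names chosen for illustration.

```python
import pandas as pd

sample_df = pd.DataFrame({'ID': [25, 25, 25, 18, 18, 18],
                          'AGE': [11, 11, 12, 11, 12, 13],
                          'RECORD': [1, 2, 2, 1, 1, 2]})

# size() counts the rows in each (AGE, ID) group; naming the result
# keeps it separate from the original RECORD column.
n_records = (sample_df.groupby(['AGE', 'ID'])
                      .size()
                      .reset_index(name='N_RECORDS'))

# Wide layout: one row per AGE, one column per ID,
# with 0 where an (AGE, ID) pair does not occur.
wide = (n_records.pivot(index='AGE', columns='ID', values='N_RECORDS')
                 .fillna(0)
                 .astype(int))

print(n_records)
print(wide)
```

The wide table has one column per ID, which may be convenient if each ID should be drawn as its own series.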
