How to create Multi-column index dataframe & how to plot graphs for each set of values

Question

I want to create a dataframe like the one below. The picture shows what I have in Excel. I do not want to import it from Excel, but rather create it directly in pandas. I think it should be possible by creating a multi-index with pd.MultiIndex.from_product, but I have not been able to figure it out.

I also want to plot y against x (x on the x-axis, y on the y-axis) for A, B & C, all in the same graph. I think this is also possible, but I am not sure how.

You can ignore the values I have in the picture. They can be random values; that is not a problem. I will figure out how to enter the values I want later.

Why don't you use a dataframe with three columns x, y and type? Values of the type column could be "A", "B" or "C".
– Riccardo Bucco, Commented Nov 28, 2019 at 13:04
This is how the data is; I am trying to replicate it. If I do as you suggest, will it help with question number 2?
– moys, Commented Nov 28, 2019 at 13:12

Chris Adams · Accepted Answer · 2019-11-28 13:34:43Z

Here's how you could create the columns with MultiIndex.from_product. For plotting you'll need to restructure the data slightly; I'm using stack and reset_index here. I'd recommend seaborn.FacetGrid for easy-to-configure scatter subplots.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Create MultiIndex from_product
columns = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y']])

np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 6), columns=columns)
print(df)

          A                   B                   C          
          x         y         x         y         x         y
0  1.764052  0.400157  0.978738  2.240893  1.867558 -0.977278
1  0.950088 -0.151357 -0.103219  0.410599  0.144044  1.454274
2  0.761038  0.121675  0.443863  0.333674  1.494079 -0.205158
3  0.313068 -0.854096 -2.552990  0.653619  0.864436 -0.742165
4  2.269755 -1.454366  0.045759 -0.187184  1.532779  1.469359
5  0.154947  0.378163 -0.887786 -1.980796 -0.347912  0.156349
6  1.230291  1.202380 -0.387327 -0.302303 -1.048553 -1.420018
7 -1.706270  1.950775 -0.509652 -0.438074 -1.252795  0.777490
8 -1.613898 -0.212740 -0.895467  0.386902 -0.510805 -1.180632
9 -0.028182  0.428332  0.066517  0.302472 -0.634322 -0.362741
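
If specific values are needed instead of random ones, one option is to pass a dict keyed by (group, axis) tuples, which pandas turns into MultiIndex columns. This is only a minimal sketch: the name df_values and the numbers are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical values, chosen only for illustration.
data = {
    ("A", "x"): [1, 2, 3],
    ("A", "y"): [1, 2, 3],
    ("B", "x"): [1, 2, 3],
    ("B", "y"): [2, 4, 6],
    ("C", "x"): [1, 2, 3],
    ("C", "y"): [3, 6, 9],
}

# Tuple keys become the two levels of the column MultiIndex.
df_values = pd.DataFrame(data)

# Selecting a top-level label returns the x/y sub-frame for that group.
print(df_values["B"])

# A full (group, axis) tuple returns a single column as a Series.
print(df_values[("B", "y")])
```

Selecting with df_values["B"] drops the top level, so the result has plain x and y columns.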

# Scatter subplots
g = sns.FacetGrid(df.stack(level=0).reset_index(), row='level_1')
g.map(plt.scatter, 'x', 'y')
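
The column name 'level_1' used above comes from reset_index, which gives unnamed index levels default names. The sketch below shows what the reshaping step produces and how naming the levels gives more readable column names. The names "group", "axis" and "row" are my own choices, not part of the original answer, and recent pandas releases may emit a FutureWarning about the stack implementation.

```python
import numpy as np
import pandas as pd

# Same construction as above, but with named column levels (names are assumptions).
columns = pd.MultiIndex.from_product(
    [["A", "B", "C"], ["x", "y"]], names=["group", "axis"]
)
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 6), columns=columns)

# Move the "group" level from the columns into the row index,
# leaving one x column and one y column.
stacked = df.stack(level="group")

# Name the row levels, then turn them into ordinary columns.
long_df = stacked.rename_axis(index=["row", "group"]).reset_index()
long_df.columns.name = None  # drop the leftover "axis" label on the columns

print(long_df.head())
```

With this layout, the plotting calls would refer to 'group' instead of 'level_1'.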

Alternatively, if you require one plot with distinction between 'A', 'B' & 'C' you could try:

sns.scatterplot(data=df.stack(level=0).reset_index(), x='x', y='y', hue='level_1')
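
If seaborn is not available, a similar single plot can be drawn with matplotlib alone by looping over the top-level column labels, without reshaping. This is a rough sketch under the same random data as above; sorting by x and the marker style are my own choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([["A", "B", "C"], ["x", "y"]])
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 6), columns=columns)

fig, ax = plt.subplots()
# One line per top-level label, all drawn on the same axes.
for name in df.columns.get_level_values(0).unique():
    pair = df[name].sort_values("x")  # sort so the line runs left to right
    ax.plot(pair["x"], pair["y"], marker="o", label=name)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(title="group")
# Call plt.show() or fig.savefig(...) to display or save the figure.
```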

Riccardo Bucco · Accepted Answer · 2019-11-28 13:25:53Z

You may want to create a dataframe with three columns: x, y and t (either "A", "B" or "C"):

import pandas as pd
df = pd.DataFrame({"x": [ 1,   1,   1,   2,   2,   2,   3,   3,   3 ],
                   "y": [ 1,   2,   3,   2,   4,   6,   3,   6,   9 ],
                   "t": ["A", "B", "C", "A", "B", "C", "A", "B", "C"]})

Plotting three different lines is also very easy:

import matplotlib.pyplot as plt

for index, group in df.groupby("t"):
    plt.plot(group["x"], group["y"])
plt.show()
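
The three-column layout and the MultiIndex layout hold the same information, so it should be possible to convert one into the other. The following sketch pivots the frame above into (type, axis) columns; the helper column "row" and the name wide are assumptions introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({"x": [ 1,   1,   1,   2,   2,   2,   3,   3,   3 ],
                   "y": [ 1,   2,   3,   2,   4,   6,   3,   6,   9 ],
                   "t": ["A", "B", "C", "A", "B", "C", "A", "B", "C"]})

# Number the observations within each type so that pivot has a row key.
numbered = df.assign(row=df.groupby("t").cumcount())

# pivot yields (axis, type) columns; swap and sort to get (type, axis).
wide = (
    numbered.pivot(index="row", columns="t", values=["x", "y"])
    .swaplevel(axis=1)
    .sort_index(axis=1)
)
print(wide)
```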

Thanks. This is a different approach from what I had in mind, and it will probably require modifying the dataframe that I already have. Nonetheless, I think it was helpful.
– moys, Commented over a year ago
