  1. I want to create a dataframe like the one below. The picture shows what I have in Excel. I do not want to import it from Excel but rather create it directly in pandas. I think it should be possible by creating a MultiIndex with pd.MultiIndex.from_product, but I have not been able to figure it out.

[image: Excel screenshot of the desired table]

  2. I want to plot x on the x-axis against y on the y-axis for all of A, B and C in the same graph. I think this is also possible, but I am not sure how.

You can ignore the values I have in the picture. They can be random values; I will figure out how to enter the values I want later.

  • Why don't you use a dataframe with three columns: x, y and type? Values of the type column could be "A", "B" or "C". Commented Nov 28, 2019 at 13:04
  • This is how the data is; I am trying to replicate it. If I do as you suggest, will it help with question 2? Commented Nov 28, 2019 at 13:12

2 Answers

Here's how you could create the columns with MultiIndex.from_product. For plotting you'll need to restructure the data slightly; I'm using stack and reset_index here. I'd recommend seaborn.FacetGrid for easily configurable scatter subplots.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Create MultiIndex from_product
columns = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y']])

np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 6), columns=columns)
print(df)

          A                   B                   C          
          x         y         x         y         x         y
0  1.764052  0.400157  0.978738  2.240893  1.867558 -0.977278
1  0.950088 -0.151357 -0.103219  0.410599  0.144044  1.454274
2  0.761038  0.121675  0.443863  0.333674  1.494079 -0.205158
3  0.313068 -0.854096 -2.552990  0.653619  0.864436 -0.742165
4  2.269755 -1.454366  0.045759 -0.187184  1.532779  1.469359
5  0.154947  0.378163 -0.887786 -1.980796 -0.347912  0.156349
6  1.230291  1.202380 -0.387327 -0.302303 -1.048553 -1.420018
7 -1.706270  1.950775 -0.509652 -0.438074 -1.252795  0.777490
8 -1.613898 -0.212740 -0.895467  0.386902 -0.510805 -1.180632
9 -0.028182  0.428332  0.066517  0.302472 -0.634322 -0.362741

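# Scatter subplots: stack moves A/B/C from the columns into the row index,
# and reset_index exposes that unnamed index level as the column 'level_1'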
g = sns.FacetGrid(df.stack(level=0).reset_index(), row='level_1')
g.map(plt.scatter, 'x', 'y')

Alternatively, if you require one plot with distinction between 'A', 'B' & 'C' you could try:

sns.scatterplot(data=df.stack(level=0).reset_index(), x='x', y='y', hue='level_1')
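
If seaborn is not available, a single plot can also be drawn straight from the MultiIndex columns with plain matplotlib, without stacking first. The following is only a minimal sketch: the numbers in `values` are made-up placeholders standing in for the real data, and the legend title "group" is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

columns = pd.MultiIndex.from_product([["A", "B", "C"], ["x", "y"]])

# Placeholder values (assumption): one inner list per row,
# ordered A-x, A-y, B-x, B-y, C-x, C-y to match the columns above.
values = [
    [1, 1, 1, 2, 1, 3],
    [2, 2, 2, 4, 2, 6],
    [3, 3, 3, 6, 3, 9],
]
df = pd.DataFrame(values, columns=columns)

fig, ax = plt.subplots()
for name in df.columns.get_level_values(0).unique():
    # Selecting a top-level label leaves a plain frame with columns x and y
    group = df[name]
    ax.scatter(group["x"], group["y"], label=name)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(title="group")
# plt.show()  # uncomment when running interactively
```

Passing a list of rows to pd.DataFrame together with `columns=columns` is also one way to enter your own values instead of random ones.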


You may want to create a dataframe with three columns: x, y and t (either "A", "B" or "C"):

import pandas as pd
df = pd.DataFrame({"x": [ 1,   1,   1,   2,   2,   2,   3,   3,   3 ],
                   "y": [ 1,   2,   3,   2,   4,   6,   3,   6,   9 ],
                   "t": ["A", "B", "C", "A", "B", "C", "A", "B", "C"]})

Plotting three different lines is also very easy:

import matplotlib.pyplot as plt

for name, group in df.groupby("t"):
    plt.plot(group["x"], group["y"], label=name)  # one labelled line per type
plt.legend()
plt.show()
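
If the data already exists in the wide layout from the question (A, B and C on top, x and y underneath), it can probably be converted to this three-column form rather than retyped. This is a rough sketch under that assumption; the names `wide` and `long_df` and the sample numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative wide frame (assumption): A/B/C on the top level, x/y below
columns = pd.MultiIndex.from_product([["A", "B", "C"], ["x", "y"]])
wide = pd.DataFrame(np.arange(18).reshape(3, 6), columns=columns)

parts = []
for name in wide.columns.get_level_values(0).unique():
    part = wide[name].copy()  # frame with columns x and y for this group
    part["t"] = name          # remember which group the rows came from
    parts.append(part)

# Stack the per-group frames on top of each other: columns x, y, t
long_df = pd.concat(parts, ignore_index=True)
print(long_df)
```

The resulting `long_df` can then be passed to the groupby loop above in place of `df`.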

1 Comment

Thanks. This is a different approach from what I had in mind and will probably require modifying the dataframe that I already have. Nonetheless, I think it was helpful.
