I have a pandas dataframe that looks as below:

    Filename    GalCer(18:1/12:0)_IS    GalCer(d18:1/16:0)  GalCer(d18:1/18:0)
0   A-1-1   15.0    1.299366    40.662458   0.242658    6.891069    0.180315
1   A-1-2   15.0    1.341638    50.237734   0.270351    8.367316    0.233468
2   A-1-3   15.0    1.583500    47.039423   0.241681    7.902761    0.201153
3   A-1-4   15.0    1.635365    53.139610   0.322680    9.578195    0.345681
4   B-1-10  15.0    2.370330    80.209846   0.463770    13.729810   0.395355

I am trying to draw scatter subplots with a shared x-axis, using the first column, "Filename", on the x-axis. While I am able to generate bar plots, the following code raises a KeyError for a scatter plot:

import matplotlib.pyplot as plt
colnames = list (qqq.columns)

qqq.plot.scatter(x=qqq.Filename, y=colnames[1:], legend=False, subplots = True, sharex = True, figsize = (10,50))

KeyError: "['A-1-1' 'A-1-2' 'A-1-3' 'A-1-4' 'B-1-10' ] not in index"

The following code for barplots works fine. Do I need to specify something differently for the scatterplots?

import matplotlib.pyplot as plt
colnames = list (qqq.columns)
qqq.plot(x=qqq.Filename, y=colnames[1:], kind = 'bar', legend=False, subplots = True, sharex = True, figsize = (10,30))
  • y = colnames[1:] refers to the list of the column names, not to the data within. Commented Jul 17, 2017 at 15:25

1 Answer

A scatter plot will require numeric values for both axes. In this case you can use the index as x values,

df.reset_index().plot(x="index", y="other column")
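As a minimal sketch of that idea for a single column: the small frame below is a hypothetical stand-in for the question's data (the column name "value" and the numbers are assumptions for illustration). Note that x and y are given as column names, not as the Series itself.

```python
import pandas as pd

# Hypothetical stand-in for the question's data; values are illustrative only.
df = pd.DataFrame({
    "Filename": ["A-1-1", "A-1-2", "A-1-3"],
    "value": [1.30, 1.34, 1.58],
})

# reset_index() exposes the default integer index as a numeric column named "index".
flat = df.reset_index()

# Pass column *names* as strings; the scatter wrapper looks them up in the frame.
ax = flat.plot(x="index", y="value", kind="scatter")
```

Calling plt.show() from matplotlib.pyplot afterwards displays the figure.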

The remaining problem is that the pandas scatter plot wrapper cannot plot several columns at once. Depending on why you want a scatter plot, you may decide to use a line plot without lines instead, i.e. pass linestyle="none" and marker="o" to plot so that only the points appear.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

fn = ["{}_{}".format(i,j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0,"Filename",pd.Series(fn))

colnames = list(df.columns)
# Line plot with the lines switched off: one subplot per data column, markers only
df.reset_index().plot(x="index", y=colnames[1:], kind='line', legend=False,
                 subplots=True, sharex=True, figsize=(5.5,4), ls="none", marker="o")

plt.show()

In case you absolutely need a scatter plot, you may create a subplots grid first and then iterate over the columns and axes to plot one scatter plot at a time to the respective axes.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

fn = ["{}_{}".format(i,j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0,"Filename",pd.Series(fn))

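colnames = list(df.columns)
# One row of axes per data column (every column except "Filename")
fig, axes = plt.subplots(nrows=len(colnames)-1, sharex=True, figsize=(5.5,4))

# Draw one scatter plot per column on its own axes, colored by that column's values
for i, ax in enumerate(axes):
    df.reset_index().plot(x="index", y=colnames[i+1], kind='scatter', legend=False,
                          ax=ax, c=colnames[i+1], cmap="inferno")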

plt.show()
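Since the index replaces "Filename" on the x-axis, the file names no longer appear as labels. One possible way to bring them back, sketched here as an assumption rather than as part of the approach above, is to set the tick positions to the index values and the tick labels to the file names on the bottom axes. The smaller frame and the variable name flat are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data in the same shape as the example above.
fn = ["{}_{}".format(i, j) for i in "AB" for j in range(3)]
df = pd.DataFrame(np.random.rand(len(fn), 2), columns=["Z", "X"])
df.insert(0, "Filename", fn)

flat = df.reset_index()
fig, axes = plt.subplots(nrows=2, sharex=True, figsize=(5.5, 4))
for ax, col in zip(axes, ["Z", "X"]):
    flat.plot(x="index", y=col, kind="scatter", ax=ax)

# The x-axis is shared, so tick positions set here apply to every subplot;
# the labels are drawn on the bottom axes.
axes[-1].set_xticks(flat["index"].tolist())
axes[-1].set_xticklabels(flat["Filename"].tolist(), rotation=90)
fig.tight_layout()
```

Calling plt.show() afterwards displays the figure with one tick per file name.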
