Sorry to ask such a basic question, but after hours (and hours) of frustration I'm turning to the list for some expert help.

I have two pandas dataframes, df1 and df2. df1 has columns A and B, while df2 has columns C and D. I want to use matplotlib to make a scatterplot of A vs. B, with labelled axes, and a histogram of C, also with a title on the x axis. Then I want to save both figures in pdf files.

I can accomplish the former with

import matplotlib.pyplot as plt

plt.scatter(df1['A'],df1['B'])
plt.xlabel('X title')
plt.ylabel('Y title')
plt.savefig('myfig1.pdf')

But I can't get the histogram to work, and when it does run, it draws the scatterplot and the histogram in the same figure.

Any help greatly appreciated.

  • Are you using plt.hist(df2['C'])? What's the error message with the histogram creation? Commented Jul 23, 2014 at 19:08
  • I get a long series of errors, ending with "KeyError: 0" Commented Jul 23, 2014 at 19:35
  • What kind of data is in df2['C']? Commented Jul 23, 2014 at 19:51

1 Answer

It sounds like you just need to make another figure for the histogram:

import matplotlib.pyplot as plt

fig1 = plt.figure()
plt.scatter(df1['A'],df1['B'])
plt.xlabel('X title')
plt.ylabel('Y title')
plt.savefig('myfig1.pdf')

fig2 = plt.figure()
... <histogram code>
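
For reference, here is a minimal sketch of what the histogram step could look like. The stand-in data for df2, the 'C values' label, the file name myfig2.pdf, and the Agg backend (chosen only so the script runs without a display) are all assumptions, not something from the question. Passing a plain array via .to_numpy() sidesteps any index-related trouble with the Series.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption); the real df2 comes from the question
rng = np.random.default_rng(0)
df2 = pd.DataFrame({"C": rng.random(200), "D": rng.random(200)})

fig2 = plt.figure()              # new figure, so the scatterplot is left untouched
plt.hist(df2["C"].to_numpy())    # plain NumPy array instead of the Series
plt.xlabel("C values")           # hypothetical axis label
plt.savefig("myfig2.pdf")        # hypothetical file name; saves the current figure (fig2)
```

Because plt.figure() makes the new figure current, the plt.hist, plt.xlabel, and plt.savefig calls that follow it all act on fig2 rather than on the scatterplot figure.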

Or you can assign the axes to variables so you don't have to do everything in order:

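import random
import matplotlib.pyplot as plt

# Random stand-in data
x = [random.random() for i in range(50)]
y = [random.random() for i in range(50)]

# Create both figures up front, each with a single set of axes
fig1 = plt.figure()
ax1 = fig1.add_subplot(111)

fig2 = plt.figure()
ax2 = fig2.add_subplot(111)

# The scatterplot goes on the first figure
ax1.scatter(x, y)
ax1.set_xlabel('X title')
ax1.set_ylabel('Y title')
fig1.savefig('myfig1.pdf')

# The histogram goes on the second figure
ax2.hist(y)
fig2.savefig('myfig2.pdf')  # file name is an assumption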

Note that when you set properties through an axes object's methods, most of the plt functions gain a set_ prefix. For example, instead of plt.ylabel('my_y') you call ax1.set_ylabel('my_y'). You can still use the plt functions, but they apply to whichever axes is current, which is the one most recently created or activated. The variables ax1 and ax2 give you more freedom about when you do things.
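
A small sketch of the difference between the two styles may help. It uses plt.subplots() as a shorthand for the figure/add_subplot pair, and the label strings and Agg backend are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

fig1, ax1 = plt.subplots()
fig2, ax2 = plt.subplots()      # created last, so ax2 is now the current axes

plt.xlabel("set via plt")       # lands on ax2, the current axes
ax1.set_xlabel("set via ax1")   # lands on ax1, regardless of what is current

print(ax1.get_xlabel())         # set via ax1
print(ax2.get_xlabel())         # set via plt
print(plt.gca() is ax2)         # True

plt.sca(ax1)                    # make ax1 the current axes again
plt.ylabel("y via plt")         # now applies to ax1
print(ax1.get_ylabel())         # y via plt
```

The explicit ax1/ax2 calls never depend on creation order, which is why they are easier to reason about once more than one figure is open.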

10 Comments

Thanks very much Gabriel. I can get the scatterplot to work with the fig, ax method. And I can get the histogram to work with the regular plt.hist() method, if I use df2['C'].values as the argument. But the histograms with fig, ax seem to be a problem, even with the values included.
What does typing df2['C'].values into an IPython session give you? What kind of data is it?
It gives me array([ 0.81132075, 0.73684211, 0.22222222, ..., 0.72727273, 0.625, 0.42857143]).
You know, it's possible that the problem is the IDE I'm using, which is Spyder. I tried just doing a basic histogram from the matplotlib documentation, and that didn't even work with the fig, ax notation.
Yes, an array shouldn't give plt.hist or ax2.hist any problems unless there are some corrupted values in it. Any NaNs or similar?
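
One way to check for NaNs and rule them out before plotting is sketched below. The sample df2 (with a NaN planted on purpose), the 'C values' label, the file name, and the Agg backend are assumptions for illustration; whether NaNs were actually the cause here is not established in this thread.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for df2 (assumption), with one NaN planted on purpose
df2 = pd.DataFrame({"C": [0.81132075, 0.73684211, np.nan, 0.22222222, 0.625]})

# Count missing values in the column
n_missing = int(df2["C"].isna().sum())
print("missing values:", n_missing)

# Drop NaNs and hand matplotlib a plain float array with no pandas index
clean = df2["C"].dropna().to_numpy()

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(clean)
ax.set_xlabel("C values")    # hypothetical axis label
fig.savefig("myfig2.pdf")    # hypothetical file name
```

If the count of missing values is zero and the histogram still fails with the fig, ax style, the data is probably not the culprit and the environment is worth a closer look.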