Basic Pandas matplotlib plotting

Question

Sorry to ask such a basic question, but after hours (and hours) of frustration I'm turning to the list for some expert help.

I have two pandas dataframes, df1 and df2. df1 has columns A and B, while df2 has columns C and D. I want to use matplotlib to make a scatterplot of A vs. B, with labelled axes, and a histogram of C, also with a title on the x axis. Then I want to save both figures in pdf files.

I can accomplish the former with

import matplotlib.pyplot as plt

plt.scatter(df1['A'],df1['B'])
plt.xlabel('X title')
plt.ylabel('Y title')
plt.savefig('myfig1.pdf')

But I can't get the histogram to work, and if it does, it creates a graph with both the scatterplot and the histogram in it.

Any help greatly appreciated.

Are you using plt.hist(df2['C'])? What's the error message with the histogram creation?
– N1B4, Commented Jul 23, 2014 at 19:08

Gabriel · Accepted Answer · 2014-07-23 19:50:01Z


It sounds like you just need to make another figure for the histogram,

import matplotlib.pyplot as plt

fig1 = plt.figure()
plt.scatter(df1['A'],df1['B'])
plt.xlabel('X title')
plt.ylabel('Y title')
plt.savefig('myfig1.pdf')

fig2 = plt.figure()
... <histogram code>
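For the histogram placeholder above, one possible completion looks roughly like this. It is only a sketch: the DataFrames are random stand-in data (the real df1 and df2 aren't shown in the question), and the output directory, the 'myfig2.pdf' file name and the 'C title' label are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): the question's df1/df2 are not shown.
rng = np.random.default_rng(0)
df1 = pd.DataFrame({"A": rng.random(50), "B": rng.random(50)})
df2 = pd.DataFrame({"C": rng.random(50), "D": rng.random(50)})

out_dir = tempfile.mkdtemp()  # hypothetical output location
scatter_path = os.path.join(out_dir, "myfig1.pdf")
hist_path = os.path.join(out_dir, "myfig2.pdf")  # hypothetical file name

# Figure 1: the scatter plot from the question.
fig1 = plt.figure()
plt.scatter(df1["A"], df1["B"])
plt.xlabel("X title")
plt.ylabel("Y title")
plt.savefig(scatter_path)

# Figure 2: plt.figure() starts a fresh figure and makes it current,
# so the histogram no longer lands on top of the scatter plot.
fig2 = plt.figure()
plt.hist(df2["C"].values)
plt.xlabel("C title")
plt.savefig(hist_path)
```

Each plt.savefig call writes whichever figure is current at that point, which is why the scatter is saved before the second plt.figure() call.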

Or you can assign the axes to variables so you don't have to do everything in order,

import random
import matplotlib.pyplot as plt
x = [random.random() for i in range(50)]
y = [random.random() for i in range(50)]

fig1 = plt.figure()
ax1 = fig1.add_subplot(111)

fig2 = plt.figure()
ax2 = fig2.add_subplot(111)

ax1.scatter( x, y )
ax1.set_xlabel('X title')
ax1.set_ylabel('Y title')
fig1.savefig('myfig1.pdf')

ax2.hist( y )
ax2.set_xlabel('X title')
fig2.savefig('myfig2.pdf')  # file name is just an example

Note that when you set properties through an axes object's methods, most of the plt functions become set_X methods. For example, instead of plt.ylabel('my_y') you call ax1.set_ylabel('my_y'). You can still use the plt functions, but they apply to whichever axes is current at that moment. The variables ax1 and ax2 give you more freedom over when you do things.
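To illustrate the "current plot" point, here is a small sketch (label strings are made up) showing that a plt call lands on the most recently created axes, while the axes methods target a specific plot regardless of order.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

fig1 = plt.figure()
ax1 = fig1.add_subplot(111)

fig2 = plt.figure()
ax2 = fig2.add_subplot(111)

# fig2 was created last, so pyplot treats ax2 as the current axes.
current_is_ax2 = plt.gca() is ax2
plt.xlabel("set through plt")       # ends up on ax2

# The axes method reaches ax1 even though it is not current.
ax1.set_xlabel("set through ax1")
```

So if a label or a histogram shows up on the wrong figure, checking what plt.gca() returns at that point is a quick way to see which axes pyplot is drawing on.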

edited Jul 23, 2014 at 19:50

answered Jul 23, 2014 at 19:08

Gabriel


10 Comments

David Pepper Over a year ago

Thanks very much Gabriel. I can get the scatterplot to work with the fig, ax method. And I can get the histogram to work with the regular plt.hist() method, if I use df2['C'].values as the argument. But the histograms with fig, ax seem to be a problem, even with the values included.

Gabriel Over a year ago

what does typing df2['C'].values into an ipython session give you? what kind of data is it?

David Pepper Over a year ago

It gives me array([ 0.81132075, 0.73684211, 0.22222222, ..., 0.72727273, 0.625, 0.42857143]).

David Pepper Over a year ago

You know, it's possible that the problem is the IDE I'm using, which is Spyder. I tried just doing a basic histogram from the matplotlib documentation, and that didn't even work with the fig, ax notation.

Gabriel Over a year ago

yes, an array shouldn't give plt.hist or ax2.hist any problems unless there are some corrupted values in the array. Any NaNs or similar?
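A quick way to check for NaNs and drop them before plotting might look like the sketch below. The column values are made up (one NaN is planted on purpose), and whether NaNs are actually the cause here is only a guess.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for df2 with one missing value (assumption).
df2 = pd.DataFrame({"C": [0.81, 0.74, np.nan, 0.22, 0.63]})

# Count the missing entries, then drop them before plotting.
n_missing = int(df2["C"].isna().sum())
clean = df2["C"].dropna()

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(clean.values)
ax.set_xlabel("C title")
```

If n_missing comes back as 0, NaNs are not the problem and the issue is more likely in the environment (for example how the IDE displays figures).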
