I am running Python 3.6 with Pandas version 0.19.2. For the code example below, I have two questions about the Pandas plotting function scatter_matrix():

**1.** How can I colour-label the observations in the scatter plots according to the Label column?

**2.** How can I specify the number of bins for the histograms on the diagonal? Can I set this individually for each histogram, or only as one bin count for all of them?

import pandas as pd
import numpy as np

N= 1000
df_feat = pd.DataFrame(np.random.randn(N, 4), columns=['A','B','C','D'])
df_label = pd.DataFrame(np.random.choice([0,1], N), columns=['Label'])
df = pd.concat([df_feat, df_label], axis=1)
axes = pd.tools.plotting.scatter_matrix(df, alpha=0.2)

This is linked to this more general one.

1 Answer

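To answer your first question: there may be a less kludgy way, but you can pass a list of colours, one per row, through the c keyword, which scatter_matrix forwards to the underlying scatter call: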

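from pandas.tools.plotting import scatter_matrix  # pandas 0.19.x location; pandas.plotting from 0.20 on
scatter_matrix(df, c=['r' if i == 1 else 'b' for i in df['Label']])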
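A slightly more general variant of the same idea, as a minimal sketch: map each label to a colour through a dictionary, and pass only the feature columns so that Label is not plotted as a variable itself. The names `palette` and `colours` and the fixed random seed are illustrative assumptions, not part of the question. The import fallback is there because scatter_matrix lives in pandas.tools.plotting in 0.19.x and in pandas.plotting from 0.20 on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import numpy as np
import pandas as pd

try:
    from pandas.plotting import scatter_matrix        # pandas 0.20 and later
except ImportError:
    from pandas.tools.plotting import scatter_matrix  # pandas 0.19.x

N = 1000
rng = np.random.RandomState(0)  # fixed seed (assumption) so the figure is reproducible
df_feat = pd.DataFrame(rng.randn(N, 4), columns=['A', 'B', 'C', 'D'])
labels = pd.Series(rng.choice([0, 1], N), name='Label')

# Hypothetical label-to-colour mapping; extend it if there are more classes.
palette = {0: 'b', 1: 'r'}
colours = labels.map(palette).tolist()

# Extra keywords such as c are forwarded to the scatter call of each off-diagonal panel.
axes = scatter_matrix(df_feat, c=colours, alpha=0.2)
```

This assumes the data contain no missing values; scatter_matrix drops rows with NaNs per pair of columns, and the colour list would then no longer match the number of plotted points.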

To answer the second:

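scatter_matrix accepts a hist_kwds dictionary and passes its entries as keyword arguments to the histogram call that draws the diagonal plots. The same dictionary is applied to every diagonal histogram, so this sets one bin count for all columns: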

scatter_matrix(df, hist_kwds={'bins': 5})
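Because hist_kwds is shared by all diagonal panels, individual bin counts need a different route. One possible approach, sketched here under assumptions, is to draw the grid directly with matplotlib instead of calling scatter_matrix. The `bins` dictionary, the bin counts in it, the figure size and the marker size are all illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

N = 1000
rng = np.random.RandomState(0)  # fixed seed (assumption) for reproducibility
cols = ['A', 'B', 'C', 'D']
df = pd.DataFrame(rng.randn(N, 4), columns=cols)
df['Label'] = rng.choice([0, 1], N)

# Hypothetical per-column bin counts for the diagonal histograms.
bins = {'A': 5, 'B': 10, 'C': 20, 'D': 40}
colours = ['r' if i == 1 else 'b' for i in df['Label']]

n = len(cols)
fig, axes = plt.subplots(n, n, figsize=(8, 8))
for i, row in enumerate(cols):
    for j, col in enumerate(cols):
        ax = axes[i, j]
        if i == j:
            # Diagonal: histogram with this column's own bin count.
            ax.hist(df[col], bins=bins[col])
        else:
            # Off-diagonal: scatter of column j (x) against column i (y).
            ax.scatter(df[col], df[row], c=colours, alpha=0.2, s=5)
        if i == n - 1:
            ax.set_xlabel(col)
        if j == 0:
            ax.set_ylabel(row)

fig.savefig("scatter_grid.png")  # hypothetical output file name
```

The trade-off is that the axis sharing and tick handling that scatter_matrix does for you are not reproduced here; the sketch only shows that each diagonal panel can take its own bins value.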

2 Comments

Thank you, I will try these out when I'm next to a computer.
