How to generate scatter plot of all numeric columns against specific columns in the same dataframe

Question

I have a dataframe with a mix of data types (object and numeric). I want to draw scatter plots of every numeric column in the dataset against four specific columns (col_32, col_69, col_74 and col_80), which gives 4 plots for each numeric column.

Example:

col_1 against col_32, col_69, col_74 and col_80 (4 plots)
col_2 against col_32, col_69, col_74 and col_80 (4 plots)
col_3 against col_32, col_69, col_74 and col_80 (4 plots)
...
col_85 against col_32, col_69, col_74 and col_80 (4 plots)

import pandas as pd 
from random import uniform
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import gmean


#Generate dataframe 

df = pd.DataFrame(
    data=np.random.uniform(low=5.5, high=30.75, size=(160, 84)),
    columns=[f'col_{i}' for i in range(1,85)],)

df.insert(
    loc=0, column='Location',
    value=np.repeat(['A','B','C','D'], 40, axis=0),)

# Insert NaN in the dataset just like the original dataset 
# Define the probability of introducing a NaN (e.g., 15%)
nan_probability = 0.15

np.random.seed(123)

df = df.mask(np.random.random(df.shape) < nan_probability)

# final dataset
df

I need help here. See my attempt below:

# select numeric columns 
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()
print(f"Numeric columns: {numeric_cols}")

# create a list of specific columns col_32,col_69,col_74 and col_80
specific_x_cols = ['col_32','col_69','col_74','col_80']

for x_col in specific_x_cols:
    # Create a new figure for each  numeric column against the 4 specific_x_columns
    plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
        
    
    for y_col in numeric_cols:
        if y_col != x_col: # Avoid plotting a column against itself
            
            sns.scatterplot(x =x_col, y = y_col,data=df)
            
    plt.title(f"Scatterplot of {y_col} against {x_col}")
   
    plt.xlabel(x_col)
    plt.ylabel("numeric columns")
    plt.grid(True)
    plt.legend()
    plt.savefig(f'{y_col}_scatterplot.png') # Save as a PNG file with a descriptive name
    plt.show()
    

print("scatterplot generated and saved successfully!")

Please share your code if you can

It's hard to know exactly what you want, so I can't provide code, but here's a tip: using the melt function to restructure your data will make this easy to achieve.
– Panda Kim, Commented Oct 30 at 7:21
@PandaKim - I want a scatterplot of every numeric column in the dataframe against 4 target columns, col_32, col_69, col_74 and col_80 (each numeric column vs. the 4 target columns, for a total of 81 x 4 scatterplots).
– RayX500, Commented Oct 30 at 7:27
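
The melt tip from the first comment could look roughly like the sketch below. It is only an illustration under assumptions: the dataframe is a small random stand-in, `target_cols` uses two hypothetical column names in place of col_32, col_69, col_74 and col_80, and `sns.relplot` is one possible way to facet the reshaped data, not necessarily what the commenter had in mind.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Small stand-in for the question's dataframe (sizes and seed are assumptions)
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.uniform(5.5, 30.75, size=(40, 8)),
    columns=[f"col_{i}" for i in range(1, 9)],
)
target_cols = ["col_2", "col_5"]  # hypothetical stand-ins for the 4 target columns

# Numeric columns that are not targets, kept in their original order
y_cols = [c for c in df.select_dtypes(include="number").columns
          if c not in target_cols]

# First melt: one row per (original row, y column), targets kept as id columns
long_y = df.melt(id_vars=target_cols, value_vars=y_cols,
                 var_name="y_col", value_name="y")
# Second melt: the target columns also collapse into a name/value pair
long_df = long_y.melt(id_vars=["y_col", "y"], value_vars=target_cols,
                      var_name="x_col", value_name="x")

# One facet per (y column, x column) pair
grid = sns.relplot(data=long_df, x="x", y="y", row="y_col", col="x_col",
                   kind="scatter", height=2,
                   facet_kws={"sharex": False, "sharey": False})
```

With 81 numeric columns and 4 targets this would put every panel into one very tall figure, so for the real dataset it may be more practical to filter `long_df` to a few `y_col` values at a time.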

Corralien · Accepted Answer · 2025-10-30 09:45:31Z

2

You need to pass the Axes you want to draw on explicitly; otherwise everything is drawn on the current Axes, which is the last one created (the fourth). And, if I understand correctly, the nested loops also need to be swapped so that the outer loop runs over the numeric columns:

specific_x_cols = ['col_32','col_69','col_74','col_80']
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()

for y_col in numeric_cols:
    fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
    for x_col, ax in zip(specific_x_cols, axs.flatten()):
        sns.scatterplot(x=x_col, y=y_col, data=df, ax=ax)
        ax.set_title(f"Scatterplot of {y_col} against {x_col}")
        ax.grid()
        # customize the current Axes here
    # customize the current Figure here
    fig.tight_layout()  # apply tight layout once all four Axes are drawn

    # rest of your code here
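
As one way the "rest of your code" could be filled in, the sketch below saves each 2x2 figure to its own file and closes it afterwards, so that dozens of figures do not stay open in memory. The stand-in dataframe, the four stand-in target names, the temporary output directory and the file-name pattern are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; figures are only written to disk
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Small stand-in for the question's dataframe (assumption)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(5.5, 30.75, size=(40, 6)),
                  columns=[f"col_{i}" for i in range(1, 7)])
df.insert(0, "Location", np.repeat(["A", "B"], 20))

specific_x_cols = ["col_2", "col_3", "col_4", "col_5"]  # hypothetical targets
numeric_cols = [c for c in df.select_dtypes(include="number").columns
                if c not in specific_x_cols]

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
saved = []
for y_col in numeric_cols:
    fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
    for x_col, ax in zip(specific_x_cols, axs.flatten()):
        sns.scatterplot(x=x_col, y=y_col, data=df, ax=ax)
        ax.set_title(f"{y_col} against {x_col}")
        ax.grid(True)
    fig.tight_layout()
    path = out_dir / f"{y_col}_scatterplots.png"  # one file per numeric column
    fig.savefig(path)
    plt.close(fig)  # release the figure before creating the next one
    saved.append(path)
```

Naming the file after `y_col` inside the outer loop gives one PNG per numeric column, each holding its four scatter plots.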



1 Comment

Panda Kim Oct 30 at 10:01

I think this is an excellent answer. After reviewing the if condition in the OP's code, I have just one suggestion: numeric_cols = df.select_dtypes(include=['number']).columns.difference(specific_x_cols)
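
One detail worth knowing about the suggestion above: `Index.difference` sorts its result by default, and string labels sort lexicographically, so col_10 would come before col_2. The sketch below uses a tiny made-up dataframe to contrast it with a boolean mask built from `Index.isin`, which keeps the original column order; the column names are chosen only to make the difference visible.

```python
import pandas as pd

# Tiny made-up dataframe; column order is deliberately not sorted
df = pd.DataFrame({"col_10": [1.0], "col_2": [2.0], "col_1": [3.0], "name": ["a"]})
specific_x_cols = ["col_2"]  # hypothetical target column

numeric = df.select_dtypes(include="number").columns

# difference() drops the targets and sorts the remaining labels
sorted_cols = numeric.difference(specific_x_cols)

# A boolean mask drops the targets but keeps the dataframe's column order
ordered_cols = numeric[~numeric.isin(specific_x_cols)]

print(list(sorted_cols))   # ['col_1', 'col_10']
print(list(ordered_cols))  # ['col_10', 'col_1']
```

Either form removes the target columns; which one to use depends on whether the order in which the figures are produced matters.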

Abdullah Butt · Accepted Answer · 2025-10-30 07:06:05Z

0

import pandas as pd
import matplotlib.pyplot as plt
# df = pd.read_csv("your_data.csv")
target_col = "target_column"  # placeholder: a single column name (a string), not a list of columns
# Select numeric columns except the target
numeric_cols = df.select_dtypes(include=['number']).columns.drop(target_col)
# Loop through and plot
for col in numeric_cols:
    plt.figure(figsize=(6, 4))
    plt.scatter(df[col], df[target_col], alpha=0.6)
    plt.title(f"{col} vs {target_col}")
    plt.xlabel(col)
    plt.ylabel(target_col)
    plt.grid(True)
    plt.show()


2 Comments

RayX500 Oct 30 at 7:15

With your code, I got ValueError: x and y must be the same size. I had replaced target_col = "target_column" with target_col = ['col_32','col_69','col_74','col_80'].

Community Oct 30 at 7:15

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
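
The ValueError reported in the first comment is consistent with how pandas indexing works: selecting with a list of names returns a 2-D DataFrame, while `plt.scatter` expects x and y to have the same number of elements. A minimal sketch of one way around it, assuming a small stand-in dataframe and two hypothetical target names, is to loop over the targets and plot one pair of 1-D columns at a time:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Small stand-in for the question's dataframe (assumption)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(5.5, 30.75, size=(40, 6)),
                  columns=[f"col_{i}" for i in range(1, 7)])
target_cols = ["col_2", "col_5"]  # hypothetical stand-ins for the 4 targets

# df["col_1"] has shape (40,), but df[target_cols] has shape (40, 2),
# which is why passing the list straight to plt.scatter fails.

# Index.drop accepts a list of labels, so all targets are excluded at once
numeric_cols = df.select_dtypes(include="number").columns.drop(target_cols)

n_figs = 0
for col in numeric_cols:
    for target in target_cols:
        fig, ax = plt.subplots(figsize=(6, 4))
        ax.scatter(df[target], df[col], alpha=0.6)  # target on x, numeric column on y
        ax.set_title(f"{col} against {target}")
        ax.set_xlabel(target)
        ax.set_ylabel(col)
        ax.grid(True)
        plt.close(fig)  # in real use, save or show the figure before closing
        n_figs += 1
```

Note that this sketch puts the target on the x-axis, matching the question's wording; the answer above plots the numeric column on x instead.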
