My name is Luis Francisco Gomez and I am taking the course Intermediate Python > 1 Matplotlib > Sizes, which belongs to Data Scientist with Python on DataCamp. I am reproducing the course exercises; in this one you have to make a scatter plot in which the size of each point corresponds to the population of the country. I tried to reproduce DataCamp's results with this code:

# load subpackage
import matplotlib.pyplot as plt

## load other libraries
import pandas as pd
import numpy as np

## import data
gapminder = pd.read_csv("https://assets.datacamp.com/production/repositories/287/datasets/5b1e4356f9fa5b5ce32e9bd2b75c777284819cca/gapminder.csv")
gdp_cap = gapminder["gdp_cap"].tolist()
life_exp = gapminder["life_exp"].tolist()

# create an np array that contains the population
pop = gapminder["population"].tolist()
pop_np = np.array(pop)


plt.scatter(gdp_cap, life_exp, s = pop_np*2)

# Previous customizations
plt.xscale('log') 
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000, 10000, 100000],['1k', '10k', '100k'])

# Display the plot
plt.show()

However, I get this:

enter image description here

But in theory you need to get this:

enter image description here

I don't understand what the problem is with the argument s in plt.scatter.

  • Sorry, it is plt.scatter(gdp_cap, life_exp, s = pop_np*2). I corrected the mistake and the problem is the same. Commented Feb 20, 2020 at 22:09

3 Answers

You need to scale your s:

plt.scatter(gdp_cap, life_exp, s = pop_np*2/1000000)

Per the docs, s is the marker size in points**2, so raw population counts produce markers far larger than the figure.
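To see why the unscaled values swamp the figure, here is a rough sketch of the arithmetic. It assumes the default circular marker is about sqrt(s) points wide and that there are 72 points per inch; the helper marker_width_inches and the population value are made up for illustration and are not taken from gapminder.csv.

```python
import math

POINTS_PER_INCH = 72  # typographic points per inch


def marker_width_inches(s):
    """Approximate marker width in inches for a scatter size s in points**2.

    Assumption: the marker is roughly sqrt(s) points wide.
    """
    return math.sqrt(s) / POINTS_PER_INCH


# Illustrative population for a very large country (hypothetical value)
population = 1_300_000_000

raw = marker_width_inches(population * 2)               # s as in the question
scaled = marker_width_inches(population * 2 / 1_000_000)  # s as in this answer

print(f"unscaled: {raw:.0f} in wide, scaled: {scaled:.2f} in wide")
```

Under these assumptions the unscaled marker is hundreds of inches across, while the scaled one is well under an inch, which fits on an ordinary figure.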

1 Comment

Thank you Scoot Boston. This is really strange: in the DataCamp course they don't have this problem. Maybe they rescale the array and don't show that to the student. Thank you for your help. Sorry if this was a stupid question.

This is because your sizes are too large; scale them down. Also, there's no need to create all the intermediate arrays:

plt.scatter(gapminder.gdp_cap, 
            gapminder.life_exp, 
            s=gapminder.population/1e6)

Output:

enter image description here

1 Comment

Thank you, Quang Hoand, for editing the images in my post and also for answering my question. Remember that I am new to Python, so I may take unnecessary steps along the way because I am following an online course. Thank you very much again.

I think you should use

plt.scatter(gdp_cap, life_exp, s = np.array(gdp_cap)*2)  # gdp_cap is a list, so convert it before multiplying

or maybe reduce or scale pop_np
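One way to "reduce or scale" the sizes without picking a divisor by hand is to normalize them so the largest marker gets a chosen area. The following is only a sketch under assumptions: scale_sizes and its max_area default are hypothetical, and the three data points are invented rather than read from gapminder.csv.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def scale_sizes(values, max_area=2000.0):
    """Rescale values linearly so the largest one maps to max_area (points**2)."""
    values = np.asarray(values, dtype=float)
    return values / values.max() * max_area


# Made-up numbers for illustration only
gdp_cap = [1000.0, 10000.0, 40000.0]
life_exp = [50.0, 70.0, 80.0]
population = [30_000_000, 200_000_000, 1_300_000_000]

sizes = scale_sizes(population)
points = plt.scatter(gdp_cap, life_exp, s=sizes)
plt.xscale("log")
```

In an interactive session you would drop the matplotlib.use("Agg") line and finish with plt.show(). The marker areas stay proportional to population, and only the overall scale changes.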

1 Comment

Thank you for your help @simomod. I had never used Stack Overflow, but this resource is fantastic.
