
Every time I run pandas-profiling, no matter which dataset I use, the notebook shows this error:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices.

import pandas as pd
from pandas_profiling import ProfileReport

# Raw string so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r'H:\DATA Sets\cereal.csv')

profile = ProfileReport(df, title='cereal-eda', html={'style': {'full_width': True}})

Dataset used: cereal.csv from Kaggle, https://www.kaggle.com/crawford/80-cereals

  • df is not defined... Commented Jan 29, 2022 at 18:08
  • oh yes typing mistake... but still getting error Commented Jan 29, 2022 at 18:42

1 Answer


Edit: A PR has already been opened to fix this. The problem seems to occur with pandas 1.4.0 and 1.4.1. See this issue on pandas-profiling's GitHub.
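To see whether your environment matches the versions mentioned above, a small check along these lines may help. It is only a sketch: the helper names `installed_version` and `is_affected` are made up for illustration, and the set of affected versions is an assumption taken from this answer, not an official list.

```python
from importlib import metadata

# Assumption: only these pandas versions trigger the error (as stated in this answer).
AFFECTED_PANDAS = {"1.4.0", "1.4.1"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_affected(pandas_version):
    """Return True if the given pandas version is one of the assumed affected ones."""
    return pandas_version.strip() in AFFECTED_PANDAS


for name in ("pandas", "pandas-profiling"):
    print(name, installed_version(name))

pandas_version = installed_version("pandas")
if pandas_version is not None and is_affected(pandas_version):
    print("This pandas version is one of those reported to trigger the error.")
```

If the check matches, either wait for the fixed pandas-profiling release or apply the manual change described below.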

I think the error occurs because NumPy deprecated the style of array indexing used by one of pandas-profiling's modules.

If you get the same traceback I do, with the error raised in pandas_profiling.model.pandas.utils_pandas, you should be able to fix it by changing:

w_median = data[weights == np.max(weights)][0]

to

w_median = data[np.where(weights == np.max(weights))][0]

in the weighted_median function in $(YOUR_VIRTUAL_ENVIRONMENT_OR_PYTHON_DIR)/lib/python$(PYVERSION)/site-packages/pandas_profiling/model/pandas/utils_pandas.py

(line 13 in pandas-profiling version 3.1.0)
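To show what the changed line does in isolation, here is a minimal sketch of the same indexing pattern on plain NumPy arrays. The function name `value_at_max_weight` is hypothetical and this is not the library's `weighted_median`; it only mirrors the one expression being patched.

```python
import numpy as np


def value_at_max_weight(data, weights):
    """Hypothetical helper: first element of `data` whose weight equals the maximum weight."""
    data = np.asarray(data)
    weights = np.asarray(weights)
    # Boolean mask marking every position that holds the largest weight.
    mask = weights == np.max(weights)
    # np.where(mask) returns a tuple of integer index arrays,
    # which is a valid way to index a NumPy array.
    positions = np.where(mask)
    return data[positions][0]


print(value_at_max_weight([10, 20, 30], [1, 5, 2]))  # prints 20
```

The point of the change is that `np.where` turns the comparison result into explicit integer positions before indexing, so the index no longer depends on how the comparison result itself is interpreted.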


2 Comments

This is still an issue, and this fix worked for me with pandas==1.4.1 and pandas-profiling==3.1.0.
Using pandas==1.4.1 and pandas-profiling==3.1.0 on their own did not fix it for me, but the code hack above did.
