Python DataFrame TypeError: only integer scalar arrays can be converted to a scalar index

Question

I know there are already several questions about this error, but I'm not sure whether any of them solves my particular case. I have the following code and I want to print the column "y" of the DataFrame df. The following error occurs: TypeError: only integer scalar arrays can be converted to a scalar index

labels=[]
xvectors=[]

for i in data:
    labels.append(i[0])
    xvectors.append(i[1])

X = np.array(xvectors)
y = np.array(labels)

feat_cols = [ 'xvec'+str(i) for i in range(X.shape[1]) ]
print(feat_cols)
df = pd.DataFrame(X,columns=[feat_cols])
df['y']= y
#df['label'] = df['y'].apply(lambda i: str(i))
print(df['y'])

X, y = None, None

Printing the whole DataFrame works. It looks like this:

        xvec0     xvec1     xvec2     xvec3     xvec4  ...   xvec508   xvec509   xvec510   xvec511        y
0    3.397163 -1.112423  0.414708  0.563083  1.371336  ...  1.201095 -0.076261 -0.620443 -1.231465  DA01_03
1    0.159473  1.884818 -1.511547 -0.153500 -0.635701  ... -1.217205 -1.922081  0.878613  0.087912  DA01_06
2    1.089404  0.331919 -1.027480  0.594129 -2.473234  ... -3.505570 -3.509632 -0.553128 -0.453307  DA01_10
3    0.183993 -1.741467 -0.142570 -3.158320  4.355789  ...  3.857311  3.142393  0.991663 -2.842322  DA01_14

This is the whole error message:

    print(df['y'])
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/frame.py", line 2958, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py", line 3270, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/internals/managers.py", line 960, in get
    return self.iget(loc)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/internals/managers.py", line 977, in iget
    block = self.blocks[self._blknos[i]]
TypeError: only integer scalar arrays can be converted to a scalar index

I think it has something to do with the numpy array. Thank you in advance!

The same error occurs when printing any other column. i[0] holds a string like the ones in the rightmost column: DA01_03, DA01_06, DA01_10, DA01_14.
– 3r1c, Commented Apr 10, 2020 at 16:59
Maybe print(dfA.reset_index())? It does indeed look to be due to how X is constructed. dfA.info() might give us a better hint.
– gosuto, Commented Apr 10, 2020 at 17:07
Thanks for your answer. df.info() gives: <class 'pandas.core.frame.DataFrame'> RangeIndex: 109 entries, 0 to 108 Columns: 513 entries, (xvec0,) to (y,) dtypes: float32(512), object(1) memory usage: 219.0+ KB None. reset_index() only returns the df; it doesn't fix the problem.
– 3r1c, Commented Apr 10, 2020 at 17:14

gosuto · Accepted Answer · 2020-04-10 17:18:26Z

2

Ah, you pass your columns argument as a list inside a list (feat_cols is already a list). That makes pandas build multi-level column headers (a MultiIndex) whose labels are tuples: you can see that df.info() says the columns range from (xvec0,) to (y,) instead of from xvec0 to y.

Passing columns=feat_cols should do the trick :-)
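
For anyone who wants to see the difference side by side, here is a minimal sketch. The small X and y arrays are made-up stand-ins for the asker's data (which isn't shown), and the names df_nested, nested_is_multi and first_nested_label are only for illustration. Whether df['y'] actually raises on the nested version may depend on your pandas version, so the sketch only inspects the column labels there.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 4 rows of 3 features plus string labels.
X = np.arange(12, dtype=np.float32).reshape(4, 3)
y = np.array(["DA01_03", "DA01_06", "DA01_10", "DA01_14"])

feat_cols = ["xvec" + str(i) for i in range(X.shape[1])]

# Nested list: pandas builds a one-level MultiIndex, so each label is a tuple.
df_nested = pd.DataFrame(X, columns=[feat_cols])
nested_is_multi = isinstance(df_nested.columns, pd.MultiIndex)
first_nested_label = df_nested.columns[0]
print(nested_is_multi, first_nested_label)

# Flat list: ordinary column labels, and selecting "y" works as expected.
df = pd.DataFrame(X, columns=feat_cols)
df["y"] = y
print(df["y"])
```

If you are unsure which case you are in, checking df.columns (or df.columns.nlevels) is a quick way to spot tuple labels.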


2 Comments

3r1c Over a year ago

Oh my god. I didn't see that! Thank you!

jim washer Over a year ago

Wow, I just made the exact same mistake. Thanks for having answered this!!
