Appending a numpy array to a multiindex dataframe

Question

I have some trouble populating a pandas DataFrame. I am following the instructions found here to produce a MultiIndex DataFrame. The example works fine, except that I want to store an array instead of a single value.

activity = 'Open_Truck'
id = 1
index = pd.MultiIndex.from_tuples([(activity, id)], names=['activity', 'id'])
v = pd.Series(np.random.randn(1, 5), index=index)

Exception: Data must be 1-dimensional

If I replace randn(1, 5) with randn(1) it works fine. randn(1, 1) also works, provided I flatten it with randn(1, 1).flatten('F'). But when I try:

v = pd.Series(np.random.randn(1, 5).flatten('F'), index=index)

ValueError: Wrong number of items passed 5, placement implies 1

My intention is to add one feature vector per row, for each activity and id (in the real case the vectors are existing np.array objects, not np.random.randn output).
So how do I add an array to a MultiIndex DataFrame?

Edit:
As I am new to pandas, I mixed up Series and DataFrame. I can achieve the above using a DataFrame, which is two-dimensional by default:

arrays = [np.array(['Open_Truck']*2),
            np.array(['1', '2'])]
df = pd.DataFrame(np.random.randn(2, 4), index=arrays)
df
               0         1         2         3
Open 1 -0.210923  0.184874 -0.060210  0.301924
     2  0.773249  0.175522 -0.408625 -0.331581

I see your edit. The same principle applies: the data must have the same length as the MultiIndex.
– jezrael, Commented May 14, 2018 at 9:18

jezrael · Accepted Answer · 2018-05-14 08:00:54Z

The problem is that the MultiIndex contains only one tuple while the data has length 5, so the lengths do not match. Repeat the tuple 5 times:

activity = 'Open_Truck'
id = 1
# repeat the tuple 5 times so the index length matches the data length
index = pd.MultiIndex.from_tuples([(activity, id)] * 5, names=['activity', 'id'])
print (index)
MultiIndex(levels=[['Open_Truck'], [1]],
           labels=[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]],
           names=['activity', 'id'])

print (len(index))
5

v = pd.Series(np.random.randn(1, 5).flatten('F'), index=index)
print (v)
activity    id
Open_Truck  1    -1.348832
            1    -0.706780
            1     0.242352
            1     0.224271
            1     1.112608
dtype: float64

In the first approach the lengths are the same, 1, because the list contains a single tuple:

activity = 'Open_Truck'
id = 1
index = pd.MultiIndex.from_tuples([(activity, id)], names=['activity', 'id'])

print (len(index))
1

v = pd.Series(np.random.randn(1), index=index)
print (v)
activity    id
Open_Truck  1    -1.275131
dtype: float64
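
The two cases above come down to one rule: the 1-dimensional data and the index must have the same length. A minimal sketch of that rule follows; the names `id_`, `short_index`, `full_index`, and `mismatch_raised` are chosen here for illustration, and the exact error message differs between pandas versions, so only the exception type is checked.

```python
import numpy as np
import pandas as pd

activity, id_ = "Open_Truck", 1
values = np.random.randn(1, 5).ravel()  # 5 values, flattened to 1-D

# One index entry for 5 values: the lengths differ, so pandas raises ValueError.
short_index = pd.MultiIndex.from_tuples([(activity, id_)], names=["activity", "id"])
try:
    pd.Series(values, index=short_index)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Repeating the tuple once per value makes the lengths agree.
full_index = pd.MultiIndex.from_tuples(
    [(activity, id_)] * len(values), names=["activity", "id"]
)
v = pd.Series(values, index=full_index)
print(mismatch_raised, len(v))
```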


1 Comment

Eypros Over a year ago

Yeah, but in this way I get an array column-wise. Is there any way to get it row-wise? When I append new arrays they are added below the previous one.
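
One possible way to get the row-wise layout asked about in this comment is to use a DataFrame instead of a Series, with each feature vector reshaped to a single row and the rows combined with pd.concat. This is only a sketch and not part of the accepted answer; `make_row` is a hypothetical helper, and it assumes every feature vector has the same length.

```python
import numpy as np
import pandas as pd

def make_row(activity, id_, features):
    """Hypothetical helper: wrap one 1-D feature vector as a one-row DataFrame."""
    index = pd.MultiIndex.from_tuples([(activity, id_)], names=["activity", "id"])
    # reshape(1, -1) turns shape (n,) into (1, n): one row, n columns
    return pd.DataFrame(np.asarray(features).reshape(1, -1), index=index)

# Stand-ins for real feature vectors (assumed to share one length)
rows = [
    make_row("Open_Truck", 1, np.random.randn(5)),
    make_row("Open_Truck", 2, np.random.randn(5)),
]

# Each new vector becomes a new row below the previous ones
df = pd.concat(rows)
print(df)
```

Here each (activity, id) pair labels exactly one row, so the index length equals the number of rows rather than the number of values. Collecting the one-row frames in a list and calling pd.concat once is usually cheaper than concatenating inside a loop.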
