2

The single index of df1 matches with a sublevel of multiindex of df2. Both have the same columns. I want to copy all rows and columns of df1 to df2.

It is similar to this thread: copying a single-index DataFrame into a MultiIndex DataFrame

But that solution only work for one index value, the index 'a' in that case. I want to do this operation for all index of df1.

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: import itertools
In [4]: inner = ('a','b')
In [5]: outer = ((10,20), (1,2))
In [6]: cols = ('one','two','three','four')
In [7]: sngl = pd.DataFrame(np.random.randn(2,4), index=inner, columns=cols)
In [8]: index_tups = list(itertools.product(*(outer + (inner,))))
In [9]: index_mult = pd.MultiIndex.from_tuples(index_tups)
In [10]: mult = pd.DataFrame(index=index_mult, columns=cols)
In [11]: sngl
Out[11]: 
        one       two     three      four
a  2.946876 -0.751171  2.306766  0.323146
b  0.192558  0.928031  1.230475 -0.256739

In [12]: mult
Out[12]: 
        one  two three four
10 1 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
   2 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
20 1 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
   2 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN


In [13]: mult.ix[(10,1)] = sngl

In [14]: mult
Out[14]: 
        one  two three four
10 1 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
   2 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
20 1 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN
   2 a  NaN  NaN   NaN  NaN
     b  NaN  NaN   NaN  NaN

The solution given by @Jeff is

nm = mult.reset_index().set_index('level_2')
nm.loc['a',sngl.columns] = sngl.loc['a'].values

         level_0  level_1        one        two     three        four
level_2                                                              
a             10        1  0.3738456 -0.2261926 -1.205177  0.08448757
b             10        1        NaN        NaN       NaN         NaN
a             10        2  0.3738456 -0.2261926 -1.205177  0.08448757
b             10        2        NaN        NaN       NaN         NaN
a             20        1  0.3738456 -0.2261926 -1.205177  0.08448757
b             20        1        NaN        NaN       NaN         NaN
a             20        2  0.3738456 -0.2261926 -1.205177  0.08448757
b             20        2        NaN        NaN       NaN         NaN

I can't do this:

nm.loc[:,sngl.columns] = sngl.loc[:].values

It will raise ValueError: "cannot copy sequence with size X to array axis with dimension Y"

I am currently using a loop. But this is not the pandas way.

1 Answer 1

1

This feels a little too manual, but in practice I might do something like this:

In [46]: mult[:] = sngl.loc[mult.index.get_level_values(2)].values

In [47]: mult
Out[47]: 
             one       two     three      four
10 1 a  1.175042  0.044014  1.341404 -0.223872
     b  0.216168 -0.748194 -0.546003 -0.501149
   2 a  1.175042  0.044014  1.341404 -0.223872
     b  0.216168 -0.748194 -0.546003 -0.501149
20 1 a  1.175042  0.044014  1.341404 -0.223872
     b  0.216168 -0.748194 -0.546003 -0.501149
   2 a  1.175042  0.044014  1.341404 -0.223872
     b  0.216168 -0.748194 -0.546003 -0.501149

That is, first select the elements we want to use to index:

In [64]: mult.index.get_level_values(2)
Out[64]: Index(['a', 'b', 'a', 'b', 'a', 'b', 'a', 'b'], dtype='object')

Then use these to index into sngl:

In [65]: sngl.loc[mult.index.get_level_values(2)]
Out[65]: 
        one       two     three      four
a  1.175042  0.044014  1.341404 -0.223872
b  0.216168 -0.748194 -0.546003 -0.501149
a  1.175042  0.044014  1.341404 -0.223872
b  0.216168 -0.748194 -0.546003 -0.501149
a  1.175042  0.044014  1.341404 -0.223872
b  0.216168 -0.748194 -0.546003 -0.501149
a  1.175042  0.044014  1.341404 -0.223872
b  0.216168 -0.748194 -0.546003 -0.501149

and then we can use .values to throw away the indexing information and just get the raw array to fill with.

It's not very elegant, but it's straightforward.

Sign up to request clarification or add additional context in comments.

1 Comment

cool! What if add an additional hierarchical index to both sng1 and mult? mult.index.get_level_values(2) can only work for one level. sng2=pd.concat([sng1,sng1],keys=['X','Y']), mult2=pd.concat([mult,mult],keys=['X','Y']). To make this thread clear, I posted a new one: stackoverflow.com/questions/32748910/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.