Creating DataFrame with Hierarchical Columns

Question

What is the easiest way to create a DataFrame with hierarchical columns?

I am currently creating a DataFrame from a dict of names -> Series using:

df = pd.DataFrame(data=serieses)

I would like to use the same column names but add an additional level of hierarchy on the columns. For the time being I want the additional level to have the same value for all columns, let's say "Estimates".

I am trying the following but that does not seem to work:

pd.DataFrame(data=serieses,columns=pd.MultiIndex.from_tuples([(x, "Estimates") for x in serieses.keys()]))

All I get is a DataFrame with all NaNs.

For example, what I am looking for is roughly:

l1               Estimates    
l2  one  two  one  two  one  two  one  two
r1   1    2    3    4    5    6    7    8
r2   1.1  2    3    4    5    6    71   8.2

where l1 and l2 are the labels for the MultiIndex

Alex Rothberg · Accepted Answer · 2013-08-02 02:13:37Z

16

This appears to work:

import pandas as pd

data = {'a': [1,2,3,4], 'b': [10,20,30,40],'c': [100,200,300,400]}

df = pd.concat({"Estimates": pd.DataFrame(data)}, axis=1, names=["l1", "l2"])

l1  Estimates         
l2          a   b    c
0           1  10  100
1           2  20  200
2           3  30  300
3           4  40  400
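
The same idea can be sketched for a dict of Series, as in the question. The contents of `serieses` and the second outer label "Actuals" are made up for illustration; only the `pd.concat` call comes from the answer above.

```python
import pandas as pd

# Hypothetical stand-in for the question's dict of name -> Series.
serieses = {
    "one": pd.Series([1.0, 1.5], index=["r1", "r2"]),
    "two": pd.Series([2.0, 2.0], index=["r1", "r2"]),
}

flat = pd.DataFrame(data=serieses)

# The dict key becomes the outer column level; `names` labels both levels.
df = pd.concat({"Estimates": flat}, axis=1, names=["l1", "l2"])

# Several outer labels work the same way ("Actuals" is an invented example).
both = pd.concat({"Estimates": flat, "Actuals": flat * 2}, axis=1, names=["l1", "l2"])

print(df)
print(both["Actuals"])
```

Selecting `both["Actuals"]` returns the inner columns `one` and `two` for that group only.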

answered Aug 2, 2013 at 2:13


1 Comment

Rutger Kassies Over a year ago

That's very readable, I like it. Ultimately it might be best for Pandas to have better 'level' management, like a simple df.add_level(axis=1).

DimG · Accepted Answer · 2017-03-20 07:24:37Z

13

I know the question is really old but for pandas version 0.19.1 one can use direct dict-initialization:

d = {('a','b'):[1,2,3,4], ('a','c'):[5,6,7,8]}
df = pd.DataFrame(d, index=['r1','r2','r3','r4'])
df.columns.names = ('l1','l2')
print(df)

l1  a   
l2  b  c
r1  1  5
r2  2  6
r3  3  7
r4  4  8
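
A quick way to check whether the tuple keys really became hierarchical columns, rather than flat tuple labels, is to inspect the type of `df.columns`. This is only a sketch reusing the data from this answer; the variable name `is_hierarchical` is introduced here for illustration.

```python
import pandas as pd

d = {("a", "b"): [1, 2, 3, 4], ("a", "c"): [5, 6, 7, 8]}
df = pd.DataFrame(d, index=["r1", "r2", "r3", "r4"])
df.columns.names = ("l1", "l2")

# True when the tuple keys were turned into hierarchical columns.
is_hierarchical = isinstance(df.columns, pd.MultiIndex)
print(is_hierarchical)

# Selecting by the outer label returns the inner columns as a DataFrame.
print(df["a"])

# A full tuple selects a single column as a Series.
print(df[("a", "c")])
```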

answered Mar 20, 2017 at 7:24


2 Comments

zkytony Over a year ago

Does this still work? I tried direct dict initialization but the columns are just tuples

DimG Over a year ago

@zkytony, I've just checked that with a not-so-old 1.2.0 version and it still holds, at least on my machine. Have you tried upgrading your pandas installation? P.S. Same for the latest 1.3.3.

Rutger Kassies · Accepted Answer · 2013-08-01 12:49:14Z

2

I'm not sure, but I think a dict as input for your DataFrame and a MultiIndex don't play well together. Using an array as input instead makes it work.

I often prefer dicts as input though. One way is to set the columns after creating the df:

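import numpy as np
import pandas as pd

data = {'a': [1,2,3,4], 'b': [10,20,30,40],'c': [100,200,300,400]}
# list(...) is needed on Python 3, where dict.values() is a view rather than a list
df = pd.DataFrame(np.array(list(data.values())).T, index=['r1','r2','r3','r4'])

# pair the outer label with each key, in the dict's own order
tups = list(zip(['Estimates']*len(data), data.keys()))

df.columns = pd.MultiIndex.from_tuples(tups, names=['l1','l2'])

l1  Estimates         
l2          a   b    c
r1          1  10  100
r2          2  20  200
r3          3  30  300
r4          4  40  400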

Or when using an array as input for the df:

data_arr = np.array([[1,2,3,4],[10,20,30,40],[100,200,300,400]])

tups = list(zip(['Estimates']*data_arr.shape[0], ['a','b','c']))
df = pd.DataFrame(data_arr.T, index=['r1','r2','r3','r4'], columns=pd.MultiIndex.from_tuples(tups, names=['l1','l2']))

Which gives the same result.
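
A variation that avoids any question of key order, sketched here as an alternative to the code in this answer, is to build the new labels from `df.columns` itself with `pd.MultiIndex.from_product`. The data is the same as above.

```python
import pandas as pd

data = {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40], "c": [100, 200, 300, 400]}
df = pd.DataFrame(data, index=["r1", "r2", "r3", "r4"])

# Build the new labels from df.columns itself, so label order and
# data order cannot drift apart.
df.columns = pd.MultiIndex.from_product([["Estimates"], df.columns], names=["l1", "l2"])

print(df)
```

Because the DataFrame is built straight from the dict, this also skips the detour through a NumPy array and its transpose.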

edited Aug 1, 2013 at 12:49

answered Aug 1, 2013 at 6:27


5 Comments

Alex Rothberg Over a year ago

Is there a risk that the column ordering will be messed up in the dict example? In other words, when Pandas makes the DataFrame from a dict, it must pull the keys/values out of the dict, which will happen in arbitrary order. I think you assume the same order in the zip/list comprehension statement. This seems unsafe in the long term. I believe that when the columns keyword is set in DataFrame construction, Pandas attempts to ensure some sort of alignment.

Rutger Kassies Over a year ago

Good point, you want to avoid that indeed. Using np.array(data.values()).T together with data.keys() should be fine, I guess.

Alex Rothberg Over a year ago

According to docs, docs.python.org/2/library/stdtypes.html#dict.items, that new proposal does in fact seem safe.

Alex Rothberg Over a year ago

Is there any concern with calling transpose? For example, are there any cases in which dtypes get messed up?
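
One case where dtypes do change can be sketched as follows. The cause is not the transpose itself but the single 2-D NumPy array, which holds only one dtype. The mixed int/float data and the names `via_array` and `direct` are made up for illustration.

```python
import numpy as np
import pandas as pd

data = {"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]}

# Going through one 2-D array forces a single common dtype (float here),
# so the integer column "a" is upcast.
via_array = pd.DataFrame(np.array(list(data.values())).T, columns=list(data.keys()))

# Building straight from the dict keeps one dtype per column.
direct = pd.DataFrame(data)

print(via_array.dtypes)
print(direct.dtypes)
```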

Alex Rothberg Over a year ago

Do you think that it would make sense to allow creating this by creating a DataFrame of DataFrames? For example: pd.DataFrame({"Estimates":pd.DataFrame(data)}) ?

zkytony · Accepted Answer · 2021-09-15 13:21:47Z

2

The solution by Rutger Kassies worked in my case, but I have more than one column in the "upper level" of the column hierarchy. Just want to provide what worked for me as an example since it is a more general case.

First, I have data that looks like this:

> df
         (A, a)    (A, b)       (B, a)    (B, b) 
0         0.00     9.75         0.00       0.00
1         8.85     8.86         35.75      35.50
2         8.51     9.60         66.67      50.70
3         0.03     508.99       56.00      8.58

I would like it to look like this:

> df
                A                    B
           a        b            a          b
0         0.00     9.75         0.00       0.00
1         8.85     8.86         35.75      35.50
...

The solution is:

tuples = df.transpose().index
new_columns = pd.MultiIndex.from_tuples(tuples, names=['Upper', 'Lower'])
df.columns = new_columns

This is counter-intuitive because in order to create columns, I have to do it through index.

answered Sep 15, 2021 at 13:21


1 Comment

Joan Marcè i Igual Over a year ago

You could also do: new_columns = pd.MultiIndex.from_tuples(df.columns, names=['Upper', 'Lower']); df.columns = new_columns
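
The direct conversion from this comment can be tried end to end with the sketch below. The flat tuple columns are reproduced with `pd.Index(..., tupleize_cols=False)`, which is an assumption about how such a frame might arise. The values are the first two rows of the example above, and `was_flat` is a name introduced here.

```python
import pandas as pd

# tupleize_cols=False keeps the tuples as plain labels, reproducing the flat case.
flat_cols = pd.Index([("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")], tupleize_cols=False)
df = pd.DataFrame(
    [[0.00, 9.75, 0.00, 0.00], [8.85, 8.86, 35.75, 35.50]],
    columns=flat_cols,
)
was_flat = not isinstance(df.columns, pd.MultiIndex)

# Convert the tuple labels directly; no transpose is needed.
df.columns = pd.MultiIndex.from_tuples(df.columns, names=["Upper", "Lower"])

print(df["B"])
```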
