
I have an Excel file that looks like this:

[Image: screenshot of the Excel sheet]

When I read this with pandas.read_excel, pandas returns a df which looks like this:

                                        1998 Unnamed: 1  1999 Unnamed: 3  \
Angélus                                   20        -35    16         au   
Angludet                                  17         au    16         vo   
Arnaud de Jacquemeau                      16         vo    16         vo   
Ausone                                    20        -40    18        -25   
Barde-Haut                                17         au    17         vo   

Is there a way to tell pandas about the multicolumn so that the output is either

                                        1998       1998  1999       1999
Angélus                                   20        -35    16         au   
Angludet                                  17         au    16         vo   
Arnaud de Jacquemeau                      16         vo    16         vo   
Ausone                                    20        -40    18        -25   
Barde-Haut                                17         au    17         vo   

or

                                               1998            1999
Angélus                                   20        -35    16         au   
Angludet                                  17         au    16         vo   
Arnaud de Jacquemeau                      16         vo    16         vo   
Ausone                                    20        -40    18        -25   
Barde-Haut                                17         au    17         vo  

?

Thanks, Patrik

2 Answers


You could try:

df.columns = df.columns.to_series().mask(lambda s: s.astype(str).str.startswith('Unnamed')).ffill().tolist()
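
To make the forward-fill idea concrete, here is a minimal sketch on a small hand-built frame. The sample data only mirrors the first two rows of the question and is an assumption, as are the variable names; it blanks out the `Unnamed: N` placeholders and fills each gap with the last real label.

```python
import pandas as pd

# Hypothetical sample mirroring the frame shown in the question.
df = pd.DataFrame(
    [[20, "-35", 16, "au"], [17, "au", 16, "vo"]],
    index=["Angélus", "Angludet"],
    columns=[1998, "Unnamed: 1", 1999, "Unnamed: 3"],
)

labels = df.columns.to_series()
# True for the placeholder labels pandas generates for empty header cells.
is_placeholder = labels.astype(str).str.startswith("Unnamed")
# Replace placeholders with NaN, then carry the last real label forward.
filled = labels.mask(is_placeholder).ffill().tolist()

df.columns = filled
print(df.columns.tolist())
```

Note that this yields duplicate column labels, so `df[1998]` returns both columns for that year rather than a single Series.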


You would need to build a new list of column names and then assign it back to the columns, like this:

df.columns = df.columns.astype(str)    
new_columns = [df.columns[i-1]  if df.columns[i].find("Unnamed") >= 0 else df.columns[i] for i in range(len(df.columns))]
df.columns = new_columns

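Or you could do it in a single line (wrapping each label in str() so that integer labels such as 1998 do not raise an AttributeError):

df.columns = [df.columns[i-1] if str(df.columns[i]).startswith("Unnamed") else df.columns[i] for i in range(len(df.columns))]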
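
If the second layout from the question (one year spanning two columns) is what you want, one possible approach is to turn the filled labels into a two-level MultiIndex. The sketch below is only an illustration: the helper name `spanning_header_to_multiindex` and the sub-labels `"value"` and `"note"` are made up, since the sheet itself gives the two columns under each year no names.

```python
import pandas as pd

def spanning_header_to_multiindex(columns, sub_labels):
    """Forward-fill 'Unnamed: N' placeholders and pair each column with a sub-label."""
    top, last = [], None
    for label in columns:
        if not str(label).startswith("Unnamed"):
            last = label  # a real header: remember it for the columns that follow
        top.append(last)  # stays None if the very first column is a placeholder

    # Number the columns under each top-level header and cycle through sub_labels.
    seen, bottom = {}, []
    for t in top:
        k = seen.get(t, 0)
        bottom.append(sub_labels[k % len(sub_labels)])
        seen[t] = k + 1
    return pd.MultiIndex.from_arrays([top, bottom])

# Hypothetical sample mirroring the frame shown in the question.
df = pd.DataFrame(
    [[20, "-35", 16, "au"], [17, "au", 16, "vo"]],
    index=["Angélus", "Angludet"],
    columns=[1998, "Unnamed: 1", 1999, "Unnamed: 3"],
)
df.columns = spanning_header_to_multiindex(df.columns, ["value", "note"])
print(df[1998])  # both columns that sit under the 1998 header
```

Unlike the index-based list comprehension, this loop also copes with several consecutive `Unnamed` columns under one header.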

2 Comments

That's a nice one. Thanks
@Pat if this resolves your problem, please mark this as the answer.
