How to create multiple dataframes from an Excel data table

Question

I extracted this dataframe from an Excel spreadsheet using the pandas library. After selecting the columns I need, I have a table formatted like this:

    REF PLAYERS
0   103368  Andrés Posada Sanmiguel
1   300552  Diego Posada Sanmiguel
2   103304  Roberto Motta Stanziola
3   NaN NaN
4   REF PLAYERS
5   1047012 ANABELLA EISMANN DE AMAYA
6   104701  FERNANDO ENRIQUE AMAYA CASTRO
7   103451  AUGUSTO ANTONIO ALVARADO AZCARRAGA
8   103484  Kevin Adrian Villarreal Kam
9   REF PLAYERS
10  NaN NaN
11  NaN NaN
12  NaN NaN
13  NaN NaN
14  REF PLAYERS
15  NaN NaN
16  NaN NaN
17  NaN NaN
18  NaN NaN
19  REF PLAYERS

I want to create multiple dataframes, using each ['REF', 'PLAYERS'] row as the column names of a new dataframe. I also need to preserve the blank (NaN) values. Suggestions are welcome; I am a pandas newbie.

Based on this example, do you want 20 empty dataframes with the rows as column names? Am I right?
– Yulian, Commented Feb 20, 2021 at 13:09
This can be done by comparing the row value to "REF", "PLAYERS", then using groupby on the cumcount. I need a few minutes.
– Yulian, Commented Feb 20, 2021 at 13:13
So each dataframe should have just 1 row, the REF and the PLAYER?
– sophocles, Commented Feb 20, 2021 at 13:15
Each dataframe will have 2 columns and 4 rows. The empty values should be left intact.
– Onyilimba, Commented Feb 20, 2021 at 13:18

Yulian · Accepted Answer · 2021-02-20 13:53:37Z

1

For this to work, you must first read the dataframe from the file differently: pass header=None to pd.read_excel(). Otherwise the first "REF"/"PLAYERS" row becomes the column names, whereas we want every such row to stay in the data so that we can group on it.

With header=None, the columns are labelled with the integers 0 and 1, so the first line will be as follows, where df is the name of your dataframe:

# Set unique index for each group
df["group_id"] = (df[0] == "REF").cumsum()
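To see what the comparison and cumsum step produces, here is a minimal sketch on a small in-memory frame. The sample rows and the names raw, is_header and group_id are made up for illustration; the frame is built with integer column labels 0 and 1 to stand in for a sheet read with header=None.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_excel(..., header=None):
# columns are labelled 0 and 1, and header rows appear as ordinary data.
raw = pd.DataFrame(
    [
        ["REF", "PLAYERS"],
        [103368, "Andrés Posada Sanmiguel"],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
        [1047012, "ANABELLA EISMANN DE AMAYA"],
        ["REF", "PLAYERS"],
        [np.nan, np.nan],
    ]
)

# True on every repeated header row, False elsewhere (NaN == "REF" is False).
is_header = raw[0] == "REF"

# The running total of True values goes up by one at each header row,
# so all rows that belong to one block share the same number.
group_id = is_header.cumsum()
print(group_id.tolist())  # [1, 1, 1, 2, 2, 3, 3]
```

Rows that are entirely NaN simply inherit the number of the header row above them, which is why they end up in the right block later.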

Solution:

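# Set a unique index for each group; replace "name_of_first_column"
# with your first column's label (0 if you read the file with header=None)
df["group_id"] = (df["name_of_first_column"] == "REF").cumsum()

# Iterate over groups
dataframes = []
for name, group in df.groupby("group_id"):
    # work on a copy so the original df is left untouched
    df_ = group.copy()
    # promote 1st row to column names
    df_.columns = df_.iloc[0]
    # and drop it
    df_ = df_.iloc[1:]
    # keep only the data columns (this drops the group_id column)
    df_ = df_[["REF", "PLAYERS"]]
    # append to the list of dataframes
    dataframes.append(df_)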

All your dataframes are now stored in the list dataframes.
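For reference, the same idea can be packaged as a self-contained sketch and tried on made-up data. The function name split_on_header_rows and the sample rows are hypothetical, and the sketch assumes the very first row of the frame is a header row.

```python
import numpy as np
import pandas as pd

def split_on_header_rows(raw, header_marker="REF"):
    """Split a header=None frame into one DataFrame per repeated header row."""
    # Number the blocks: the counter goes up at every row whose first cell
    # equals the marker. Assumes the first row of `raw` is a header row.
    group_id = (raw[raw.columns[0]] == header_marker).cumsum()
    frames = []
    for _, block in raw.groupby(group_id):
        header = block.iloc[0].tolist()   # e.g. ["REF", "PLAYERS"]
        body = block.iloc[1:].copy()      # copy so `raw` is left untouched
        body.columns = header             # promote the header row
        frames.append(body.reset_index(drop=True))
    return frames

# Made-up sample: a filled block, an all-NaN block, and a trailing header row.
raw = pd.DataFrame(
    [
        ["REF", "PLAYERS"],
        [103368, "Andrés Posada Sanmiguel"],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
        [np.nan, np.nan],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
    ]
)

frames = split_on_header_rows(raw)
for frame in frames:
    print(frame, end="\n\n")
```

NaN rows are kept as they are, and a header row with nothing below it yields an empty dataframe that still has the REF and PLAYERS columns.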

edited Feb 20, 2021 at 13:53

answered Feb 20, 2021 at 13:31

Yulian


1 Comment

Onyilimba Over a year ago

This works. I will have to do some tweaking to it.

sophocles · Accepted Answer · 2021-02-20 13:36:27Z

1

You can split your dataframe into chunks of equal length (in your case, 4 rows for each df) using np.split.

Since you have 20 rows and want 4 rows per dataframe, you can split it into 5 different dataframes:

import numpy as np
dfs = [df.loc[idx] for idx in np.split(df.index,5)]
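Note that np.split raises a ValueError when the rows cannot be divided into equal parts. If you know the chunk size rather than the number of chunks, a possible alternative is plain iloc slicing. This is only a sketch: the helper name chunk_rows and the sample data are made up.

```python
import pandas as pd

def chunk_rows(df, rows_per_chunk):
    """Return consecutive slices of `df`, each with at most `rows_per_chunk` rows."""
    return [
        df.iloc[start:start + rows_per_chunk]
        for start in range(0, len(df), rows_per_chunk)
    ]

# Made-up data: 10 rows do not divide evenly into chunks of 4.
df = pd.DataFrame({"REF": range(10), "PLAYERS": list("abcdefghij")})
chunks = chunk_rows(df, 4)
print([len(c) for c in chunks])  # [4, 4, 2]
```

The last chunk is simply shorter when the row count is not a multiple of the chunk size. Like np.split, this ignores where the REF/PLAYERS rows actually are, so it only lines up with them when every block has the same length.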

And then create your individual dataframes:

df1 = dfs[1]
df1

                                  REF PLAYERS
4                                 REF PLAYERS
5           1047012 ANABELLA EISMANN DE AMAYA
6       104701  FERNANDO ENRIQUE AMAYA CASTRO
7  103451  AUGUSTO ANTONIO ALVARADO AZCARRAGA


df2 = dfs[2]
df2
                            REF PLAYERS
8   103484  Kevin Adrian Villarreal Kam
9                           REF PLAYERS
10                              NaN NaN
11                              NaN NaN

answered Feb 20, 2021 at 13:36

sophocles


2 Comments

Yulian Over a year ago

Yes, this will also work, but only if there are always 4 rows per dataframe, as in the given example.

sophocles Over a year ago

I thought that is what you're after?
