I have a code to transofrm an excel file with data whioch has two headers into a dataframe. Somehow my code is not doing that what i expect:
import pandas as pd
df = pd.read_excel('excel_file.xlsx', sheet_name='Class', header=None, names= header_names, skiprows=2)
somehow I receive a weird output:
NaN NaN NaT NaN NaN NaT
3 NaN NaT
2015-06-30 4 NaN
etc.
I dont know what I am doing wrong.
My excel file is structured as the following but has even more columns.
A B C
math biology physics
01.06.2015 3 01.06.2015 3 01.06.2015 3
30.06.2016 4 30.06.2016 4 30.06.2016 4
15.06.2017 1 15.06.2017 3 15.06.2017 3
20.06.2018 3 20.06.2018 2 20.06.2018 5
15.06.2019 2 15.06.2019 3 15.06.2019 2
The expected dataframe output should be like this:
date grade course group
01.06.2015 3 math A
30.06.2016 4 math A
15.06.2017 1 math A
20.06.2018 3 math A
15.06.2019 2 math A
01.06.2015 3 biology B
30.06.2016 4 biology B
15.06.2017 3 biology B
20.06.2018 2 biology B
15.06.2019 3 biology B
01.06.2015 3 physics C
30.06.2016 4 physics C
15.06.2017 3 physics C
20.06.2018 5 physics C
15.06.2019 2 physics C