How to remove repeated headers from a DataFrame and keep just the first one (Python)

Question

I'm working with a CSV file that contains the same header row repeated several times, as in this example:

1                     2     3   4
0            POSITION_T  PROB  ID  
1                 2.385   2.0   1  
2            POSITION_T  PROB  ID 
3                 3.074   6.0   3  
4                 6.731   8.0   4    
6            POSITION_T  PROB  ID  
7                12.508   2.0   1  
8                12.932   4.0   2  
9                12.985   4.0   2

I want to remove the duplicated headers to get a final document like this:

0            POSITION_T  PROB  ID  
1                 2.385   2.0   1   
3                 3.074   6.0   3  
4                 6.731   8.0   4     
7                12.508   2.0   1  
8                12.932   4.0   2  
9                12.985   4.0   2

I am trying to remove these headers with:

df1 = [df!='POSITION_T'][df!='PROB'][df!='ID']

But that produces the error TypeError: Could not compare ['TRACK_ID'] with block values. Any ideas? Thanks in advance!

What does the actual text file look like?

– piRSquared, Commented Sep 1, 2017 at 16:10

RomanPerekhrest · Accepted Answer · 2017-09-01 16:29:11Z

4

Filtering out by field value:

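import pandas as pd

# Read the whitespace-delimited file without a header row, skipping the first line
# (sep=r'\s+' replaces the deprecated delim_whitespace=True)
df = pd.read_table('yourfile.csv', header=None, sep=r'\s+', skiprows=1)
df.columns = ['0', 'POSITION_T', 'PROB', 'ID']
del df['0']

# filtering out the rows with `POSITION_T` value in corresponding column
df = df[df.POSITION_T.str.contains('POSITION_T') == False]

print(df)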

The output:

  POSITION_T PROB ID
1      2.385  2.0  1
3      3.074  6.0  3
4      6.731  8.0  4
6     12.508  2.0  1
7     12.932  4.0  2
8     12.985  4.0  2
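Note that after this filter the remaining columns still hold strings, because each column was read with header text mixed into it. A minimal sketch of the same approach with a numeric conversion added, assuming a small in-memory sample (the `text` variable is hypothetical and stands in for 'yourfile.csv'):

```python
import io
import pandas as pd

# Hypothetical stand-in for 'yourfile.csv', shaped like the frame in the question
text = """\
1 2 3 4
0 POSITION_T PROB ID
1 2.385 2.0 1
2 POSITION_T PROB ID
3 3.074 6.0 3
"""

# Skip the first line and name the columns explicitly
df = pd.read_csv(io.StringIO(text), header=None, sep=r"\s+", skiprows=1,
                 names=["0", "POSITION_T", "PROB", "ID"])
df = df.drop(columns="0")

# Drop the rows that repeat the header text
df = df[df["POSITION_T"] != "POSITION_T"]

# The surviving values are still strings; convert each column to a numeric dtype
df = df.apply(pd.to_numeric)

print(df.dtypes)
```

Without the last step, arithmetic and sorting on these columns would operate on strings rather than numbers.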


2 Comments

Henul Over a year ago

I have a similar problem here stackoverflow.com/q/68705981/16421119

Henul Over a year ago

Would really appreciate your help!

Dimanjan · Accepted Answer · 2022-04-13 03:59:10Z

3

To keep the bottom level column names only:

df.columns = [multicols[-1] for multicols in df.columns]
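This one-liner applies when the extra headers ended up as levels of a MultiIndex on the columns, which is a different situation from header rows repeated inside the data. A small sketch, assuming a hypothetical two-level header (the "track" and "meta" group names are invented for illustration):

```python
import pandas as pd

# Hypothetical two-level header: a top-level group above the real column names
cols = pd.MultiIndex.from_tuples(
    [("track", "POSITION_T"), ("track", "PROB"), ("meta", "ID")]
)
df = pd.DataFrame([[2.385, 2.0, 1], [3.074, 6.0, 3]], columns=cols)

# Each column label is a tuple; keep only its last element (the bottom level)
df.columns = [multicols[-1] for multicols in df.columns]
print(list(df.columns))

# An equivalent built-in, valid while the columns are still a MultiIndex:
# df.columns = df.columns.get_level_values(-1)
```

Using index 0 instead of -1 keeps the top level, which matches the use case mentioned in the comment on this answer.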


1 Comment

truckbot Over a year ago

This worked for me (though in my use case, I was taking only the top level)

piRSquared · Accepted Answer · 2017-09-01 16:18:59Z

1

This is not ideal! The best way to deal with this would be to handle it in the file parsing.

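import pandas as pd

# Flag the rows whose first column repeats the header text
mask = df.iloc[:, 0] == 'POSITION_T'
# Keep the data rows and take the column names from the first header row
d1 = df[~mask]
d1.columns = df.loc[mask.idxmax()].values

# Convert columns to numeric where possible
# (errors='ignore' is deprecated in recent pandas releases)
d1 = d1.apply(pd.to_numeric, errors='ignore')
d1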

   POSITION_T  PROB  ID
1                      
1       2.385   2.0   1
3       3.074   6.0   3
4       6.731   8.0   4
7      12.508   2.0   1
8      12.932   4.0   2
9      12.985   4.0   2
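Handling the problem at parse time, as suggested at the top of this answer, could look roughly like the sketch below. It assumes the repeated header lines are textually identical to the first line of the file; `read_without_repeated_headers` is a hypothetical helper, not a pandas function.

```python
import io
import pandas as pd

def read_without_repeated_headers(handle, **read_csv_kwargs):
    """Read a delimited file, dropping later lines identical to the first line."""
    lines = handle.read().splitlines()
    header = lines[0].strip()
    # Keep the first header, then every non-empty line that is not a copy of it
    kept = [lines[0]] + [ln for ln in lines[1:] if ln.strip() and ln.strip() != header]
    return pd.read_csv(io.StringIO("\n".join(kept)), **read_csv_kwargs)

# Hypothetical whitespace-delimited sample with the header repeated
sample = io.StringIO(
    "POSITION_T PROB ID\n"
    "2.385 2.0 1\n"
    "POSITION_T PROB ID\n"
    "3.074 6.0 3\n"
    "6.731 8.0 4\n"
)
df = read_without_repeated_headers(sample, sep=r"\s+")
print(df.dtypes)
```

Because the header text never reaches the parser as data, pandas infers numeric dtypes directly and no later conversion is needed. For a real file, pass an open file object, for example `open('yourfile.csv')`.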


1 Comment

Henul Over a year ago

Hi @pirsquared, I have a similar issue here. stackoverflow.com/q/68705981/16421119 Would really appreciate your help!

Diptesh Chakraborty (edited by stealthyninja) · Accepted Answer · 2020-03-27 21:34:34Z

0

import pandas as pd

past_data = pd.read_csv("book.csv")

# Drop every row whose LAT value is the repeated header text
past_data = past_data[past_data.LAT.astype(str).str.contains('LAT') == False]

print(past_data)

Replace the CSV file name (here: book.csv) with your own.
Replace the variable name (here: past_data) with your own.
Replace every LAT with the name of any one of your columns.
That's all; the repeated header rows will be removed.
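The same idea can be written without hard-coding a column name. The sketch below is one possible generalisation, assuming a repeated header row matches the column names in every column; `drop_repeated_headers` and the sample data are hypothetical.

```python
import io
import pandas as pd

def drop_repeated_headers(df):
    """Drop rows whose values all equal the column names, then convert to numeric."""
    # Compare each row, as strings, against the column names
    is_header = (df.astype(str) == df.columns.to_numpy()).all(axis=1)
    cleaned = df[~is_header]
    # Columns were read as strings because of the header text; make them numeric
    return cleaned.apply(pd.to_numeric)

# Hypothetical CSV sample with the header repeated
raw = """POSITION_T,PROB,ID
2.385,2.0,1
POSITION_T,PROB,ID
3.074,6.0,3
6.731,8.0,4
POSITION_T,PROB,ID
12.508,2.0,1
"""

clean = drop_repeated_headers(pd.read_csv(io.StringIO(raw)))
print(clean)
```

Requiring a match in every column avoids dropping a genuine data row that happens to contain the header text in a single field.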

edited Mar 27, 2020 at 21:34 by stealthyninja
answered Mar 27, 2020 at 20:26 by Diptesh Chakraborty
