Removing multiple rows of a dataframe with the same column value when bad data is found in another column

Question

I have a pandas dataframe, "tracks", that I'm filtering for erroneous altitude information. When an altitude is below a certain threshold, I want to throw out all rows that share that row's track_key. In the example, N123P, on track_key 4xuut, has an erroneous altitude, so I want to remove ALL rows that start with "4xuut", but NOT the rows below them that have the same callsign.

track_key	callsign	aircraft_type	speed	altitude
4xuut	N123P	C550	300	-1
4xuut	N123P	C550	297	15
4yt06	N123P	C550	305	1022
4yt06	N123P	C550	301	1028
4xx21	N348U	GALX	350	1025

I've tried this: tracks = tracks[tracks.track_key != tracks.loc[tracks['altitude'].astype('float') <= field_elev, 'track_key'].iloc[0]], but it only seems to work on the first match (there can be several), or, if there are no matches, I get an "out-of-bounds" error.

The output would be the same dataframe with the first two rows removed, in this case. There could be a thousand such rows in the real data.
– atc_ceedee, Commented Feb 3, 2022 at 18:51

Park · Accepted Answer · 2022-02-03 18:54:46Z

1

You see the out-of-bounds error because, when no altitude is erroneous, the selection is empty, so there is no value at index 0 to access.
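A minimal sketch of that failure, using a small made-up frame in which no altitude is at or below the threshold (the `field_elev` value and the two-row data are assumptions for illustration only). The boolean mask selects nothing, so `.iloc[0]` has nothing to return and raises `IndexError`:

```python
import pandas as pd

# Hypothetical data: every altitude is above the threshold, so nothing matches.
tracks = pd.DataFrame({
    'track_key': ['4yt06', '4xx21'],
    'altitude': [1022, 1025],
})
field_elev = 0  # assumed threshold, not taken from the question

# Same selection as in the question, minus the trailing .iloc[0]
bad_keys = tracks.loc[tracks['altitude'] <= field_elev, 'track_key']
print(bad_keys.empty)  # True

try:
    bad_keys.iloc[0]  # positional access into an empty Series
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```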

To solve the issue, I used an if condition, as follows:

import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
    'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX'],
    'speed': [300, 297, 305, 301, 350],
    'altitude': [-1, 15, 1022, 1028, 1025],
})
#  track_key callsign aircraft_type  speed  altitude
#0     4xuut    N123P          C550    300        -1
#1     4xuut    N123P          C550    297        15
#2     4yt06    N123P          C550    305      1022
#3     4yt06    N123P          C550    301      1028
#4     4xx21    N348U          GALX    350      1025

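erroneous = -1

# Every track_key that has at least one erroneous altitude (there can be several)
keys_to_delete = tracks[tracks['altitude'] == erroneous]['track_key'].unique()
if len(keys_to_delete) > 0:
    # Drop all rows belonging to any of those keys, not just the first match
    tracks = tracks[~tracks['track_key'].isin(keys_to_delete)]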

print(tracks)
#  track_key callsign aircraft_type  speed  altitude
#2     4yt06    N123P          C550    305      1022
#3     4yt06    N123P          C550    301      1028
#4     4xx21    N348U          GALX    350      1025
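The question mentions that several tracks can be bad and that the real test is a threshold (`field_elev`) rather than an exact value of -1. A hedged sketch of that case follows. The helper name `drop_bad_tracks`, the extra `4zz99` row, and the threshold values are made up for illustration, and it assumes `altitude` is already numeric (the question casts it with `.astype('float')`):

```python
import pandas as pd

def drop_bad_tracks(tracks: pd.DataFrame, field_elev: float) -> pd.DataFrame:
    """Drop every row whose track_key has at least one altitude <= field_elev."""
    # All track_keys with at least one bad altitude (may be an empty array).
    bad_keys = tracks.loc[tracks['altitude'] <= field_elev, 'track_key'].unique()
    # isin() against an empty array is all False, so nothing is dropped then.
    return tracks[~tracks['track_key'].isin(bad_keys)]

# Hypothetical data: two different tracks ('4xuut' and '4zz99') are bad.
tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21', '4zz99'],
    'altitude': [-1, 15, 1022, 1028, 1025, -5],
})

cleaned = drop_bad_tracks(tracks, field_elev=0)
print(cleaned['track_key'].tolist())  # ['4yt06', '4yt06', '4xx21']

# With a threshold nothing falls below, the frame comes back unchanged.
untouched = drop_bad_tracks(tracks, field_elev=-100)
print(len(untouched))  # 6
```

Because `isin` accepts zero, one, or many keys, this version needs neither a length check nor `[0]` indexing.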

answered Feb 3, 2022 at 18:54

Park


4 Comments

atc_ceedee Over a year ago

Sangkuen, thanks for the explanation on the out-of-bounds error. Makes sense, now.

atc_ceedee Over a year ago

This method actually worked a bit better. Thanks, again!

Park Over a year ago

@atc_ceedee happy to help u :)

Park Over a year ago

@atc_ceedee If this helped solve your problem, pls mark this answer as accepted, so that this can help other users know this question is solved by this answer.

Cam · Accepted Answer · 2022-02-03 18:51:12Z

1

Try this.

tracks[tracks.groupby('track_key')['altitude'].transform('min') > 0]

output

    track_key   callsign    aircraft_type   speed   altitude
2   4yt06       N123P       C550            305     1022
3   4yt06       N123P       C550            301     1028
4   4xx21       N348U       GALX            350     1025

Thanks to @bkeesey for this solution.
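To make the mechanics visible, here is a small sketch of what `transform('min')` produces before the comparison is applied. The `field_elev` value is an assumed threshold, and the `groupby(...).filter(...)` variant is an alternative added for comparison, not part of the answer above:

```python
import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
    'altitude': [-1, 15, 1022, 1028, 1025],
})
field_elev = 0  # assumed threshold, for illustration only

# transform('min') broadcasts each track's minimum altitude back onto its rows,
# so the result lines up with the original index and can be used as a mask.
track_min = tracks.groupby('track_key')['altitude'].transform('min')
print(track_min.tolist())  # [-1, -1, 1022, 1022, 1025]

kept = tracks[track_min > field_elev]
print(kept.index.tolist())  # [2, 3, 4]

# Alternative: filter() keeps or drops whole groups based on a per-group test.
kept_via_filter = tracks.groupby('track_key').filter(
    lambda g: g['altitude'].min() > field_elev
)
print(kept_via_filter.index.tolist())  # [2, 3, 4]
```

The `transform` form stays vectorized, while `filter` calls a Python function once per group, so the former is likely the better fit when there are many tracks.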

edited Feb 3, 2022 at 18:51

answered Feb 3, 2022 at 18:45

Cam


2 Comments

atc_ceedee Over a year ago

Cam, that did it! Much simpler, as well. I swapped out the ">0" for "<field_elev", in my case. Some of my aircraft had ADS-B altitudes that were below the field elevation.

Cam Over a year ago

Yes transform is nice, see docs here: pandas.pydata.org/docs/reference/api/…
