
I have a pandas DataFrame, "tracks", that I'm filtering for erroneous altitude information. When any altitude in a track is below a certain threshold, I want to throw out all rows that share that track_key. In the example, N123P, on track_key 4xuut, has an erroneous altitude, so I want to remove ALL rows whose track_key is "4xuut", but NOT the rows below them that have the same call sign.

track_key callsign aircraft_type speed altitude
4xuut N123P C550 300 -1
4xuut N123P C550 297 15
4yt06 N123P C550 305 1022
4yt06 N123P C550 301 1028
4xx21 N348U GALX 350 1025

I've tried this:

tracks = tracks[tracks.track_key != tracks.loc[tracks['altitude'].astype('float') <= field_elev, 'track_key'].iloc[0]]

but it only handles the first match (there can be several), and if there are no matches, I get an "out-of-bounds" error.

  • It helps if you show what you want the output to look like. Commented Feb 3, 2022 at 18:42
  • The output would be the same DataFrame, with the first two rows removed, in this case. There could be a thousand such rows in the real data. Commented Feb 3, 2022 at 18:51
  • Try solution below, that should do it. Commented Feb 3, 2022 at 18:52

2 Answers


You see the out-of-bounds error because, when no row has an erroneous altitude, the filtered selection is empty, so there is no element at position 0 for .iloc[0] to return.
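To make the failure concrete, here is a minimal sketch that reproduces it on a Series with no matching rows. The `field_elev` value of 0 is an assumption chosen only so that nothing matches; the altitudes are taken from the valid rows in the question.

```python
import pandas as pd

# Altitudes from the valid rows in the question; none is at or below the threshold.
altitudes = pd.Series([1022, 1028, 1025])
field_elev = 0  # assumed threshold, for illustration only

# Boolean filtering returns an empty Series when nothing matches.
matches = altitudes[altitudes <= field_elev]
print(matches.empty)

# Positional access into an empty Series raises IndexError,
# which is the "out-of-bounds" error from the question.
error_name = None
try:
    first_match = matches.iloc[0]
except IndexError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

Checking `matches.empty` (or the length of the result) before indexing avoids the exception.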

To solve the issue, I guard the lookup with an if condition, as follows:

import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
    'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX'],
    'speed': [300, 297, 305, 301, 350],
    'altitude': [-1, 15, 1022, 1028, 1025],
})
#  track_key callsign aircraft_type  speed  altitude
#0     4xuut    N123P          C550    300        -1
#1     4xuut    N123P          C550    297        15
#2     4yt06    N123P          C550    305      1022
#3     4yt06    N123P          C550    301      1028
#4     4xx21    N348U          GALX    350      1025

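erroneous = -1  # altitude value treated as erroneous

# Collect the track_key of every row with the erroneous altitude.
key_to_delete = tracks[tracks['altitude'] == erroneous]['track_key'].values
# Only index into the result when at least one row matched (avoids the out-of-bounds error).
if len(key_to_delete) > 0:
    # Note: this drops rows for the first matching key only.
    tracks = tracks[~tracks['track_key'].str.startswith(key_to_delete[0])]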

print(tracks)
#  track_key callsign aircraft_type  speed  altitude
#2     4yt06    N123P          C550    305      1022
#3     4yt06    N123P          C550    301      1028
#4     4xx21    N348U          GALX    350      1025
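Since the question mentions that several tracks can be affected, the same idea can be extended to all matching keys at once with `isin`. This is a sketch rather than the code above: the `field_elev` value of 20 and the extra `5abc1` track are hypothetical, added only to show more than one bad track being dropped.

```python
import pandas as pd

tracks = pd.DataFrame({
    # '5abc1' rows are hypothetical, added to show a second bad track.
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21', '5abc1', '5abc1'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U', 'N999Z', 'N999Z'],
    'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX', 'PA28', 'PA28'],
    'speed': [300, 297, 305, 301, 350, 110, 112],
    'altitude': [-1, 15, 1022, 1028, 1025, 0, 450],
})

field_elev = 20  # assumed threshold, for illustration only

# Every track_key that has at least one altitude at or below the threshold.
bad_keys = tracks.loc[tracks['altitude'].astype(float) <= field_elev, 'track_key'].unique()

# Keep only rows whose track_key is not among the bad keys.
filtered = tracks[~tracks['track_key'].isin(bad_keys)]
print(filtered)

# With no matches, bad_keys is empty and isin() is False everywhere,
# so nothing is removed and no out-of-bounds error can occur.
no_bad_keys = tracks.loc[tracks['altitude'] <= -100, 'track_key'].unique()
unchanged = tracks[~tracks['track_key'].isin(no_bad_keys)]
```

Because `isin` accepts an empty collection, no separate `if` guard is needed for the no-match case.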

4 Comments

Sangkuen, thanks for the explanation on the out-of-bounds error. Makes sense, now.
This method actually worked a bit better. Thanks, again!
@atc_ceedee happy to help u :)
@atc_ceedee If this helped solve your problem, pls mark this answer as accepted, so that this can help other users know this question is solved by this answer.

Try this.

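# Keep only rows whose track has a minimum altitude above 0.
# Selecting 'altitude' before transform avoids computing the minimum of every column.
tracks[tracks.groupby('track_key')['altitude'].transform('min') > 0]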

Output:

    track_key   callsign    aircraft_type   speed   altitude
2   4yt06       N123P       C550            305     1022
3   4yt06       N123P       C550            301     1028
4   4xx21       N348U       GALX            350     1025

Thanks to @bkeesey for this solution.
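For a threshold other than 0, the same pattern can be written with a variable. The sketch below assumes a `field_elev` of 20 (a made-up value) and also shows `groupby(...).filter(...)` as an alternative way to express the same per-track test; both are illustrations on the question's sample data, not a claim about which performs better on real data.

```python
import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
    'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX'],
    'speed': [300, 297, 305, 301, 350],
    'altitude': [-1, 15, 1022, 1028, 1025],
})

field_elev = 20  # assumed threshold, for illustration only

# transform('min') broadcasts each track's minimum altitude back onto its rows,
# so the comparison yields one boolean per original row.
min_alt = tracks.groupby('track_key')['altitude'].transform('min')
kept = tracks[min_alt > field_elev]
print(kept)

# Alternative: filter() keeps or drops whole groups based on a per-group test.
kept_via_filter = tracks.groupby('track_key').filter(
    lambda g: g['altitude'].min() > field_elev
)
```

If no track falls at or below the threshold, the mask is all True and the DataFrame comes back unchanged, so no special case is required.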

2 Comments

Cam, that did it! Much simpler, as well. I swapped out the ">0" for "<field_elev", in my case. Some of my aircraft had ADS-B altitudes that were below the field elevation.
Yes transform is nice, see docs here: pandas.pydata.org/docs/reference/api/…
