What is the fastest way to drop columns in a pandas DataFrame from a list of column names? [duplicate]

Question

I'm trying to figure out the fastest way to drop columns in df using a list of column names. This is a fancy feature reduction technique. This is what I am using now, and it is taking forever. Any suggestions are highly appreciated.

    important2=(important[:-(len(important)-500)]) 
    for i in important:
        if i in important2:
            pass
        else:
            df_reduced.drop(i, axis=1, inplace=True)
    df_reduced.head()

@David - can you give us the context for that test? I just tried to replicate it with 100 columns and 100,000 rows, and drop(), del(), and a list (df = df[my_list]) were all equally performant.
– elPastor, Commented Mar 27, 2024 at 19:35
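A comparison like the one described in this comment could be set up roughly as follows. This is only a sketch: the frame size, the column names, and the four helper functions are made up for illustration, and timings will vary with the pandas version and the data.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical sizes, much smaller than the 100 x 100,000 case in the comment.
n_rows, n_cols, n_keep = 1_000, 50, 10
cols = [f"c{i}" for i in range(n_cols)]
df = pd.DataFrame(np.zeros((n_rows, n_cols)), columns=cols)
keep, to_drop = cols[:n_keep], cols[n_keep:]


def drop_in_loop():
    out = df.copy()
    for name in to_drop:
        out.drop(name, axis=1, inplace=True)  # one call per column
    return out


def drop_once():
    return df.drop(columns=to_drop)  # single call with the full list


def del_in_loop():
    out = df.copy()
    for name in to_drop:
        del out[name]  # del statement, one column at a time
    return out


def select_kept():
    return df[keep]  # keep-list selection instead of dropping


for fn in (drop_in_loop, drop_once, del_in_loop, select_kept):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

All four helpers return a frame with the same remaining columns, so only the elapsed time differs between them.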

ℕʘʘḆḽḘ · Accepted Answer · 2018-06-29 15:05:21Z

19

Use a list containing the columns to be dropped:

    good_bye_list = ['column_1', 'column_2', 'column_3']
    df_reduced.drop(good_bye_list, axis=1, inplace=True)
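Applied to the loop in the question, the same idea might look like the sketch below. It assumes that `important` is a list of column names with more than 500 entries, that the first 500 are the ones to keep, and that every name exists in `df_reduced`. The small frame and the cutoff of 2 are stand-ins for illustration only.

```python
import pandas as pd

# Hypothetical stand-in data: 5 columns, keep the first 2 names in `important`.
df_reduced = pd.DataFrame({name: range(3) for name in ["a", "b", "c", "d", "e"]})
important = ["a", "b", "c", "d", "e"]
n_keep = 2  # the question uses 500

# Everything after the first n_keep names is dropped in one call,
# instead of one drop() per column inside a loop.
to_drop = important[n_keep:]
df_reduced = df_reduced.drop(to_drop, axis=1)

print(list(df_reduced.columns))
```

Building the list first also avoids the repeated `i in important2` membership test from the original loop.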

edited Jun 29, 2018 at 15:05

answered Nov 15, 2016 at 2:51


2 Comments

Lucas H Over a year ago

This is definitely the "best" way to do it; however, any idea why it would take a long time to run? I have a large dataframe (2 million observations, 98 columns), but still... this should be very fast, unless I'm missing something. It took me 1 min+ to delete two columns.

Marc Maxmeister Over a year ago

Why use a list when .drop provides this functionality? df_reduced.drop(columns=['column_1', 'column_2', 'column_3'], inplace=True) is more Pythonic/readable anyway.
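A short sketch of the keyword form from this comment, on a made-up three-column frame. The `errors="ignore"` argument is an addition that is not mentioned in the thread; it skips names that are not present instead of raising `KeyError`.

```python
import pandas as pd

# Hypothetical frame for illustration only.
df_reduced = pd.DataFrame({"column_1": [1], "column_2": [2], "keep_me": [3]})

# columns= replaces the axis=1 argument. 'column_3' does not exist here,
# so errors="ignore" is needed to avoid a KeyError.
df_reduced = df_reduced.drop(
    columns=["column_1", "column_2", "column_3"], errors="ignore"
)

print(list(df_reduced.columns))
```

Assigning the result back, rather than passing `inplace=True`, is a style choice made for this sketch; both forms leave `df_reduced` without the listed columns.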
