Remove Unnamed columns in pandas dataframe [duplicate]

Question

I have a data file with columns A-G, shown below, but when I read it with pd.read_csv('data.csv') I get an extra unnamed column at the end for no apparent reason.

colA    ColB    colC    colD    colE    colF    colG    Unnamed: 7
44      45      26      26      40      26      46        NaN
47      16      38      47      48      22      37        NaN
19      28      36      18      40      18      46        NaN
50      14      12      33      12      44      23        NaN
39      47      16      42      33      48      38        NaN

I have checked my data file several times and there is no extra data in any other column. How should I remove this extra column while reading? Thanks

Your first column is probably the index column; see the related question: stackoverflow.com/questions/36519086/…
– EdChum, Commented May 15, 2017 at 15:43
I just had the same issue. I examined my data file and found that there was an extra separator at the end of the header row (row 0).
– UnadulteratedImagination, Commented Sep 2, 2021 at 17:56
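
The trailing-separator cause described in the comment above can be reproduced with a small in-memory file. This is only a sketch: the CSV text is made up to mirror the first two rows of the question, and io.StringIO stands in for data.csv.

```python
import io

import pandas as pd

# Hypothetical file contents: note the extra comma at the end of the header row.
csv_text = (
    "colA,ColB,colC,colD,colE,colF,colG,\n"
    "44,45,26,26,40,26,46\n"
    "47,16,38,47,48,22,37\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# The empty eighth header field gets an auto-generated name, and the data
# rows have no value for it, so the whole column is NaN.
print(df.columns.tolist())
print(df["Unnamed: 7"].isna().all())
```

If this matches your file, removing the stray separator from the header row fixes the problem at the source.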

MaxU - stand with Ukraine · Accepted Answer · 2022-09-16 09:41:35Z

347

df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

In [162]: df
Out[162]:
   colA  ColB  colC  colD  colE  colF  colG
0    44    45    26    26    40    26    46
1    47    16    38    47    48    22    37
2    19    28    36    18    40    18    46
3    50    14    12    33    12    44    23
4    39    47    16    42    33    48    38
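
Since the question asks about removing the column while reading, the same name test can also be handed to pd.read_csv through its usecols parameter, which accepts a callable that is evaluated against each column name. This is a sketch rather than part of the answer above; the in-memory CSV is a made-up stand-in for data.csv.

```python
import io

import pandas as pd

# Hypothetical file contents with a trailing separator on every row.
csv_text = (
    "colA,ColB,colC,\n"
    "44,45,26,\n"
    "47,16,38,\n"
)

# Keep only the columns whose name does not start with "Unnamed".
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: not name.startswith("Unnamed"),
)

print(df.columns.tolist())
```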

NOTE: very often there is only one unnamed column Unnamed: 0, which is the first column in the CSV file. This is the result of the following steps:

1. A DataFrame is saved into a CSV file using the parameter index=True, which is the default behaviour.
2. This CSV file is read into a DataFrame using pd.read_csv() without explicitly specifying index_col=0 (default: index_col=None).

The easiest way to get rid of this column is to pass index_col=0 to pd.read_csv():

df = pd.read_csv('data.csv', index_col=0)
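
The two steps in the note above can be reproduced end to end. The following is a minimal sketch using an in-memory string instead of a file on disk; the variable names and the two-row frame are illustrative assumptions.

```python
import io

import pandas as pd

original = pd.DataFrame({"colA": [44, 47], "ColB": [45, 16]})

# to_csv() writes the index by default, as a first column with an empty header.
with_index = original.to_csv()

# Read back without index_col: the index column shows up as "Unnamed: 0".
naive = pd.read_csv(io.StringIO(with_index))

# Read back with index_col=0: the first column becomes the index again.
restored = pd.read_csv(io.StringIO(with_index), index_col=0)

# Alternatively, avoid writing the index in the first place.
without_index = original.to_csv(index=False)
clean = pd.read_csv(io.StringIO(without_index))

print(naive.columns.tolist())
print(restored.columns.tolist())
print(clean.columns.tolist())
```

Writing with index=False is usually the simpler choice when the index carries no information, because readers of the file then need no special arguments.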

edited Sep 16, 2022 at 9:41

answered May 15, 2017 at 15:42

MaxU - stand with Ukraine


2 Comments

vcnr_1234 Over a year ago

For some reason, the above answer did not work in my case, but a linked answer resolved it for me by using match instead of contains: df2.loc[:, ~df2.columns.str.match("Unnamed")]

Dave Over a year ago

The index_col=0 comment is very useful - most occurrences of Unnamed columns are for "Unnamed: 0", which is likely to be an index. Although not the case in this question.

Laughing Vergil · Answer · 2019-04-17 18:09:05Z

41

First, find the columns whose names contain 'unnamed' (case-insensitive), then drop them. Note: pass inplace=True to .drop() as well, so that df itself is modified.

df.drop(df.columns[df.columns.str.contains('unnamed', case=False)], axis=1, inplace=True)

edited Apr 17, 2019 at 18:09

Laughing Vergil


answered Mar 29, 2018 at 11:19

Adil Warsi


Gaurang Tandon · Answer · 2019-05-10 05:42:32Z

15

The pandas.DataFrame.dropna function removes missing values (e.g. NaN, NaT).

For example, the following code removes every column of your dataframe in which all of the elements are missing. Note that dropna returns a new dataframe, so the result has to be assigned back:

df = df.dropna(how='all', axis='columns')
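
One caveat is worth illustrating: dropna selects columns by content, not by name, so it also removes a named column that happens to be empty. A minimal sketch with made-up data, in which colG is deliberately left empty:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [44, 47],
        "colG": [np.nan, np.nan],        # a named column that happens to be empty
        "Unnamed: 7": [np.nan, np.nan],  # the unwanted column
    }
)

# Content-based: drops every all-NaN column, including the named colG.
by_content = df.dropna(how="all", axis="columns")

# Name-based: drops only columns whose name starts with "Unnamed".
by_name = df.loc[:, ~df.columns.str.contains("^Unnamed")]

print(by_content.columns.tolist())
print(by_name.columns.tolist())
```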

edited May 10, 2019 at 5:42

Gaurang Tandon


answered Oct 8, 2018 at 6:40

Susan


1 Comment

sɐunıɔןɐqɐp Over a year ago

From Review: Welcome to Stack Overflow! Try to provide a nice description about how your solution works. See: How do I write a good answer?. Thanks

Ezarate11 · Answer · 2018-11-06 12:56:15Z

8

The accepted solution didn't work in my case, so I used the following instead:

# The column name in this example is "Unnamed: 7",
# but this works with any other name ("Unnamed: 0", for example).
df.rename({"Unnamed: 7": "a"}, axis="columns", inplace=True)

# Then, drop the column as usual.
df.drop(["a"], axis=1, inplace=True)

Hope it helps others.
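
When the exact column name is known, the intermediate rename is not strictly needed, since drop can be given the name directly. A minimal sketch with made-up data; errors="ignore" is an optional extra that turns the call into a no-op if the column is absent:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"colA": [44, 47], "Unnamed: 7": [np.nan, np.nan]})

# Drop the column by its exact name; nothing happens if it does not exist.
df = df.drop(columns=["Unnamed: 7"], errors="ignore")

print(df.columns.tolist())
```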

answered Nov 6, 2018 at 12:56

Ezarate11

