I have read a CSV file into a pandas dataframe and want to do some simple manipulations on it. I cannot figure out how to create a new dataframe from selected columns of my original dataframe. My attempt:

names = ['A','B','C','D']
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset['A','D']

I would like to create a new dataframe with the columns A and D from the original dataframe.

Pass a list of the columns of interest to sub-select: new_dataset = dataset[['A','D']]. Note that if you intend to operate on a copy, call copy(): new_dataset = dataset[['A','D']].copy(). Commented Jul 11, 2017 at 13:28

2 Answers

This is called subsetting: pass a list of column names inside []:

dataset = pandas.read_csv('file.csv', names=names)

new_dataset = dataset[['A','D']]

which is the same as:

new_dataset = dataset.loc[:, ['A','D']]

If you only need the selected columns, pass the usecols parameter to read_csv so that only those columns are loaded:

new_dataset = pandas.read_csv('file.csv', names=names, usecols=['A','D'])
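To check that the three approaches above give the same result, here is a minimal sketch. It assumes a small in-memory CSV (the csv_text string is made up for illustration and stands in for file.csv) with four columns and no header row:

```python
import io
import pandas

# Hypothetical stand-in for 'file.csv': four columns, no header row.
csv_text = "1,2,3,4\n5,6,7,8\n"
names = ['A', 'B', 'C', 'D']

dataset = pandas.read_csv(io.StringIO(csv_text), names=names)

# Select columns with a list inside [] ...
by_list = dataset[['A', 'D']]
# ... or with .loc, taking all rows and the listed columns ...
by_loc = dataset.loc[:, ['A', 'D']]
# ... or read only those columns in the first place.
by_usecols = pandas.read_csv(io.StringIO(csv_text), names=names, usecols=['A', 'D'])

print(by_list.equals(by_loc))
print(by_list.equals(by_usecols))
```

The usecols variant never loads B and C at all, which can matter for large files.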

EDIT:

If you use only:

new_dataset = dataset[['A','D']]

and then modify the result, you may get this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

If you later modify values in new_dataset, the changes do not propagate back to the original data (dataset), and pandas emits the warning above.

As EdChum pointed out, add copy() to avoid the warning:

new_dataset = dataset[['A','D']].copy()
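A minimal sketch of what the explicit copy gives you, using a small made-up dataframe in place of the one read from file.csv: edits to new_dataset leave dataset untouched.

```python
import pandas

# Made-up data standing in for the dataframe read from 'file.csv'.
dataset = pandas.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

# Explicit copy: new_dataset is an independent dataframe.
new_dataset = dataset[['A', 'D']].copy()

# Modify the copy in two common ways.
new_dataset['A'] = new_dataset['A'] * 10
new_dataset.loc[0, 'D'] = -1

print(new_dataset)
print(dataset)  # the original values are unchanged
```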

You must pass a list of column names to select multiple columns. Otherwise the key is interpreted as a tuple, that is, as a single MultiIndex label; df['A','D'] would work only if df.columns were a MultiIndex.
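To illustrate the difference, here is a small sketch with made-up data: the tuple-style key succeeds on MultiIndex columns and raises KeyError on ordinary flat columns.

```python
import pandas

# Made-up dataframe whose columns form a two-level MultiIndex.
columns = pandas.MultiIndex.from_tuples([('A', 'D'), ('A', 'E'), ('B', 'D')])
df = pandas.DataFrame([[1, 2, 3], [4, 5, 6]], columns=columns)

# df['A', 'D'] is the same as df[('A', 'D')]: one column, returned as a Series.
single = df['A', 'D']
print(single.tolist())

# With flat column names the same key is looked up as one label and fails.
flat = pandas.DataFrame({'A': [1], 'D': [2]})
raised = False
try:
    flat['A', 'D']
except KeyError:
    raised = True
print(raised)
```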

The most obvious way is df.loc[:, ['A', 'D']] but there are other ways (note how all of them take lists):

df1 = df.filter(items=['A', 'D'])

df1 = df.reindex(columns=['A', 'D'])

df1 = df.get(['A', 'D']).copy()

N.B. items is the first positional argument, so df.filter(['A', 'D']) also works.

Note that filter() and reindex() return a copy as well, so you don't need to worry about getting SettingWithCopyWarning later.
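As a rough comparison, the sketch below (made-up data; 'Z' is a deliberately missing column name) shows that the three alternatives agree with plain list selection when every column exists, and that filter() and reindex() differ when one does not:

```python
import pandas

# Made-up data for illustration.
df = pandas.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

expected = df[['A', 'D']]
via_filter = df.filter(items=['A', 'D'])
via_reindex = df.reindex(columns=['A', 'D'])
via_get = df.get(['A', 'D']).copy()

print(via_filter.equals(expected))
print(via_reindex.equals(expected))
print(via_get.equals(expected))

# With a missing name: filter() silently skips it,
# while reindex() adds the column filled with NaN.
filtered = df.filter(items=['A', 'Z'])
reindexed = df.reindex(columns=['A', 'Z'])
print(list(filtered.columns))
print(list(reindexed.columns))
```

By contrast, df[['A', 'Z']] raises KeyError, so the choice depends on whether a missing column should be an error.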
