I have some data in an S3 bucket that I want to work with.
I've imported it using:
import boto3
import dask.dataframe as dd

def import_df(key):
    s3 = boto3.client('s3')
    df = dd.read_csv('s3://.../' + key, encoding='latin1')
    return df

key = 'Churn/CLEANED_data/file.csv'
train = import_df(key)
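(For completeness, my understanding is that dd.read_csv handles the s3:// path by itself through s3fs, so the boto3 client in import_df isn't actually involved in the read. This is roughly the minimal version of what I'm doing; 'my-bucket' is just a placeholder name, and it assumes s3fs is installed and AWS credentials are configured in the environment:)

import dask.dataframe as dd

# Minimal read, assuming s3fs is installed and AWS credentials are available;
# 'my-bucket' is a placeholder for the real bucket name.
def import_df_minimal(key):
    df = dd.read_csv('s3://my-bucket/' + key, encoding='latin1')
    return df

train = import_df_minimal('Churn/CLEANED_data/file.csv')
print(type(train))  # expect a dask.dataframe.core.DataFrame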
I can see that the data has been imported correctly using:
train.head()
but when I try a simple operation (taken from the dask docs):
train_churn = train[train['CON_CHURN_DECLARATION'] == 1]
train_churn.compute()
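As far as I understand, this is just a standard boolean-mask filter, roughly the pandas equivalent of the following (a small sketch with made-up sample data; the CUSTOMER_ID column is only for illustration):

import pandas as pd

# Same boolean-mask filter on a plain pandas DataFrame (made-up sample data),
# which is what I expect the dask expression above to compute per partition.
sample = pd.DataFrame({'CON_CHURN_DECLARATION': [0, 1, 1, 0],
                       'CUSTOMER_ID': [10, 11, 12, 13]})
churned = sample[sample['CON_CHURN_DECLARATION'] == 1]
print(churned)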
I get this error:
AttributeError                            Traceback (most recent call last)
<ipython-input-...> in <module>()
      1 train_churn = train[train['CON_CHURN_DECLARATION'] == 1]
----> 2 train_churn.compute()

~/anaconda3/envs/python3/lib/python3.6/site-packages/dask/base.py in compute(self, **kwargs)
    152         dask.base.compute
    153         """
--> 154         (result,) = compute(self, traverse=False, **kwargs)
    155         return result
    156

AttributeError: 'DataFrame' object has no attribute '_getitem_array'
Full error here: Error Upload