I have a MinIO server hosted locally. I need to read a file from a MinIO S3 bucket with pandas, using an S3 URL like "s3://dataset/wine-quality.csv", in a Jupyter notebook.

I tried the boto3 library, and with it I am able to download the file:

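import boto3

# endpoint_url must include the scheme (http:// or https://)
s3 = boto3.resource('s3',
                endpoint_url='http://localhost:9000',
                aws_access_key_id='id',
                aws_secret_access_key='password')
# Download the object from the 'dataset' bucket to a local path
s3.Bucket('dataset').download_file('wine-quality.csv', '/tmp/wine-quality.csv')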

But when I try using pandas,

import pandas as pd

data = pd.read_csv("s3://dataset/wine-quality.csv")

I get a client error: 403 Forbidden. As far as I know, pandas uses the boto3 library internally (correct me if I'm wrong).

PS: pandas read_csv has one more parameter, storage_options={ "key": AWS_ACCESS_KEY_ID, "secret": AWS_SECRET_ACCESS_KEY, "token": AWS_SESSION_TOKEN, }, but I couldn't find any setting for passing a custom MinIO host URL for pandas to read from.

1 Answer

From pandas v1.2 onwards, you can pass storage options, which are handed down to fsspec; see the docs here: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html?highlight=s3fs#reading-writing-remote-files.

To pass in a custom URL, specify it through client_kwargs in storage_options:

import pandas as pd

df = pd.read_csv(
    "s3://dataset/wine-quality.csv",
    storage_options={
        "key": AWS_ACCESS_KEY_ID,
        "secret": AWS_SECRET_ACCESS_KEY,
        "token": AWS_SESSION_TOKEN,
        # The endpoint must include the scheme (http:// or https://)
        "client_kwargs": {"endpoint_url": "http://localhost:9000"}
    }
)
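
Since the endpoint has to be a full URL with a scheme, it can help to build the storage_options dict in one place. The following is only a minimal sketch: minio_storage_options is a hypothetical helper name (not part of pandas or s3fs), and the host and credentials are the placeholder values from the question.

```python
from urllib.parse import urlparse


def minio_storage_options(endpoint, key, secret, token=None, secure=False):
    """Hypothetical helper: build a storage_options dict for pandas/s3fs."""
    # Add a scheme when the caller passed a bare "host:port".
    if "://" not in endpoint:
        endpoint = ("https://" if secure else "http://") + endpoint
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Invalid endpoint: {endpoint!r}")
    options = {
        "key": key,
        "secret": secret,
        # Drop any trailing slash so the URL is in a canonical form.
        "client_kwargs": {"endpoint_url": endpoint.rstrip("/")},
    }
    # Only include a session token when one was actually supplied.
    if token is not None:
        options["token"] = token
    return options


# Placeholder credentials taken from the question; replace with real ones.
opts = minio_storage_options("localhost:9000", "id", "password")
print(opts["client_kwargs"]["endpoint_url"])  # prints http://localhost:9000
```

The resulting dict could then be passed as pd.read_csv("s3://dataset/wine-quality.csv", storage_options=opts), assuming s3fs is installed and the MinIO server is reachable at that address.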

3 Comments

This requires the s3fs library to be installed.
Yep! It can be installed with pip install s3fs
If you get the error ValueError: Invalid endpoint: localhost:9000/, try setting endpoint_url to http://localhost:9000
