I want to load data into an Amazon Redshift cluster using a boto3 Python script.

I want to write a Python script using boto3 that does the following:

  1. Create a cluster
  2. Load data into the cluster
  3. Create a report on the performance of the cluster

As far as I can see, boto3 has no methods for loading data into the cluster, for example from a flat file or from S3.

How can I load the data into the cluster using boto3 or any other Python package?

1 Answer

1. Create an Amazon Redshift Cluster

Call the create_cluster() method of the boto3 Redshift client.
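As a rough sketch of that call, the helpers below build the arguments for create_cluster(), wait for the cluster to become available, and return its endpoint. The function names, the "dc2.large" node type, and the single-node default are illustrative assumptions, not part of the original answer. The client is passed in so the code can be exercised without AWS access.

```python
def build_cluster_params(cluster_id, db_name, master_user, master_password,
                         node_type="dc2.large", number_of_nodes=1):
    """Build keyword arguments for the Redshift client's create_cluster().

    The node type default is only a placeholder; pick one that is
    offered in your region.
    """
    params = {
        "ClusterIdentifier": cluster_id,
        "DBName": db_name,
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
        "NodeType": node_type,
    }
    if number_of_nodes > 1:
        params["ClusterType"] = "multi-node"
        params["NumberOfNodes"] = number_of_nodes
    else:
        # NumberOfNodes only applies to multi-node clusters.
        params["ClusterType"] = "single-node"
    return params


def create_and_wait(client, params):
    """Create the cluster, block until it is available, return (host, port)."""
    cluster_id = params["ClusterIdentifier"]
    client.create_cluster(**params)
    # boto3 ships a "cluster_available" waiter for the Redshift client.
    client.get_waiter("cluster_available").wait(ClusterIdentifier=cluster_id)
    cluster = client.describe_clusters(ClusterIdentifier=cluster_id)["Clusters"][0]
    return cluster["Endpoint"]["Address"], cluster["Endpoint"]["Port"]
```

In a real script the client would come from boto3.client("redshift"), and the returned host and port are what the database connection in the next step needs.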

2. Load data into the cluster

Amazon Redshift is based on PostgreSQL 8.0.2 and accepts connections like a normal PostgreSQL database. To run commands on the database itself (including the COPY command), connect to it with a SQL client, typically over JDBC/ODBC or with a PostgreSQL driver.

See: Connecting to an Amazon Redshift Cluster Using SQL Client Tools - Amazon Redshift

A common method is to use psycopg2:

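import psycopg2

# Connect to the cluster endpoint as you would to any PostgreSQL database
conn = psycopg2.connect(...)
cur = conn.cursor()
cur.execute("COPY...")  # COPY loads the data into a table
conn.commit()  # commit so the load is persisted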

See: Copying data from S3 to AWS redshift using python and psycopg2
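To make the COPY step more concrete, here is a minimal sketch that builds a COPY statement for CSV files in S3 and runs it on an open connection. The helper names are hypothetical, and any table name, S3 path, or IAM role ARN passed in is a placeholder you supply; the assumption is that the cluster has an IAM role attached that can read the bucket.

```python
def build_copy_sql(table, s3_uri, iam_role_arn, ignore_header=True):
    """Return a Redshift COPY statement that loads CSV files from S3.

    The values are interpolated directly into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    sql = (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV"
    )
    if ignore_header:
        # Skip the first line of each file (the column header row).
        sql += " IGNOREHEADER 1"
    return sql + ";"


def load_from_s3(conn, table, s3_uri, iam_role_arn):
    """Run COPY on an open DB-API connection, e.g. one from psycopg2.connect()."""
    cur = conn.cursor()
    try:
        cur.execute(build_copy_sql(table, s3_uri, iam_role_arn))
        conn.commit()
    finally:
        cur.close()
```

Loading a flat file from local disk usually means uploading it to S3 first (for example with the boto3 S3 client's upload_file()) and then running COPY against that S3 path.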

3. Create a report on the performance of the cluster

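There are two sources of information for performance reporting: Amazon CloudWatch metrics (such as CPU utilization, latency, and throughput) and the query/load performance data recorded by the cluster itself.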

See: Monitoring Amazon Redshift Cluster Performance - Amazon Redshift
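One of the sources covered on that page is Amazon CloudWatch, which boto3 can query directly. The sketch below fetches the CPUUtilization metric for a cluster and reduces it to a small summary. The function names, the 24-hour window, and the choice of metric are assumptions made for illustration; the client is passed in and would normally be boto3.client("cloudwatch").

```python
from datetime import datetime, timedelta, timezone


def fetch_cpu_datapoints(cloudwatch, cluster_id, hours=24, period=3600):
    """Fetch CPUUtilization datapoints for one cluster from CloudWatch."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Redshift",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "ClusterIdentifier", "Value": cluster_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period,  # seconds per datapoint
        Statistics=["Average", "Maximum"],
    )
    return response["Datapoints"]


def summarize(datapoints):
    """Reduce datapoints to a small report; returns None if there are none."""
    if not datapoints:
        return None
    averages = [dp["Average"] for dp in datapoints]
    return {
        "samples": len(datapoints),
        "mean_cpu": sum(averages) / len(averages),
        "peak_cpu": max(dp["Maximum"] for dp in datapoints),
    }
```

Query-level figures are not in CloudWatch; for those, you would run SQL against the cluster over the same database connection used for the load.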

1 Comment

Excellent. Great answer and much appreciated. Thanks again.
