
I cannot find information on how to do what the title of this question describes. Say I have several SQL databases from different departments in the organization and I want to migrate all of them to our data lake bucket. I want to use AWS DMS to connect to databases A, B, and C and run full load plus CDC into the data lake bucket (i.e. an S3 target).

Most of the time, each database has all of its tables under the public schema. So in S3, how can I identify which files come from source database A, source database B, and source database C? Is it possible to include the task identifier as metadata in the data sent from source to target?

The docs mention: "Multiple tasks that replicate data from the same source table to the same target S3 endpoint bucket result in those tasks writing to the same file. We recommend that you specify different target endpoints (buckets) if your data source is from the same table." This is not my case, however, since I am not replicating the same source to the same target, but rather multiple sources to the same target.

2 Answers

0

When the DMS task's target endpoint is S3, you can specify a bucket_folder and a bucket_name, as well as a cdc_path for ongoing replication. Simply create a separate S3 target endpoint for each source database and name the bucket properties after that source database.

The three properties above are static and can't be changed dynamically based on metadata or the like.

Here is the related documentation: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html
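As a rough illustration of the one-endpoint-per-source idea, here is a minimal Python sketch that builds the parameters for one S3 target endpoint per source database. The database names, the bucket name `my-data-lake`, and the role ARN are placeholders I made up; the keys follow the `S3Settings` structure from the linked API reference. The sketch only builds dictionaries and does not call AWS.

```python
def s3_target_endpoint_params(db_name, bucket_name, role_arn):
    """Build parameters for one S3 target endpoint dedicated to one source database."""
    return {
        "EndpointIdentifier": f"s3-target-{db_name}",
        "EndpointType": "target",
        "EngineName": "s3",
        "S3Settings": {
            "BucketName": bucket_name,
            # Static folder per source database, so files stay separated in S3.
            "BucketFolder": db_name,
            "ServiceAccessRoleArn": role_arn,
        },
    }


# Hypothetical values, for illustration only.
sources = ["database-a", "database-b", "database-c"]
bucket = "my-data-lake"
role = "arn:aws:iam::123456789012:role/dms-s3-access"

endpoints = [s3_target_endpoint_params(name, bucket, role) for name in sources]

for params in endpoints:
    print(params["EndpointIdentifier"], "->", params["S3Settings"]["BucketFolder"])
```

Assuming boto3 is installed and credentials are configured, each dictionary could then be passed to `boto3.client("dms").create_endpoint(**params)`.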


0

I was able to resolve the issue. You can use transformation rules in the DMS task by applying an add-prefix rule with the target set to schema and the database name specified as the value. Here's the JSON configuration:

{
    "rule-type": "transformation",
    "rule-id": "1",
    "rule-name": "Add Prefix to Schema",
    "rule-action": "add-prefix",
    "rule-target": "schema",
    "object-locator": {
        "schema-name": "dbo"
    },
    "value": "db1"
}

This should result in the S3 path being created in the format below.

<bucket-name>/<database-name>/<schema-name>/<table-name>/<file-name>
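If you have several source databases, the same rule can be generated per task instead of being written by hand. The following is a minimal Python sketch, assuming each task needs a selection rule that includes every table in the schema plus the add-prefix transformation shown above. The function name, the rule names, and the prefix values `db1`, `db2`, `db3` are hypothetical, and `public` is used as the schema name because that is what the question describes.

```python
import json


def table_mappings_with_prefix(schema_name, db_prefix):
    """Return a table-mappings JSON string that prefixes the schema with db_prefix."""
    rules = [
        {
            # Selection rule: include every table in the given schema.
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all-tables",
            "object-locator": {"schema-name": schema_name, "table-name": "%"},
            "rule-action": "include",
        },
        {
            # Transformation rule from the answer: add a prefix to the schema name.
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "add-prefix-to-schema",
            "rule-action": "add-prefix",
            "rule-target": "schema",
            "object-locator": {"schema-name": schema_name},
            "value": db_prefix,
        },
    ]
    return json.dumps({"rules": rules}, indent=2)


# One mapping document per source database (hypothetical prefixes).
mappings = {prefix: table_mappings_with_prefix("public", prefix)
            for prefix in ["db1", "db2", "db3"]}

print(mappings["db1"])
```

Each resulting string is meant to be supplied as the table mappings of the corresponding DMS task.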
