AWS DMS: Same S3 bucket as target for multiple SQL sources

Question

I cannot find information on how to achieve what the title describes. Say I have several SQL databases from different departments in the org, and I want to migrate all of them to our data lake bucket. I want to use AWS DMS to connect to databases A, B, and C and run full load plus CDC into the data lake bucket (i.e. an S3 target).

Most of the time, each database has all of its tables under the public schema. So, in S3, how can I identify which files come from source database A, B, or C? Is it possible to include the task identifier as metadata on the data sent from source to target?

The docs mention: "Multiple tasks that replicate data from the same source table to the same target S3 endpoint bucket result in those tasks writing to the same file. We recommend that you specify different target endpoints (buckets) if your data source is from the same table." This is not my case, though: I am not replicating the same source to the same target, but rather multiple sources to the same target.

Ross Bush · Accepted Answer · 2024-11-08 18:46:19Z


When the DMS task's target endpoint is S3, you can specify a bucket_name, a bucket_folder, and a cdc_path for ongoing replication. Create a separate S3 target endpoint for each source database and set the bucket properties based on that source database.

The three properties above are static and can't be changed dynamically based on metadata or the like.

Related documentation: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html
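As a rough illustration of the one-endpoint-per-source idea, the sketch below builds the arguments for the boto3 DMS `create_endpoint` call, using the source database name as the bucket folder. The source names, bucket name, role ARN, and the `s3-target-` identifier prefix are hypothetical placeholders, and `cdc_path` is left out because it depends on how the task handles ongoing replication.

```python
from typing import Any


def s3_target_endpoint_params(source_name: str, bucket: str, role_arn: str) -> dict[str, Any]:
    """Build keyword arguments for DMS create_endpoint for one source database.

    The bucket is shared; only the folder differs per source.
    """
    return {
        # Hypothetical naming convention for the endpoint identifier.
        "EndpointIdentifier": f"s3-target-{source_name}",
        "EndpointType": "target",
        "EngineName": "s3",
        "S3Settings": {
            "ServiceAccessRoleArn": role_arn,
            "BucketName": bucket,
            # Static per endpoint: files from this source land under this folder.
            "BucketFolder": source_name,
        },
    }


def create_target_endpoints(sources: list[str], bucket: str, role_arn: str) -> None:
    """Create one S3 target endpoint per source (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    dms = boto3.client("dms")
    for name in sources:
        dms.create_endpoint(**s3_target_endpoint_params(name, bucket, role_arn))


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    for name in ("db-a", "db-b", "db-c"):
        print(s3_target_endpoint_params(
            name, "my-data-lake", "arn:aws:iam::123456789012:role/example-dms-s3-role"
        ))
```

Each task would then pair its own source endpoint with the matching target endpoint, so the folder in the shared bucket identifies the source database.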

edited Nov 8, 2024 at 18:46

answered Nov 8, 2024 at 14:16


Harsha Mathan · Accepted Answer · 2024-12-26 03:01:36Z


I was able to resolve the issue. You can use transformation rules in the DMS task by applying an add-prefix rule with the target set to schema and the database name specified as the value. Here's the JSON configuration:

{
    "rule-type": "transformation",
    "rule-id": "1",
    "rule-name": "Add Prefix to Schema",
    "rule-action": "add-prefix",
    "rule-target": "schema",
    "object-locator": {
        "schema-name": "dbo"
    },
    "value": "db1"
}

This should result in S3 paths of the following format:

<bucket-name>/<database-name>/<schema-name>/<table-name>/<file-name>
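A transformation rule on its own is not a complete table mapping: it sits inside a `rules` array, normally next to a selection rule. The sketch below generates one such document per source database. It assumes the `public` schema from the question, and the function name, rule names, and the underscore separator are illustrative choices. As far as I understand, add-prefix prepends the value directly to the schema name, so it is worth checking the resulting folder name on a test task.

```python
import json


def table_mappings(database_name: str, schema: str = "public", separator: str = "_") -> str:
    """Return a DMS table-mapping JSON string for one source database."""
    rules = [
        {
            # Select every table in the schema.
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all-tables",
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        },
        {
            # Prefix the schema name so the target path identifies the source.
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "add-prefix-to-schema",
            "rule-action": "add-prefix",
            "rule-target": "schema",
            "object-locator": {"schema-name": schema},
            "value": f"{database_name}{separator}",
        },
    ]
    return json.dumps({"rules": rules}, indent=2)


if __name__ == "__main__":
    # Placeholder database names, for illustration only.
    for db in ("db1", "db2", "db3"):
        print(table_mappings(db))
```

The returned string is what would be supplied as the task's table mappings, with one task per source database all pointing at the same S3 target endpoint.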

edited Dec 26, 2024 at 3:01

answered Dec 25, 2024 at 16:33

