To deploy DLT tables I am using YAML files that define a Delta Live Tables pipeline. Here is an example configuration:
```yaml
resources:
  pipelines:
    bronze:
      name: ${var.stage_name}_bronze
      clusters:
        - label: default
          autoscale: ${var.default_dlt_cluster.autoscale}
          spark_conf: ${var.default_dlt_cluster.spark_conf}
      libraries:
        - notebook:
            path: ${workspace.file_path}/bronze
      target: ${var.schema_suffix}bronze
      development: false
      catalog: ${var.default_catalog}
```
Given the API documentation and the Databricks docs, I can't find a clear way to declare external PyPI dependencies directly in the pipeline definition. The suggested approach seems to be adding dependencies with %pip install xyz at the top of the notebook, but that feels suboptimal for managing requirements, especially for ensuring consistent versions and reproducibility across environments. Am I missing a better way to manage external dependencies for DLT pipelines? If so, what is the recommended approach?
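For concreteness, the workaround I'm describing looks roughly like this as the first cell of the bronze notebook; the package name, version, and requirements file path are placeholders, not something prescribed by the docs:

```python
# First cell of the bronze DLT notebook: install dependencies before any imports.
# "xyz==1.2.3" is a placeholder; pinning an exact version is the only way to keep
# environments consistent with this approach.
%pip install xyz==1.2.3

# Alternatively, install from a pinned requirements file kept alongside the code
# (path below is hypothetical):
# %pip install -r /Workspace/path/to/requirements.txt
```

This works, but every notebook in the pipeline has to repeat (and stay in sync with) the same install cell, which is what makes it feel hard to manage.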