I have a set of 35 models in my staging layer, each corresponding to a different object type, which I union in my intermediate layer. For the sake of this post I have replaced the name of the data source I am using with xxx. To perform the union, I use the following Jinja for loop, which calls the ref function for each of those models.
Relevant SQL code
{% set object_types = dbt_utils.get_column_values(
    table=ref("ref_include_object_types"),
    column="object_type"
) %}
WITH base AS (
    {%- for object_type in object_types %}
    SELECT
        '{{ object_type }}' AS object_type,
        {{ object_type }} AS object_value
    FROM {{ ref(object_type) }}
    {%- if not loop.last %}
    UNION
    {%- endif %}
    {% endfor %}
)
However, the code above fails with a compilation error, and dbt suggests adding a -- depends_on: {{ ref(object_type_name) }} hint for every object type:
Compilation Error:
Compilation Error in model int_xxx__combine_object_instances (models\intermediate\xxx\int_xxx__combine_object_instances.sql)
dbt was unable to infer all dependencies for the model "int_xxx__combine_object_instances".
To fix this, add the following hint to the top of the model "int_xxx__combine_object_instances":
-- depends_on: {{ ref('stg_xxx__object_type_name') }}
Considered solutions:
Solution 1
I can obviously create a list of -- depends_on: statements dynamically to deal with this issue, but I would rather avoid this, since it would make the code harder to maintain and read.
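For reference, the hint-based approach would look roughly like the sketch below. The two model names are hypothetical placeholders that follow the stg_xxx__ naming from the error message; the real model would need one such line for each of the 35 staging models.

```sql
-- Hypothetical model names, shown only to illustrate the shape of the hints.
-- Each hint is a plain SQL comment, but dbt still evaluates the ref() inside it
-- while parsing, which is what registers the dependency.
-- depends_on: {{ ref('stg_xxx__object_type_a') }}
-- depends_on: {{ ref('stg_xxx__object_type_b') }}

-- The rest of the model (the object_types lookup and the UNION loop) stays unchanged.
```

This illustrates the maintenance concern: every model added to ref_include_object_types would also need a matching hint line here.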
Solution 2
Another solution is suggested in a comment on this issue:
{% for model in object_types %}
    {% set depends_on = "--depends_on: {{ ref( '" ~ model ~ "' ) }}" %}
    {{ depends_on }}
{% endfor %}
This solution did not work for me, but I am unsure why: I get the exact same compilation error as before. The comment has both upvotes and downvotes on GitHub, but no one has explained why it is a good or bad solution, or why it might not work. My assumption is that it fails because the list of dependencies is itself produced by a Jinja macro and is therefore not available when dbt parses the model to infer dependencies.
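One way to test that assumption is to inspect what the snippet actually renders to. The sketch below is a diagnostic model rather than a fix, and it rests on two assumptions: that Jinja does not re-render the contents of a string literal, and that dbt_utils.get_column_values returns its default (an empty list) while dbt is parsing the project, i.e. while the execute variable is false. The log call and the placeholder SELECT are additions for inspection only.

```sql
{% set object_types = dbt_utils.get_column_values(
    table=ref("ref_include_object_types"),
    column="object_type"
) %}

{# Assumption: during parsing (execute is false) no query is run, so this is
   expected to log an empty list and the loop below is skipped entirely. #}
{{ log("execute=" ~ execute ~ ", object_types=" ~ object_types, info=true) }}

{% for model in object_types %}
    {# The braces sit inside a string literal, so Jinja treats them as plain
       text: ref() is never called and no dependency is registered. #}
    {% set depends_on = "--depends_on: {{ ref( '" ~ model ~ "' ) }}" %}
    {{ depends_on }}
{% endfor %}

-- Placeholder query so the compiled file is valid SQL.
SELECT 1 AS placeholder
```

If both assumptions hold, the compiled SQL should contain the text {{ ref( '...' ) }} verbatim inside the comments, which would explain why the error is unchanged: the hints exist only as literal text and only after parsing has already finished.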
Questions
- Can someone help me understand why solution 2 would not work?
- Are there solutions to this issue other than the two solutions I have suggested above?
- Would it be possible to somehow set a dependency on a group of models? E.g. set a dependency on a schema so that all transformations within the staging schema are finished before my intermediate layer transformation starts?
{% set object_types = ['dummy_1', 'dummy_2'] %}. I then used your code, only replacing the list of columns with *, and it worked without a problem. Is there another part of the code that causes the issue? If so, could you update your question with it?
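The variant described in this comment can be sketched as follows. It assumes that models named dummy_1 and dummy_2 exist in the project and share the same columns, so that SELECT * is valid across the UNION; the final SELECT is added only to make the statement complete.

```sql
{# Hard-coded list: the values are known at parse time, so every ref() call
   in the loop is visible to dbt when it infers dependencies. #}
{% set object_types = ['dummy_1', 'dummy_2'] %}

WITH base AS (
    {%- for object_type in object_types %}
    SELECT
        '{{ object_type }}' AS object_type,
        *
    FROM {{ ref(object_type) }}
    {%- if not loop.last %}
    UNION
    {%- endif %}
    {% endfor %}
)

SELECT * FROM base
```

The only difference from the model in the question is where object_types comes from, which points at the dbt_utils.get_column_values lookup rather than the loop itself as the source of the error.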