DBT - Forcing dependencies dynamically using dbt_utils.get_column_values()

Question

I have a set of 35 staging models, one per object type, which I union in my intermediate layer. For the sake of this post I have replaced the name of the data source with xxx. To perform the union, I use the following Jinja for loop, which calls the ref function for each of those models.

Relevant SQL code

{% set object_types = dbt_utils.get_column_values(
  table=ref("ref_include_object_types"),
  column="object_type"
) %}

WITH base AS (
  {%- for object_type in object_types %}
    SELECT
      '{{ object_type }}' AS object_type,
      {{ object_type }}   AS object_value
    FROM {{ ref(object_type) }}
    {%- if not loop.last %}
      UNION
    {%- endif %}
  {% endfor %}
)

However, the code above results in a Compilation Error and suggests that I add -- depends_on: {{ ref(object_type_name) }} statements for all object types:

Compilation Error:

Compilation Error in model int_xxx__combine_object_instances (models\intermediate\xxx\int_xxx__combine_object_instances.sql)

dbt was unable to infer all dependencies for the model "int_xxx__combine_object_instances".

To fix this, add the following hint to the top of the model "int_xxx__combine_object_instances":

  -- depends_on: {{ ref('stg_xxx__object_type_name') }}

Considered solutions:

Solution 1

I can obviously create a list of -- depends_on: statements dynamically to deal with this issue, but I would rather avoid this, since it would make the code harder to maintain and read.

Solution 2

Another solution is suggested in this issue's comment:

{% for model in object_types %}
{% set depends_on = "--depends_on: {{  ref( '" ~  model ~ "' )  }}" %}
{{  depends_on  }}
{% endfor %}

This solution did not work for me, but I am unsure why: I get exactly the same compilation error as before. The solution is both upvoted and downvoted on GitHub, but no one has really commented on why it's a good or bad solution, or why it might not work. I assume it doesn't work because the dependencies are obtained through a Jinja macro and are not available when dbt parses the model.

Questions

Can someone help me understand why solution 2 would not work?
Are there solutions to this issue other than the two solutions I have suggested above?
Would it be possible to somehow set a dependency on a group of models? E.g. set a dependency on a schema so that all transformations within the staging schema are finished before my intermediate layer transformation starts?

Could you help me reproduce the issue? I created two dummy models and put their names into a variable: {% set object_types = ['dummy_1', 'dummy_2'] %}. Then I used your code, only replacing the list of columns with *, and it worked without a problem. Is there another part of the code that causes the issue? If so, could you update your question with it?
– Kliment Merzlyakov, Commented Jul 16, 2024 at 17:12
@KlimentMerzlyakov I added how the object_types are calculated, in case it's relevant. It looks like hardcoding the object types works, but using the dbt_utils function to extract them from a column causes the compilation error. I guess that's because the dbt_utils function is a macro.
– MattSt, Commented Jul 17, 2024 at 11:57

MattSt · Accepted Answer · 2024-07-24 12:36:40Z

The issue stems from the implementation of the dbt_utils.get_column_values function used to acquire the object_types.

{% set object_types = dbt_utils.get_column_values(
  table=ref("ref_include_object_types"),
  column="object_type"
) %}

Checking the source code one can find the following lines:

{# Prevent querying of db in parsing mode. This works because this macro does not create any new refs. #}
{%- if not execute -%}
    {% set default = [] if not default %}
    {{ return(default) }}
{% endif %}

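The guard means that during parsing, when execute is false, the macro returns the default (an empty list) without querying the database. My for loop therefore has nothing to iterate over at parse time, so none of the ref() calls inside it are registered as dependencies. When the model is later compiled with execute set to true, the loop does run, and dbt raises the error because it encounters refs it never saw during parsing.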
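One way to see the effect of that guard: since the macro returns default while parsing, passing an explicit default list should make the loop run at parse time, so the ref() calls get registered. This is a minimal sketch, assuming get_column_values accepts a default argument as the excerpt above implies; the two stg_xxx__* model names are hypothetical placeholders.

```sql
{# Sketch only: the two stg_xxx__* names are hypothetical placeholders. #}
{% set object_types = dbt_utils.get_column_values(
  table=ref("ref_include_object_types"),
  column="object_type",
  default=["stg_xxx__type_a", "stg_xxx__type_b"]
) %}

{# While parsing, execute is false, so object_types is the default list
   and every ref() in the loop below is seen by dbt. #}
{%- for object_type in object_types %}
SELECT '{{ object_type }}' AS object_type
FROM {{ ref(object_type) }}
{%- if not loop.last %}
UNION ALL
{%- endif %}
{%- endfor %}
```

The catch is that the default list has to be kept in sync with the seed: a value returned at run time that is missing from the default would still hit the same compilation error, so this is hardly easier to maintain than explicit depends_on hints.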

I dealt with the issue by replacing my seed table with project variables and creating macros for parsing them.
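A minimal sketch of that approach, assuming the list lives in a project variable called object_types (a hypothetical name, as are the model names) declared under vars: in dbt_project.yml. var() is resolved while dbt parses the model, so the ref() calls inside the loop are registered as dependencies.

```sql
{# Assumes dbt_project.yml contains something like this (hypothetical names):
     vars:
       object_types: ["stg_xxx__type_a", "stg_xxx__type_b"]
#}
{% set object_types = var("object_types") %}

WITH base AS (
  {%- for object_type in object_types %}
  SELECT '{{ object_type }}' AS object_type
  FROM {{ ref(object_type) }}
  {%- if not loop.last %}
  UNION ALL
  {%- endif %}
  {%- endfor %}
)

SELECT * FROM base
```

The trade-off is that the list of object types now lives in project configuration rather than in data, so adding a type means editing dbt_project.yml instead of the seed.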

edited Jul 24, 2024 at 12:36

answered Jul 22, 2024 at 14:47


LZhavoronkov · Accepted Answer · 2025-10-24 12:35:05Z

I've finally found a solution to set the dependencies dynamically.

Refer to this macro: https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/get_relations_by_pattern.sql

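Basically, this looks up relations that already exist in the warehouse by schema and table name pattern instead of going through ref(), which avoids the requirement to add the forced dependency comment (-- depends_on: {{ ref('table_name') }}). As far as I can tell, the macro only queries the database at execution time, so the matched relations are not added to the DAG as dependencies; the tables must already exist when this model runs.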

{% set relations = dbt_utils.get_relations_by_pattern('%dataset_name_pattern%', '%table_name_pattern%') -%}
{% for relation in relations -%}
select
    *
from
    {{ relation }}
{% if not loop.last -%}
union all
{% endif -%}
{% endfor -%}

If you ever need to add some conditions based on the table ID, you can approach it this way (it may not be optimal):

{% set relations = dbt_utils.get_relations_by_pattern('%dataset_name_pattern%', '%table_name_pattern%') -%}
{% for relation in relations -%}
select
    {% if relation.path.identifier == 'required_table' -%}
    col1 as id
    {% else -%}
    col2 as id
    {% endif -%}
from
    {{ relation }}
{% if not loop.last -%}
union all
{% endif -%}
{% endfor -%}

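As far as I can tell, relation.path.identifier holds only the table name (the same value as relation.identifier), which is why it is compared against 'required_table' above. Rendering {{ relation }} itself gives the fully qualified name (e.g. project_id.dataset_id.table_id).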

Hopefully, this helps!
