implication of changes to env_vars and vars for state:modified #10518
-
Just checking that this summary table is correct, and we're proposing to flip which variables do/don't get picked up by state:modified?
What happens if I do this?
export FUTURE_DATE_ENV_VAR='2999-01-01'
# dbt_project.yml
...
vars:
  future_date_var: "{{ env_var('FUTURE_DATE_ENV_VAR') }}"
I'm guessing that the env var will still have been passed into future_date_var during compilation/parsing so it will be picked up as a modification? If so then that might also be a helpful escape hatch for anyone who is dependent on the current behaviour where env var changes are picked up as modifications.
-
Would love it if, when a macro is modified, its usages were marked as modified downstream. Is this on the horizon?
-
Just commenting here; maybe this has already been considered in some other discussion elsewhere, and if so I didn't see it. Let's say we have a project variable that indicates the start of an incremental window. This is accomplished by calling a macro that returns today's date plus some (negative) offset.
When you determine what has been "modified" under the new logic, will you be considering the rendered value or the unrendered value? Unrendered meaning the macro call itself, rendered meaning the actual result of the macro call. Meaning, if the macro calls themselves don't change, the models that depend on the ... Or will you consider the value that the macro actually produces? If you consider this rendered value output by the macro, and defer to a job that ran yesterday, any job runs today that uses the ...
-
Are there other features needed to address:
I know @joellabes created a tool that compares manifests, but I am wondering if there could be something in core that summarizes what was detected as a change. This case is indeed very common, and users don't know the implications of doing what seems "reasonable" to them:
sources:
  - name: jaffle_shop
    database: "{{ env_var('DBT_DATABASE') }}"
    tables:
      - name: customers
When I debug these issues I use the state selectors to at least get me closer to the cause. Maybe there could be a flag we can pass to dbt ls that would indicate what was different, e.g. database in the case above.
-
We currently call out a lot of caveats to state comparison in our docs.
There are many scenarios where executing dbt list --select state:modified over- or under-selects the appropriate resources. This leads to:
- ... state:modified, I haven't changed them"

While it's unlikely we can cover every single edge case here, we want to improve state:modified so that it ideally selects only the modified resources - we started an EPIC to track that work.

Investigating a few of the top issues has led us to some specific opinions on how we think state:modified should work for vars and env_vars. Since these changes fall into the category of “not quite bug, but not quite a feature”, we want to run them by y’all first and hear if you have any reactions / concerns / comments before we implement.

Environment Variables
Current Behavior
Currently, state:modified includes all changes to the value of an environment variable.

Let’s say you want to point to a different location for your sources depending on the environment you are in (e.g. using a different database for prod vs. dev). To do so, you configure your source to set database to an environment variable like so:

If you then run dbt in dev deferring to prod, this source will be marked as modified every time you use --select state:modified because the value of this environment variable (DBT_DATABASE) differs by environment.

More specifically, this environment-sensitivity leading to false positives only occurs when the env_var configuration is coming from a schema yml or definition (.sql, .py) file. Under the hood, this is because we are comparing the database values after the Jinja in schema and definition files has been rendered, so dbt cannot tell whether it is the unrendered or the rendered configuration that has changed.

Future State
We think this is wrong. Instead, we believe that changes to the value of an environment variable should be ignored when selecting state:modified:
- A change to the environment variable being referenced should be marked as modified (e.g. changing database from {{ env_var('DBT_DATABASE') }} to {{ env_var('DBT_SOURCE_DATABASE') }})
- A change to the value of that environment variable should not be marked as modified (e.g. the environment variable DBT_DATABASE compiling to raw_dev vs. raw_prod)
- ... modified.

Project Variables
Current Behavior
Currently, state:modified excludes all changes to the value of a project variable.

Let’s say you’re using a project variable to set a project-default “future date” to coalesce your NULL dates to:

And then you’re using that project variable in some of your models:
If you then update the value of this project variable (either in your dbt_project.yml or via a CLI override using --vars) and run dbt using --select state:modified, this model will not be marked as modified. This is because we are not tracking which var values a given model relies on explicitly in dbt’s internal manifest.

Future State
We think this is wrong. Instead, we believe that changes to the value of a project variable should be included when selecting state:modified:
- A change to the project variable being referenced should be marked as modified (e.g. changing your logic from coalesce(order_date, {{ var('future_date') }}) to coalesce(order_date, {{ var('future_timestamp') }}))
- A change to the value of that project variable should also be marked as modified (e.g. the project variable future_date compiling to to_date('2999-01-01') vs. to_date('3999-01-01'))

Note: Changes to var values, when defined in a schema yml file in a config block (e.g. database: "{{ var('dbt_database') }}"), are already marked as modified today for the exact reason that dbt is overly sensitive to environment variable value changes above: it is comparing rendered configurations to one another. We think this is correct for vars, and it should continue to work this way.

Why would project variables and environment variables be treated differently?
Thoughts?
We’d love to hear your thoughts, use cases, etc. to help inform our decision here!