I've encountered an issue where task duration keeps increasing for tasks that are skipped due to DAG timeout:
Problem Description:
When a DAG run times out due to dagrun_timeout, the unfinished tasks are marked as SKIPPED. However, in the Web UI, the task duration continues to increase even though the task state is SKIPPED.
Root Cause Analysis:
From code inspection in airflow/jobs/scheduler_job.py, the _schedule_dag_run method handles a DAG run timeout as follows:
    if (
        dag_run.start_date
        and dag.dagrun_timeout
        and dag_run.start_date < timezone.utcnow() - dag.dagrun_timeout
    ):
        dag_run.set_state(DagRunState.FAILED)
        unfinished_task_instances = session.scalars(
            select(TI)
            .where(TI.dag_id == dag_run.dag_id)
            .where(TI.run_id == dag_run.run_id)
            .where(TI.state.in_(State.unfinished))
        )
        for task_instance in unfinished_task_instances:
            task_instance.state = TaskInstanceState.SKIPPED
            session.merge(task_instance)
        session.flush()
        self.log.info("Run %s of %s has timed-out", dag_run.run_id, dag_run.dag_id)
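This code sets the DAG run state to FAILED and marks the unfinished tasks as SKIPPED, but it doesn't set end_date for these task instances. With end_date left empty, the UI presumably keeps measuring the duration against the current time, which would explain why it keeps growing.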
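To make the suspected mechanism concrete, here is a minimal, self-contained sketch. It does not use Airflow itself: FakeTaskInstance, displayed_duration, and skip_unfinished are hypothetical stand-ins, and the idea that the UI falls back to the current time when end_date is missing is an assumption, not something confirmed from the UI code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical stand-in for the set of unfinished states; not Airflow's State.unfinished.
UNFINISHED = {"running", "queued", "scheduled", None}


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance; not Airflow's TaskInstance model."""
    state: Optional[str]
    start_date: Optional[datetime]
    end_date: Optional[datetime] = None


def displayed_duration(ti: FakeTaskInstance, now: datetime) -> Optional[timedelta]:
    """Assumed UI behaviour: use `now` as the end point when end_date is missing."""
    if ti.start_date is None:
        return None
    return (ti.end_date or now) - ti.start_date


def skip_unfinished(tis, now: datetime, set_end_date: bool) -> None:
    """Mark unfinished task instances as skipped, optionally stamping end_date."""
    for ti in tis:
        if ti.state in UNFINISHED:
            ti.state = "skipped"
            if set_end_date:
                ti.end_date = now


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
timeout_at = t0 + timedelta(minutes=10)   # moment the DAG run times out
later = timeout_at + timedelta(hours=1)   # moment someone opens the UI

without_end = FakeTaskInstance(state="running", start_date=t0)
with_end = FakeTaskInstance(state="running", start_date=t0)

skip_unfinished([without_end], timeout_at, set_end_date=False)  # behaviour described above
skip_unfinished([with_end], timeout_at, set_end_date=True)      # possible fix

print(displayed_duration(without_end, later))  # 1:10:00, still growing
print(displayed_duration(with_end, later))     # 0:10:00, frozen at the timeout
```

If this matches what the scheduler and UI actually do, stamping end_date at the point where the state is set to SKIPPED would be enough to stop the duration from increasing.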
I'm wondering if this is a known issue or if there's a recommended way to handle this.
Thanks,
Chris Zhao