-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Description
Apache Airflow version
Other Airflow 3 version (please specify below)
If "Other Airflow 3 version" selected, which one?
3.1.2
What happened?
On upgrade to Airflow 3.1.2, there was a huge increase in the number of DAG versions for many of the DAGs. I believe this may have been fixed in Airflow 3.1.5 (#59091).
Before upgrading to get the fix, I wanted to clean up the database airflow db clean. This was to clean up around 360K rows. The non batch version ran for 2+ hours before I killed it. Then I tried
airflow db clean -t dag_version --clean-before-timestamp 2025-12-20 --batch-size 1000
This was also extremely slow. Around 6 minutes per batch.
The fix for me was to add (temporary) indices to the tables task_instance and dag_run
CREATE INDEX CONCURRENTLY idx_temp_task_instance_dag_version_id
ON task_instance(dag_version_id);
CREATE INDEX CONCURRENTLY idx_temp_dag_run_dag_version_id
ON dag_run(created_dag_version_id);
The airflow db clean then ran extremely fast - under 2 minutes.
What you think should happen instead?
airflow db clean should complete relatively quickly. Adding indices to the tables task_instance and dag_run on the DAG version id led to a huge speed up so these indices should likely exist.
How to reproduce
Create a large number of DAG versions and then attempt to drop them
Operating System
Debian GNU/Linux 12 (bookworm)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==9.16.0
apache-airflow-providers-celery==3.13.0
apache-airflow-providers-cncf-kubernetes==10.9.0
apache-airflow-providers-common-compat==1.8.0
apache-airflow-providers-common-io==1.6.4
apache-airflow-providers-common-messaging==2.0.0
apache-airflow-providers-common-sql==1.28.2
apache-airflow-providers-docker==4.4.4
apache-airflow-providers-elasticsearch==6.3.4
apache-airflow-providers-fab==3.0.1
apache-airflow-providers-ftp==3.13.2
apache-airflow-providers-git==0.0.9
apache-airflow-providers-google==18.1.0
apache-airflow-providers-grpc==3.8.2
apache-airflow-providers-hashicorp==4.3.3
apache-airflow-providers-http==5.4.0
apache-airflow-providers-microsoft-azure==12.8.0
apache-airflow-providers-mysql==6.3.4
apache-airflow-providers-odbc==4.10.2
apache-airflow-providers-openlineage==2.7.3
apache-airflow-providers-postgres==6.4.0
apache-airflow-providers-redis==4.3.2
apache-airflow-providers-sendgrid==4.1.4
apache-airflow-providers-sftp==5.4.1
apache-airflow-providers-slack==9.4.0
apache-airflow-providers-smtp==2.3.1
apache-airflow-providers-snowflake==6.6.0
apache-airflow-providers-ssh==4.1.5
apache-airflow-providers-standard==1.9.1
Deployment
Official Apache Airflow Helm Chart
Deployment details
helm on AWS EKS
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct