Race condition in Job creation causes data loss #3639

@ayushgupta704

Description

What happened

While reviewing Job.get_root() in api_app/models.py, I identified a serious data integrity issue related to concurrent job creation.
Since django-treebeard is not thread-safe, high-concurrency scenarios (multiple Celery workers running simultaneously) can create duplicate root Job entries at the same time.
The current get_root() logic prevents a crash in this case, but it silently picks one root and ignores the others.
This leads to "ghost roots": duplicate roots that are never used again. Any child jobs, tags, or analyzer results attached to them become disconnected, causing permanent data inconsistency.
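The race can be demonstrated without Django at all. The sketch below is a toy stand-in (all names hypothetical, not IntelOwl or treebeard code) for the unguarded read-compute-write that materialized-path root creation performs: two workers read the same "last root path" and both write the same next path, producing duplicate roots.

```python
import threading

class FakeTree:
    """Toy stand-in for unguarded root creation: read the last root
    path, compute the next one, append. No locking, mimicking what can
    happen when concurrent workers run outside a serializing guard.
    Hypothetical simplification, not the real treebeard implementation."""
    def __init__(self):
        self.roots = []
        # Barrier forces both workers to interleave between the read
        # and the write, the way concurrent Celery workers can.
        self._barrier = threading.Barrier(2)

    def add_root(self):
        # Step 1: the racy read of the current last root path.
        last = self.roots[-1] if self.roots else 0
        self._barrier.wait()
        # Step 2: both workers write the "next" path they computed
        # from the same stale read -> duplicate root paths.
        self.roots.append(last + 1)

tree = FakeTree()
threads = [threading.Thread(target=tree.add_root) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(tree.roots)  # [1, 1] -> two roots with the same path
```

Both workers end up claiming path 1, which is exactly the duplicate-root state described above.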


Environment

  1. OS: Linux

What did you expect to happen

Data corruption should not be hidden during a read operation. The creation of duplicate paths needs to be prevented at the database level to ensure tree integrity under high concurrency.
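One possible direction is to serialize root creation so the read-compute-write is atomic, backed by a database-level guard (for example row locking via `SELECT ... FOR UPDATE`, or a unique index on the materialized-path column). The sketch below models that with a plain-Python lock standing in for the database guard; all names are hypothetical, not a proposed patch:

```python
import threading

class SafeTree:
    """Toy model of serialized root creation: a lock plays the role of
    a database-level guard (row locking or a unique index on the path
    column), so concurrent workers cannot both claim the same root
    path. Hypothetical sketch, not IntelOwl's actual code."""
    def __init__(self):
        self.roots = []
        self._lock = threading.Lock()

    def add_root(self):
        with self._lock:
            # The read-compute-write is now atomic: each worker sees
            # the previous worker's root before choosing its own path.
            last = self.roots[-1] if self.roots else 0
            self.roots.append(last + 1)

tree = SafeTree()
threads = [threading.Thread(target=tree.add_root) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(tree.roots))  # [1, 2] -> distinct root paths
```

With the guard in place, the second worker observes the first worker's root and picks a distinct path, so no ghost roots are created.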

How to reproduce your issue

Confirmed by manually inserting duplicate root jobs to simulate concurrent worker execution.
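To illustrate why the corruption is silent rather than loud, here is a toy stand-in for the read path (the function name and data are hypothetical): with duplicate roots present, a `.first()`-style lookup simply returns one of them without raising, so everything attached to the other root is quietly orphaned.

```python
def get_first_root(roots):
    # Toy stand-in for a ``.first()``-style query: when duplicate
    # roots exist, it returns one without raising, hiding the
    # corruption from callers.
    return roots[0] if roots else None

# Two roots created by the race; jobs attached to "job-B" are orphaned.
duplicate_roots = ["job-A", "job-B"]
print(get_first_root(duplicate_roots))  # job-A
```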

Metadata

Labels

bug (Something isn't working)
