S3 bucket is out of sync with Hydra state #1545

@brianmcgee

Describe the bug

Over the weekend, I pulled the inventory for the S3 bucket as of 2025-12-05T01-00Z and compared it with a dump of the buildstepoutputs table as of 2025-12-05-17:38:30Z. For every narinfo hash I could find in the inventory, I checked whether the buildstepoutputs table contained the same hash.

I found roughly 1.6 million 'orphans', that is, narinfos in the S3 bucket without a corresponding entry in buildstepoutputs:

  • 99.5% of all narinfos within the S3 bucket have a corresponding entry in buildstepoutputs.
  • The orphans account for 0.5% spread out over the last 10 years.
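
For anyone wanting to reproduce the comparison, here is a minimal sketch of the check described above. It is not the script I actually ran. It assumes narinfo objects are keyed as `<hash>.narinfo` at the bucket root and that buildstepoutputs stores full store paths of the form `/nix/store/<hash>-<name>`; the function names are made up for illustration.

```python
import re

# Assumption: the store path hash is 32 lowercase alphanumeric characters.
NARINFO_KEY = re.compile(r"^([0-9a-z]{32})\.narinfo$")
STORE_PATH = re.compile(r"^/nix/store/([0-9a-z]{32})-")


def narinfo_hash(key):
    """Return the hash part of an inventory key, or None if it is not a narinfo."""
    m = NARINFO_KEY.match(key)
    return m.group(1) if m else None


def store_path_hash(path):
    """Return the hash part of a store path, or None if it does not look like one."""
    m = STORE_PATH.match(path)
    return m.group(1) if m else None


def find_orphans(inventory_keys, output_paths):
    """Hashes with a narinfo in the inventory but no row in buildstepoutputs."""
    known = {h for h in map(store_path_hash, output_paths) if h}
    narinfos = {h for h in map(narinfo_hash, inventory_keys) if h}
    return sorted(narinfos - known)
```

With inputs of this size, holding both hash sets in memory may not be practical; a sorted merge or a database anti-join would express the same check.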

Using the last_modified_at column from the inventory lets us plot this over time.

Image Image

Here is a daily breakdown for the last month.

Image
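
The counts behind breakdowns like these could be produced roughly as follows. This is a sketch, assuming each orphan's last-modified value is an ISO 8601 UTC timestamp string such as `2025-12-01T00:10:00Z`; `orphans_per_day` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime


def orphans_per_day(last_modified_values):
    """Count orphan narinfos per calendar day (UTC), ordered by date."""
    counts = Counter()
    for value in last_modified_values:
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
        counts[ts.date().isoformat()] += 1
    return dict(sorted(counts.items()))
```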

A CSV of the orphan entries can be downloaded here

Expected behavior

Since Hydra is the only process uploading artefacts into the S3 bucket, I would have expected to see a 1:1 match between narinfos in the S3 bucket and build outputs in the buildstepoutputs table.
