Skip to content

Conversation

@dcherian
Copy link
Contributor

@dcherian dcherian commented Mar 26, 2025

Closes #867

Adds both expire_snapshots and garbage_collect.

expire_snapshots is basically an independent reimplementation of the rust logic.

garbage_collect is very simple, and only deletes expires snapshots.

@dcherian dcherian marked this pull request as ready for review March 26, 2025 22:57
@dcherian dcherian force-pushed the stateful-expiry branch 2 times, most recently from f4df554 to c155d59 Compare March 26, 2025 23:01
@dcherian dcherian requested a review from paraseba March 27, 2025 19:24
@dcherian dcherian changed the title Add expire_snapshots to stateful test Add expiration to stateful test Mar 27, 2025
@dcherian dcherian changed the base branch from main to push-zzvwrvvpllyx March 27, 2025 21:14
repo.lookup_snapshot(snap).written_at + timedelta(microseconds=1)
)
with pytest.raises(ValueError):
repo.lookup_snapshot(snap)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this fails too.

delete_expired_tags=True,
)
assert expired == {a, b}
assert repo.list_tags() == {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also fails.

Base automatically changed from push-zzvwrvvpllyx to main March 28, 2025 15:40
@dcherian dcherian changed the base branch from main to push-vtuwruwwlnmz March 28, 2025 16:29
Base automatically changed from push-vtuwruwwlnmz to main March 28, 2025 18:27
@dcherian dcherian force-pushed the stateful-expiry branch 3 times, most recently from 052e2c6 to 823a46a Compare March 28, 2025 22:43
@dcherian
Copy link
Contributor Author

OK I think this is finally done 🥳 .

class CommitModel:
id: str
written_at: datetime.datetime
state: dict[str, Any]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious about what the state was gonna be. Should we call this store?

tag_iter = map(operator.attrgetter("commit_id"), self.tags.values())
return itertools.chain(self.branches.values(), tag_iter)

def expire_snapshots(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of this is great, but, we are re-implementing the algorithms in the tests, potentially with the same bugs we are blind to. I wonder if in the future we can define properties of the GCed or Expired repo, and try to track less in the model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I landed up here because I'm doing an exact equality check for snapshot ids that were expired because only expires snaphsots that are reachable.

I guess I can search the "forest" and assert that set(IC expired IDs) < set(all possible expirable IDs)

@dcherian dcherian enabled auto-merge (squash) March 31, 2025 20:39
@dcherian dcherian merged commit e284390 into main Mar 31, 2025
7 of 8 checks passed
@dcherian dcherian deleted the stateful-expiry branch March 31, 2025 20:46
dcherian added a commit that referenced this pull request Apr 3, 2025
* main:
  Fix `Diff` python typing (#890)
  Fail when creating Storage for Tigris using s3_compatible (#889)
  Disallow tag/branch creation with non-existing snapshot (#888)
  Log errors during listing and deleting of objects (#886)
  Rust integration tests can run in more object stores. (#884)
  Update pyo3. (#885)
  Add expiration to stateful test (#868)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add repo.lookup_snapshot(snapshot_id: str) -> SnapshotInfo

3 participants