Conversation

@mpiannucci
Contributor

No description provided.

@mpiannucci mpiannucci requested a review from dcherian March 25, 2025 15:11
Contributor

@dcherian dcherian left a comment

Seems OK, but I'm not very confident around locks.

  virtual_chunk_credentials: HashMap<ContainerName, Credentials>,
  #[serde(skip)]
- default_commit_metadata: Arc<RwLock<Option<SnapshotProperties>>>,
+ default_commit_metadata: Arc<Mutex<Option<SnapshotProperties>>>,
Contributor

@dcherian dcherian Mar 25, 2025

Why Mutex over RwLock here? I guess in general we don't expect multiple readers here.

Contributor Author

Yeah, exactly. And here we can't use the async versions because of serialization. I can use RwLock if @paraseba would prefer, though.
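
For readers following along, the interior-mutability variant being discussed might look roughly like the sketch below. It uses only the standard library: `SnapshotProperties` is stood in for by a `BTreeMap<String, String>` alias, the getter name is an assumption, and the `#[serde(skip)]` attribute from the diff is left out because serde is not pulled in here. It is an illustration of the pattern, not the crate's actual code.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the real SnapshotProperties type.
type SnapshotProperties = BTreeMap<String, String>;

// Sketch of a repository that mutates shared state through `&self`.
#[derive(Clone, Default)]
pub struct Repository {
    // In the PR this field is also marked `#[serde(skip)]`; serde is omitted here.
    default_commit_metadata: Arc<Mutex<Option<SnapshotProperties>>>,
}

impl Repository {
    // Takes `&self`: the synchronous Mutex provides the mutability.
    pub fn set_default_commit_metadata(&self, metadata: Option<SnapshotProperties>) {
        let mut guard = self.default_commit_metadata.lock().unwrap();
        *guard = metadata;
    }

    // Returns a copy so the lock is released before the caller uses the value.
    // (Assumed getter name, for illustration only.)
    pub fn default_commit_metadata(&self) -> Option<SnapshotProperties> {
        self.default_commit_metadata.lock().unwrap().clone()
    }
}
```

Because the field is an `Arc`, every clone of this `Repository` shares the same metadata slot, which is the kind of hidden shared state the later comments in this thread argue against.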

  virtual_chunk_credentials: HashMap<ContainerName, Credentials>,
  #[serde(skip)]
- default_commit_metadata: Arc<RwLock<Option<SnapshotProperties>>>,
+ default_commit_metadata: Arc<Mutex<Option<SnapshotProperties>>>,
Collaborator

Why do we need locking? Can't set_default_commit_metadata just take a &mut self? I think that would be the cleanest option; then PyRepository is the one that needs locking to call this method on the repo, which also seems right.

Contributor Author

Because we don't currently lock PyRepository at all and it's a big change to introduce it. We use internal mutability everywhere.

Collaborator

I think what I'm proposing is maybe a smaller change? Repository today doesn't have internal mutability (which I like), so it can easily be used from Rust in "the usual" way. But things like PySession hold a lock to the underlying Rust data structure. What I'm proposing is we do the same for PyRepository, bringing these two to the same style:

pub struct PyRepository(Arc<Repository>);
pub struct PySession(pub Arc<RwLock<Session>>);

I may be missing something that would make this hard...
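
A minimal sketch of that proposal, under stated assumptions: the core type takes `&mut self` and the binding-layer wrapper owns the lock. The standard library's blocking `RwLock` is used so the example is self-contained, whereas the snippets in this PR await an async lock. `SnapshotProperties` is replaced by a `BTreeMap` alias, and the wrapper's constructor and method names are hypothetical.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the real SnapshotProperties type.
type SnapshotProperties = BTreeMap<String, String>;

// Plain Rust type with no internal mutability: mutation requires `&mut self`.
#[derive(Default)]
pub struct Repository {
    default_commit_metadata: Option<SnapshotProperties>,
}

impl Repository {
    pub fn set_default_commit_metadata(&mut self, metadata: Option<SnapshotProperties>) {
        self.default_commit_metadata = metadata;
    }

    pub fn default_commit_metadata(&self) -> Option<&SnapshotProperties> {
        self.default_commit_metadata.as_ref()
    }
}

// The binding-layer wrapper owns the lock, mirroring the PySession style.
#[derive(Clone)]
pub struct PyRepository(pub Arc<RwLock<Repository>>);

impl PyRepository {
    // Hypothetical constructor, for illustration only.
    pub fn new(repo: Repository) -> Self {
        PyRepository(Arc::new(RwLock::new(repo)))
    }

    pub fn set_default_commit_metadata(&self, metadata: Option<SnapshotProperties>) {
        // The write lock is held only for the duration of this statement.
        self.0.write().unwrap().set_default_commit_metadata(metadata);
    }

    pub fn default_commit_metadata(&self) -> Option<SnapshotProperties> {
        // Clone under a read lock so no reference escapes the guard.
        self.0.read().unwrap().default_commit_metadata().cloned()
    }
}
```

With this split, pure Rust callers get ordinary borrow-checked mutation, and only the wrapper that is shared across threads pays for locking.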

Contributor Author

Sorry, that's what I meant. I was trying to avoid having to lock the Repository, but I can do that if we want it.

Collaborator

We can merge and I can give it a try tomorrow if you are busy. But I'd prefer if we don't introduce internal mutability into the Repo.

Contributor Author

I can do it in this PR then! Thanks for the feedback.

@mpiannucci
Contributor Author

Need to do the Python side next, but I have to switch off for now.

@mpiannucci
Contributor Author

Ok @paraseba let me know what you think

@mpiannucci mpiannucci requested a review from paraseba March 27, 2025 14:03
Collaborator

@paraseba paraseba left a comment

Looks great @mpiannucci !! Only minor comments.

let lock = self.0.read().await;
(
lock.resolve_version(&version).await?,
Arc::clone(lock.asset_manager()),
Collaborator

nit: you could do the Arc::clone once the lock is released.

Contributor Author

How? We need the lock to get the reference to the asset manager right?

Collaborator

lock -> get the &Arc<AssetManager> -> unlock -> Clone

Contributor Author

@mpiannucci mpiannucci Mar 28, 2025

Aha, I didn't know you could keep a reference after unlocking

Collaborator

Ah, I didn't really think about that... I'm not sure now

Contributor Author

Yeah, I have to clone from a scope where the lock is held. I handled that by making an inner scope, so the lock is released immediately after cloning the asset manager.
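
The inner-scope approach described here could be sketched as follows. This is an illustration under assumptions, not the merged code: it uses the blocking `std::sync::RwLock` instead of the async lock seen in the snippet under review, and `AssetManager`, `Repository`, and `clone_asset_manager` are simplified hypothetical stand-ins.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins for the real types.
pub struct AssetManager {
    pub name: String,
}

pub struct Repository {
    asset_manager: Arc<AssetManager>,
}

impl Repository {
    pub fn new(name: &str) -> Self {
        Repository {
            asset_manager: Arc::new(AssetManager { name: name.to_string() }),
        }
    }

    pub fn asset_manager(&self) -> &Arc<AssetManager> {
        &self.asset_manager
    }
}

pub fn clone_asset_manager(repo: &RwLock<Repository>) -> Arc<AssetManager> {
    // The inner block limits the guard's lifetime. The `&Arc<AssetManager>`
    // borrows from the guard, so the clone has to happen before it is dropped.
    let asset_manager = {
        let lock = repo.read().unwrap();
        Arc::clone(lock.asset_manager())
    }; // `lock` is dropped here, releasing the read lock.

    // The owned Arc stays valid without holding the lock.
    asset_manager
}
```

Trying to return the `&Arc<AssetManager>` out of the inner block instead would be rejected by the borrow checker, since the reference cannot outlive the guard it was borrowed through.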

@mpiannucci
Contributor Author

Scoped down the locks where it seemed possible/appropriate. Hope it makes sense!

@mpiannucci mpiannucci enabled auto-merge (squash) March 28, 2025 14:31
@mpiannucci mpiannucci merged commit 52289bd into main Mar 28, 2025
7 of 8 checks passed
@mpiannucci mpiannucci deleted the matt/fix-repo-seralization branch March 28, 2025 14:38
dcherian added a commit that referenced this pull request Mar 31, 2025
* main: (29 commits)
  Release version 0.2.11 (#879)
  Release version v0.2.10 (#877)
  One more GC bugfix (#878)
  Remove cache entries during GC (#875)
  Add lookup_snapshot (#876)
  Add logging to GC and expiration (#874)
  Fix ref delete during ref expiration (#873)
  Bump the rust-dependencies group across 1 directory with 3 updates (#872)
  `expire_ref` can now edit snapshot pointed by refs (#870)
  Fix repo serialization with default commit metadata (#863)
  Fix bug in expiration that creates a commit loop (#869)
  Uncomment `delete_branch` in stateful repo ops test (#866)
  Add upstream dev CI (#862)
  Add optional default commit metadata to `Repository` (#860)
  Update GC docstrings (#858)
  Add expiration/GC notebook (#857)
  Small docs polish (#856)
  Release version 0.2.9 (#855)
  Add support for virtual chunks in GCS (#853)
  Cargo deny is smarter in new Rust version (#854)
  ...

4 participants