Update metadata for commit #500

Open
wants to merge 4 commits into
base: main

Conversation

EloyMartinez
Collaborator

@EloyMartinez EloyMartinez commented Jan 8, 2025

We add an endpoint that recalculates the data type count and data type size for a particular commit by iterating over the tree and modifying the cached values inside dirnodes.
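The recalculation described above can be pictured as a post-order walk that rebuilds each directory's cached totals from its children. The following is only a minimal sketch under simplifying assumptions: `Node`, `DirStats`, and `recompute` are hypothetical names for illustration and are not the Oxen types (`MerkleTreeNode`, dir nodes) changed in this PR, and nothing here writes to a database.

```rust
use std::collections::HashMap;

/// Cached per-directory aggregates (hypothetical stand-in for the values
/// cached inside a dir node).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct DirStats {
    pub num_bytes: u64,
    pub data_type_counts: HashMap<String, u64>,
    pub data_type_sizes: HashMap<String, u64>,
}

/// Simplified tree: files carry a data type and a size; directories carry
/// children plus a cache that may be stale.
#[derive(Debug)]
pub enum Node {
    File { data_type: String, num_bytes: u64 },
    Dir { stats: DirStats, children: Vec<Node> },
}

/// Recomputes the cached stats of every directory under `node` (post-order)
/// and returns the aggregate for `node` itself.
pub fn recompute(node: &mut Node) -> DirStats {
    match node {
        Node::File { data_type, num_bytes } => {
            // A file contributes one entry of its own type and its own size.
            let mut stats = DirStats { num_bytes: *num_bytes, ..Default::default() };
            stats.data_type_counts.insert(data_type.clone(), 1);
            stats.data_type_sizes.insert(data_type.clone(), *num_bytes);
            stats
        }
        Node::Dir { stats, children } => {
            let mut fresh = DirStats::default();
            for child in children.iter_mut() {
                let child_stats = recompute(child);
                fresh.num_bytes += child_stats.num_bytes;
                for (ty, n) in child_stats.data_type_counts {
                    *fresh.data_type_counts.entry(ty).or_insert(0) += n;
                }
                for (ty, n) in child_stats.data_type_sizes {
                    *fresh.data_type_sizes.entry(ty).or_insert(0) += n;
                }
            }
            // Overwrite the stale cache with the recomputed values.
            *stats = fresh.clone();
            fresh
        }
    }
}
```

Calling `recompute` on the root refreshes every nested directory cache in a single pass, because each directory is updated only after all of its children have been visited.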

Summary by CodeRabbit

  • New Features

    • Added a new metadata update functionality for repositories
    • Introduced a new server endpoint for updating metadata via POST request
    • Implemented version-specific metadata update mechanism
  • Technical Improvements

    • Enhanced Merkle tree traversal and metadata management
    • Added support for updating metadata across different repository versions

coderabbitai bot commented Jan 8, 2025

Warning

Rate limit exceeded

@EloyMartinez has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 41 minutes and 27 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 1c5719b and 7304243.

📒 Files selected for processing (2)
  • src/lib/src/core/v0_19_0/entries.rs (2 hunks)
  • src/server/src/controllers/metadata.rs (1 hunks)

Walkthrough

The pull request introduces a new feature for updating metadata in the Oxen system. This involves adding functionality across multiple files to support metadata updates for repositories. The changes include implementing a new update_metadata function in the core library, repository entries, server controllers, and adding a corresponding POST route in the server services. The implementation focuses on traversing Merkle tree nodes and updating metadata information for different types of repository resources.

Changes

File | Change Summary
--- | ---
src/lib/src/core/v0_19_0/entries.rs | Added multiple functions for metadata update: update_metadata, traverse_and_update, process_children, and add_children_to_db
src/lib/src/repositories/entries.rs | Added update_metadata function with version-specific implementation
src/server/src/controllers/metadata.rs | Introduced new async update_metadata controller function
src/server/src/services/meta.rs | Added POST route for metadata update endpoint

Sequence Diagram

sequenceDiagram
    participant Client
    participant Server
    participant MetadataController
    participant RepositoryService
    participant MerkleTreeProcessor

    Client->>Server: POST /meta/{resource}
    Server->>MetadataController: update_metadata
    MetadataController->>RepositoryService: get_repo
    RepositoryService-->>MetadataController: return repository
    MetadataController->>MerkleTreeProcessor: update_metadata
    MerkleTreeProcessor->>MerkleTreeProcessor: traverse_and_update
    MerkleTreeProcessor-->>MetadataController: metadata updated
    MetadataController-->>Server: HTTP 200 OK
    Server-->>Client: Metadata Update Confirmed

Poem

🐰 Metadata's dance, a Merkle tree's song
Traversing nodes where changes belong
Bytes counted, versions aligned
With rabbit-like precision refined
Update complete, the repository gleams! 🌟


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (4)
src/lib/src/core/v0_19_0/entries.rs (3)

465-465: Simplify complex return type using type aliases

The return type of traverse_and_update is complex (Result<(HashMap<String, u64>, HashMap<String, u64>), OxenError>) and can be simplified by introducing type aliases for better readability.

Apply this diff to introduce type aliases and update the function signature:

+type DataTypeCounts = HashMap<String, u64>;
+type DataTypeSizes = HashMap<String, u64>;

 fn traverse_and_update(
     repo: &LocalRepository,
     node: &mut MerkleTreeNode,
     num_bytes: &mut u64,
-) -> Result<(HashMap<String, u64>, HashMap<String, u64>), OxenError> {
+) -> Result<(DataTypeCounts, DataTypeSizes), OxenError> {
🧰 Tools
🪛 GitHub Check: Clippy

[failure] 465-465:
very complex type used. Consider factoring parts into type definitions


522-522: Remove unnecessary casting to the same type

Casting file_node.num_bytes to u64 is unnecessary since it is already of type u64.

Apply this diff to remove the unnecessary cast:

-                .or_insert(0) += file_node.num_bytes as u64;
+                .or_insert(0) += file_node.num_bytes;
🧰 Tools
🪛 GitHub Check: Clippy

[failure] 522-522:
casting to the same type is unnecessary (u64 -> u64)


537-537: Use slices instead of &mut Vec for function parameters

Using &mut [MerkleTreeNode] instead of &mut Vec<MerkleTreeNode> makes the function more flexible and avoids unnecessary coupling to a specific collection type.

Apply this diff to update the function signature:

 fn process_children(
     repo: &LocalRepository,
-    children: &mut Vec<MerkleTreeNode>,
+    children: &mut [MerkleTreeNode],
     local_counts: &mut HashMap<String, u64>,
     local_sizes: &mut HashMap<String, u64>,
     num_bytes: &mut u64,
 ) -> Result<(), OxenError> {

Update any calls to process_children accordingly.

🧰 Tools
🪛 GitHub Check: Clippy

[failure] 537-537:
writing &mut Vec instead of &mut [_] involves a new object where a slice will do
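As a rough illustration of why the slice form is more flexible, the sketch below uses a hypothetical `double_all` function (not part of the PR) that mutates elements in place. A slice parameter is enough whenever the function never grows or shrinks the collection.

```rust
/// Doubles every element in place. Taking `&mut [u64]` rather than
/// `&mut Vec<u64>` lets callers pass vectors, arrays, or sub-slices.
pub fn double_all(values: &mut [u64]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}
```

A caller holding a `Vec<u64>` can still write `double_all(&mut v)`, since `&mut Vec<T>` coerces to `&mut [T]`; a function that needs `push` or `remove` would still have to take the `Vec` itself.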

src/lib/src/repositories/entries.rs (1)

97-97: Add documentation for the new public function.

The function lacks documentation comments explaining its purpose, parameters, and return value.

Add documentation like this:

+/// Updates the metadata (data type count and size) for a specific commit.
+/// 
+/// # Arguments
+/// 
+/// * `repo` - The local repository reference
+/// * `revision` - The commit revision to update metadata for
+/// 
+/// # Returns
+/// 
+/// * `Ok(())` if the metadata was successfully updated
+/// * `Err(OxenError)` if an error occurred or if the operation is not supported
 pub fn update_metadata(repo: &LocalRepository, revision: impl AsRef<str>) -> Result<(), OxenError> {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3433a80 and 1c5719b.

📒 Files selected for processing (4)
  • src/lib/src/core/v0_19_0/entries.rs (2 hunks)
  • src/lib/src/repositories/entries.rs (1 hunks)
  • src/server/src/controllers/metadata.rs (1 hunks)
  • src/server/src/services/meta.rs (1 hunks)
🧰 Additional context used
🪛 GitHub Check: Clippy
src/lib/src/core/v0_19_0/entries.rs

[failure] 465-465:
very complex type used. Consider factoring parts into type definitions


[failure] 522-522:
casting to the same type is unnecessary (u64 -> u64)


[failure] 537-537:
writing &mut Vec instead of &mut [_] involves a new object where a slice will do

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Test Suite Windows
  • GitHub Check: Test Suite MacOS
🔇 Additional comments (2)
src/server/src/services/meta.rs (1)

13-16: Addition of POST route for metadata update looks good

The new POST route for /{resource:.*} correctly maps to controllers::metadata::update_metadata and complements the existing GET routes without conflicts.

src/lib/src/repositories/entries.rs (1)

97-104: Add unit tests for the new update_metadata function.

The new function lacks test coverage. Consider adding tests for:

  • Successful metadata update for v0.19.0
  • Error handling for v0.10.0
  • Invalid revision handling

Would you like me to help generate the test cases for this new function?

Comment on lines 228 to 238
pub async fn update_metadata(req: HttpRequest) -> actix_web::Result<HttpResponse, OxenHttpError> {
    let app_data = app_data(&req)?;
    let namespace = path_param(&req, "namespace")?;
    let repo_name = path_param(&req, "repo_name")?;
    let repo = get_repo(&app_data.path, &namespace, &repo_name)?;
    let resource = parse_resource(&req, &repo)?;

    repositories::entries::update_metadata(&repo, resource.version.to_str().unwrap_or_default())?;

    Ok(HttpResponse::Ok().json(StatusMessage::resource_updated()))
}

⚠️ Potential issue

Handle potential missing resource version to prevent errors

On line 235, resource.version.to_str().unwrap_or_default() falls back to an empty string whenever to_str() returns None. update_metadata would then run against an empty revision, which could lead to unexpected behavior or errors. Consider validating that the version is present before proceeding.

Apply this diff to handle missing resource version:

         let repo = get_repo(&app_data.path, &namespace, &repo_name)?;
         let resource = parse_resource(&req, &repo)?;

+        let version_str = resource
+            .version
+            .to_str()
+            .ok_or(OxenHttpError::BadRequest("Missing resource version".into()))?;

-        repositories::entries::update_metadata(&repo, resource.version.to_str().unwrap_or_default())?;
+        repositories::entries::update_metadata(&repo, version_str)?;

This ensures that a meaningful error is returned if the version is missing, preventing potential issues.
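The same idea can be tried in isolation. The sketch below assumes the version is path-like (so that `to_str()` returns an `Option<&str>`), and uses a hypothetical `RequestError` enum in place of the server's `OxenHttpError`; `require_version` is likewise an illustrative name. It also rejects an empty string, which goes slightly beyond the diff above.

```rust
use std::path::Path;

/// Hypothetical error type for illustration; not the server's `OxenHttpError`.
#[derive(Debug, PartialEq)]
pub enum RequestError {
    BadRequest(String),
}

/// Returns the version as a non-empty `&str`, or a `BadRequest` error when
/// the path is not valid UTF-8 or is empty.
pub fn require_version(version: &Path) -> Result<&str, RequestError> {
    version
        .to_str()
        .filter(|s| !s.is_empty())
        .ok_or_else(|| RequestError::BadRequest("Missing resource version".to_string()))
}
```

Using `ok_or_else` rather than `ok_or` defers building the error string until it is needed, which is a minor design choice and not something the review requires.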


Comment on lines +97 to +104
pub fn update_metadata(repo: &LocalRepository, revision: impl AsRef<str>) -> Result<(), OxenError> {
    match repo.min_version() {
        MinOxenVersion::V0_10_0 => {
            panic!("update_metadata not implemented for oxen v0.10.0")
        }
        MinOxenVersion::V0_19_0 => core::v0_19_0::entries::update_metadata(repo, revision),
    }
}
⚠️ Potential issue

Replace panic with proper error handling for v0.10.0.

The function panics for v0.10.0, which is poor practice for a public API, especially one that already returns a Result. Consider returning an error instead.

Apply this diff to improve error handling:

 pub fn update_metadata(repo: &LocalRepository, revision: impl AsRef<str>) -> Result<(), OxenError> {
     match repo.min_version() {
         MinOxenVersion::V0_10_0 => {
-            panic!("update_metadata not implemented for oxen v0.10.0")
+            Err(OxenError::NotImplemented(
+                "update_metadata not implemented for oxen v0.10.0".to_string()
+            ))
         }
         MinOxenVersion::V0_19_0 => core::v0_19_0::entries::update_metadata(repo, revision),
     }
 }
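For readers who want to see the caller-side effect, here is a self-contained sketch of the same pattern. `Version` and `UpdateError` are hypothetical stand-ins for `MinOxenVersion` and `OxenError` (whether `OxenError::NotImplemented` exists is not confirmed here), and the supported branch simply returns `Ok(())` instead of doing real work.

```rust
/// Hypothetical stand-in for `MinOxenVersion`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Version {
    V0_10_0,
    V0_19_0,
}

/// Hypothetical stand-in for `OxenError`.
#[derive(Debug, PartialEq)]
pub enum UpdateError {
    NotImplemented(String),
}

/// Dispatches on the repository version and reports unsupported versions
/// as an error value rather than panicking.
pub fn update_metadata(version: Version, revision: &str) -> Result<(), UpdateError> {
    match version {
        Version::V0_10_0 => Err(UpdateError::NotImplemented(format!(
            "update_metadata not implemented for v0.10.0 (revision {revision})"
        ))),
        // Placeholder for the real v0.19.0 implementation.
        Version::V0_19_0 => Ok(()),
    }
}
```

With an error value, a server handler can map the failure to an HTTP response using `?`, whereas a panic would unwind the task handling the request.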
