Caching functionality for a function in TIP (SBO) #1859

Open
Konohana0608 opened this issue Mar 10, 2025 · 0 comments
Labels
PO issue Created by Product owners

Comments

Contributor

Konohana0608 commented Mar 10, 2025

User Story
As a TIP user, I want to utilize a shared cache for function results to significantly reduce execution time. The cache should be accessible across all users and all deployments within our TIP user base.

Objective
The goal is to implement a shared caching mechanism for a computationally expensive function.

  • Each computed dataset sample (i.e., an output generated by the function) is approximately 500 KB in size.
  • The system will handle several thousand computed samples.
  • Samples are uniquely indexed using model name, target tissue, and other key parameters.
  • When a user computes a dataset sample for the first time, it should be stored in the shared cache, allowing future users to retrieve it instantly instead of recomputing it.
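To make the indexing and the store-on-first-compute behaviour concrete, here is a minimal sketch rather than a proposed design. It assumes the indexing parameters are JSON-serializable and that each sample can be handled as bytes. The names `make_cache_key` and `get_or_compute`, and the plain dict standing in for the shared store, are hypothetical and not part of TIP.

```python
import hashlib
import json
from typing import Any, Callable, MutableMapping


def make_cache_key(model_name: str, target_tissue: str, **params: Any) -> str:
    """Build a deterministic key from the indexing parameters (hypothetical helper)."""
    payload = {
        "model_name": model_name,
        "target_tissue": target_tissue,
        "params": params,  # assumption: values are JSON-serializable
    }
    # sort_keys makes the key independent of keyword-argument order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_or_compute(
    store: MutableMapping[str, bytes],
    key: str,
    compute: Callable[[], bytes],
) -> bytes:
    """Return the cached sample for `key`, computing and storing it on a miss."""
    if key in store:
        return store[key]
    result = compute()
    store[key] = result
    return result
```

In a real deployment the dict would be replaced by whatever shared backend is chosen; the key derivation stays the same as long as every deployment canonicalizes the parameters identically.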

Expected Benefit
The most time-consuming process in our surrogate-based optimization workflow is evaluating training dataset samples to build the model. By caching these computed samples:

  • The first user to compute a specific sample will store it in the cache.
  • Subsequent users needing the same sample can retrieve it instantly, avoiding redundant computations.

This will lead to significant time savings across all users.

Technical Considerations

  • The cache should be shared globally across all users and deployments.
  • The caching mechanism should ensure data integrity and uniqueness based on the defined indexing criteria.
  • The system should handle several thousand cached samples efficiently; at roughly 500 KB each, that is on the order of a few gigabytes in total.
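As one illustration of the integrity requirement, the sketch below stores each sample with a checksum and writes it atomically, so a reader never sees a half-written entry. It assumes a directory on storage that all deployments can reach and a filesystem-safe key such as a hex digest. The class name `FileSampleCache` and the on-disk layout are hypothetical; a database or object store could fill the same role.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Optional


class FileSampleCache:
    """Checksum-verified cache in a directory (assumed to be shared storage)."""

    def __init__(self, root) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Assumption: key is filesystem-safe, e.g. a hex digest.
        return self.root / f"{key}.bin"

    def put(self, key: str, data: bytes) -> None:
        # Prefix the payload with its SHA-256 digest so reads can detect corruption.
        blob = hashlib.sha256(data).digest() + data
        fd, tmp_name = tempfile.mkstemp(dir=self.root)
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(blob)
            # Rename into place; on a POSIX filesystem this replacement is atomic.
            os.replace(tmp_name, self._path(key))
        except BaseException:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise

    def get(self, key: str) -> Optional[bytes]:
        try:
            blob = self._path(key).read_bytes()
        except FileNotFoundError:
            return None
        digest, data = blob[:32], blob[32:]
        if hashlib.sha256(data).digest() != digest:
            return None  # treat a corrupted entry as a cache miss
        return data
```

If two users compute the same sample at the same time, both writes succeed and the later rename wins; because the function is deterministic for a given key, either copy is valid. Atomic-rename guarantees can be weaker on some network filesystems, so this would need checking against the actual storage.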

Would love to hear thoughts on potential caching solutions and best practices for implementation!
