-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a Time To Live (TTL) to the inventory cache #605
Add a Time To Live (TTL) to the inventory cache #605
Conversation
0347c18
to
8410442
Compare
Thanks for the changes. Very nice PR description, tests and notes. Great that you brought up Point 3 because that was something I wanted to understand. Whether it is worth supporting a similar optimization for TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY-based source is certainly up for debate (++), but dropping support because this source is more transient (not configured in a file) may not be the only option.
q: do we have concerns that it is not easy to tell which dir in ~/.cache/tanzu/plugin_inventory the test_only one points to? Just some thoughts, but going back to (++), it is possible we feel it is not worth going this length to support a non-production use-case. |
8410442
to
aea61ff
Compare
Following @vuil suggestion, this latest forced push provides the TTL for the inventory cache without needing to use the configuration file. I will now work at enabling the TTL for plugin sources added through |
Marking as draft until it is ready again |
6211f12
to
4670ef9
Compare
This is ready for review. The PR description has been updated. |
4670ef9
to
e5728b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this really nice bit of optimization. Love the tests as well. LGTM.
Most users likely would not need to do anything to take advantage of the change, but for completeness, I'd suggest a few bits of bookkeeping
- Add
docs-impact
label to the PR. - Quick mentions of TANZU_CLI_PLUGIN_DB_CACHE_TTL_SECONDS and
tanzu source init
in the release notes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thanks for all the details as part of the description which made reviewing this change easier. 👍
e5728b8
to
419d438
Compare
Normally, the digest of the plugin inventory is checked every time the DB needs to be read. Although this is faster than downloading the DB each time, there is still between a 2.5s to 4.5s delay in checking the digest. This makes every `plugin search` and `plugin group search` command slower. It also makes some `plugin install --group` commands much slower when all plugins are available in the cache, since the installation of each plugin in the group causes a digest check, even if the plugin binary is already in the cache. This commit provides a type of Time To Live for the DB. This means that when within the TTL, the digest is not checked and the DB is considered valid. The time the digest was last checked is stored as the modification time (mtime) of the digest file. So, whenever the DB needs to be read, if the TTL has not expired since the last time the digest was verified, the DB is directly read from cache; if the TTL has expired, the digest is checked and the DB downloaded if required. On a "plugin source update" or "plugin source init" the TTL is ignored and the digest automatically checked. This is important as either of these commands usually modify the URI of the plugin discovery and therefore invalidates the DB. Note that for any discovery source added through the TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY variable, the digest is checked every time (this TTL feature does not apply); this is because there is no way to know if current cache was downloaded from the same URIs as what is currently in the variable. This is different than for "plugin source update/init" because these two commands can actively force a cache refresh but changing the TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY variable cannot do that. The value of the cache TTL is of 30 minutes. This means that it can take a CLI up to 30 minutes to notice the publication of new plugins in the central repository. If for some reason a user wants to force a refresh immediately, they can simply do `tanzu plugin source init` (or `tanzu plugin source update default -u ...` if the discovery source is not the default central repository). The TTL value can be overriden using the environment variable TANZU_CLI_PLUGIN_DB_DIGEST_TTL_SECONDS. Signed-off-by: Marc Khouzam <[email protected]>
To allow the CLI to know if a plugin inventory cache has become invalid because it represents a different URI, we now store the OCI image URI inside the main digest file. Whenever the TTL is checked to know if the digest must be checked, the URI is also checked; if the URI has changed, the digest must be checked. Signed-off-by: Marc Khouzam <[email protected]>
Signed-off-by: Marc Khouzam <[email protected]>
419d438
to
2250229
Compare
The 3 commits are kept separated just for the review until we agree on the approach.
Once we agree I will squash and update the commit message accordingly.
What this PR does / why we need it
Normally, the digest of the plugin inventory is checked every time the DB needs to be read. Although this is faster than downloading the DB each time, there is still between a 2.5s to 4.5s delay in checking the digest. This makes every
plugin search
andplugin group search
command slower. It also makes someplugin install --group
andplugin sync
commands much slower when all plugins are available in the cache, because the installation of each plugin in the group causes a digest check, even if the plugin binary is already in the cache; for example with a group of 10 plugins already the cache, the installation can take 30 seconds because of the 10 digest checks instead of 5 seconds.This PR adds a Time To Live for the DB cache. This means that when within the TTL, the digest is not checked and the DB cache is considered valid. The time the digest was last checked is stored as the modification time of the
digest.*
file of the discovery cache. Whenever the DB needs to be read, if the TTL has not expired since the last time the digest was verified, the DB is directly read from cache; if the TTL has expired, the digest is checked, the TTL reset and the DB downloaded if required.On a
plugin source update
orplugin source init
the DB cache is automatically refreshed ignoring the TTL. This provides a way to force an immediate refresh of the cache without having to wait for the TTL to expire.Note that for discovery sources added through the
TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY
we needed a way to know for what URI the cache was created. This was tricky because such test discovery sources are not stored in the configuration file (which would have allowed to map a name to a URI). To support this case, commit "Store URI in the main plugin inventory digest file" starts storing the URI directly in the cache as a single line within the maindigest.*
file. This provides a cache-centric way of knowing if a DB must be refreshed when a discovery OCI image URI changes.Finally, the default value of the digest TTL is set to 30 minutes in this PR. It can be overridden using the new variable
TANZU_CLI_PLUGIN_DB_CACHE_TTL_SECONDS
(meant for testing). A cache TTL of 30 minutes implies that by default it can take a CLI up to 30 minutes to notice the publication of new plugins in the central repository. If for some reason a user wants to force a refresh immediately, they can simply dotanzu plugin source init
ortanzu plugin source update ...
to the current URI.Which issue(s) this PR fixes
Fixes # N/A
Describe testing done for PR
Unit tests were added as well as E2E tests.
Manual tests . Most commands below are run using
time
which allows to show the commands are very fast (< 1 sec) thanks to this PR. When a command takes 2 seconds, it means the digest is being refreshed.First the speedup:
Now check correctness when updating the plugin source:
Now test TTL expiry:
Now test with TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY:
Test correctness for TANZU_CLI_ADDITIONAL_PLUGIN_DISCOVERY_IMAGES_TEST_ONLY:
Release note
Additional information
Special notes for your reviewer
Point 1
The event that causes the plugin inventory DB to change is the publication of new plugin versions or plugin group versions. It is important to realize that such publications do not actually need to be detected immediately by CLIs in the field; what is important is that once a new plugin version is available, it becomes accessible in a reasonable amount of time. In fact, publication of plugins is already subject to mirroring delays to CDNs. Furthermore, in normal operation, end-users do not perform plugin life-cycle commands very often so don't need the plugin cache to immediately be up-to-date.
With this in mind, it is acceptable that the enhancement of the PR makes it possible for a CLI to not notice new plugins for up to 30 minutes.
However, we will need to communicate this to plugin publishing teams so they understand that they may not see their plugins immediately after publishing them but that they can do a
tanzu plugin source init/update
to work around it.Point 2
I chose 30 minutes as the TTL, but it could just as much be a different value (5 minutes, 24 hours, etc). Opinions are welcomed.
Point 3
Now that
plugin search
andplugin group search
are very fast, it is a noticeable different when the TTL has expired and they take 2 seconds to check the digest. I chose to add a printout when the digest is check. This printout would be seen at most every TTL (30 minutes):If the DB has actually changed, it would look like this:
Suggestions on the wording, or if the extra printout is necessary are welcomed.