An original idea was for historical API docs to use the asset from latest if the blob is bit-for-bit identical. However, there is an edge case: if a new version of latest removes the blob, all the historical docs then point to an asset that no longer exists.
Instead, we could use a folder `public/images/api/qiskit/common`. If a blob appears in more than one version, we store the blob in `/common`.
There is a risk that the same blob filename has multiple versions over time, e.g. version A is in Qiskit 0.19-1.1, then version B is in Qiskit 1.2-1.3+. So, we should probably put something in the file name as a suffix, like the number of bytes or a hash.
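A minimal sketch of the hash-suffix idea, written in Python purely for illustration. The function name `common_blob_name`, the 8-character digest length, and the `name.<hash>.ext` layout are all assumptions, not something already decided in this issue.

```python
import hashlib
from pathlib import Path


def common_blob_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Build a file name for /common that embeds a short content hash.

    Hypothetical helper: 'depth.gif' becomes 'depth.<hash>.gif', so two
    different blobs that share a file name never collide.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = Path(filename)
    return f"{path.stem}.{digest}{path.suffix}"


# Same file name, different bytes: the names in /common differ.
version_a = common_blob_name("depth.gif", b"bytes of version A")
version_b = common_blob_name("depth.gif", b"bytes of version B")
```

A hash is probably safer than the number of bytes, since two different images can happen to have the same size.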
Be careful that the algorithm doesn't slow down `gen-api` too much. To determine whether an image has a duplicate, we need to inspect every other API version, including the new version we're currently generating.
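One way to keep the scan cheap, sketched in Python under the assumption that every API version's images live in their own subfolder of one root: group files by size first (a `stat()` call, no file read), then hash only files whose size matches another file. `find_duplicate_images` is a hypothetical name, and the folder layout in the demo is made up.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_images(api_images_root: Path) -> dict[str, list[Path]]:
    """Map SHA-256 digest -> files with identical content (two or more)."""
    # Pass 1: group by byte size, which does not require reading the file.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in sorted(api_images_root.rglob("*")):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[hashlib.sha256(path.read_bytes()).hexdigest()].append(path)

    return {d: ps for d, ps in by_digest.items() if len(ps) > 1}


# Demo on a throwaway folder tree (the version names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for version, content in [("1.0", b"AAAA"), ("1.1", b"AAAA"), ("1.2", b"BBBB")]:
        (root / version).mkdir()
        (root / version / "depth.gif").write_bytes(content)
    (root / "1.2" / "other.png").write_bytes(b"unique size")
    duplicates = find_duplicate_images(root)
    duplicate_groups = [
        sorted(p.relative_to(root).as_posix() for p in paths)
        for paths in duplicates.values()
    ]
```

In the demo only the two byte-identical `depth.gif` files end up grouped; `other.png` is never hashed because no other file has its size.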
Ideally we can do the de-duplication as part of `gen-api`, rather than as a standalone process we sometimes run manually as post-processing. For Git repo size, we need to avoid introducing the binary at all, because once a blob is saved to Git it is there forever unless we force push.
If we set up a new de-duplication, we need to remember to rewrite the link in the historical API version that now uses a common blob.
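The link rewrite could look roughly like this Python sketch. It assumes pages reference images by URL paths such as `/images/api/qiskit/1.1/depth.gif`; that URL scheme, the hashed file name, and the helper name `rewrite_image_links` are assumptions for illustration only.

```python
def rewrite_image_links(page_text: str, moved: dict[str, str]) -> str:
    """Replace each old image URL with its new location.

    `moved` maps old URL -> new URL. This is an exact substring
    replacement, so the keys should be full image paths.
    """
    for old_url, new_url in moved.items():
        page_text = page_text.replace(old_url, new_url)
    return page_text


# Hypothetical page content and URLs.
page = "![depth](/images/api/qiskit/1.1/depth.gif)\n"
moved = {
    "/images/api/qiskit/1.1/depth.gif": "/images/api/qiskit/common/depth.3f2a9c1b.gif",
}
updated = rewrite_image_links(page, moved)
```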
Another approach to the folders is to have newer versions reference images in older versions of Qiskit. E.g., if we add v2.0 and an image is unchanged from v1.4, then v2.0 just points to the v1.4 image, otherwise it adds the new image to the v2.0 folder.
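A rough Python sketch of this approach, assuming one image folder per version. The name `resolve_image` and the folder layout are hypothetical. It checks all earlier versions rather than only the previous one, because a version that itself reuses an older image has no copy in its own folder.

```python
import tempfile
from pathlib import Path


def resolve_image(new_image: Path, earlier_version_dirs: list[Path], new_version_dir: Path) -> Path:
    """Return the path the new version's docs should link to.

    If an earlier version holds a byte-identical file with the same name,
    reuse it and write nothing. Otherwise copy the image into the new
    version's folder.
    """
    new_bytes = new_image.read_bytes()
    for version_dir in earlier_version_dirs:
        candidate = version_dir / new_image.name
        if candidate.is_file() and candidate.read_bytes() == new_bytes:
            return candidate
    target = new_version_dir / new_image.name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(new_bytes)
    return target


# Demo on a throwaway folder tree (the version names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "1.3").mkdir()
    (root / "1.3" / "depth.gif").write_bytes(b"same bytes")
    (root / "1.4").mkdir()  # 1.4 reuses the 1.3 image, so it has no copy
    generated = root / "generated"
    generated.mkdir()
    (generated / "depth.gif").write_bytes(b"same bytes")
    (generated / "new_plot.png").write_bytes(b"only in 2.0")

    earlier = [root / "1.4", root / "1.3"]  # newest first
    reused = resolve_image(generated / "depth.gif", earlier, root / "2.0")
    added = resolve_image(generated / "new_plot.png", earlier, root / "2.0")
    reused_rel = reused.relative_to(root).as_posix()
    added_rel = added.relative_to(root).as_posix()
    v2_files = sorted(p.name for p in (root / "2.0").iterdir())
```

Because historical folders are never modified, no existing links need rewriting with this layout.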
That's a really good idea because we wouldn't expect images to change in historical API versions. Great suggestion!
When implementing, I suspect Frank's suggestion will be simplest. However, we should evaluate both options and use whatever is the simplest/most maintainable.
Update Jan 10, 2025: research if this actually does help, per Jake's suggestion in #2533 (comment)
Our Git repo size is very large. This is mostly from blobs, like our images and videos.
It appears some of the images are identical, such as https://github.com/Qiskit/documentation/blob/main/public/images/api/qiskit/depth.gif. So, it's very inefficient for us to duplicate the same image ~25 times.