Background - how does health-check work
How does the health-check work? It runs every minute by default. Mirrorbits sends an HTTP HEAD request for a random file served by the mirror. If the request succeeds, the mirror is marked as up; otherwise it is marked as down.
To go a bit more in depth: Mirrorbits picks the file from the hash HANDLEDFILES_<mirror-id>. This hash contains the files that are 1) on the mirror and 2) in the local repo (the source, in mirrorbits-speak), so it's the intersection of these two sets. It means that the HANDLEDFILES hash doesn't contain extra files that would be on the mirror but not in the source (if any), and doesn't contain files that are in the source but not on the mirror (if any). So it should be pretty good at picking a suitable file.
The value of HANDLEDFILES_<mirror-id> is updated every time a mirror scan completes.
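To make the selection logic concrete, here is a minimal sketch in illustrative Python (not mirrorbits code). The function names, the `head_request` callable and the "2xx means success" rule are assumptions made for the example; only the intersection and the random pick come from the description above.

```python
import random

def handled_files(mirror_files, source_files):
    """Files present both on the mirror and in the source (set intersection)."""
    return set(mirror_files) & set(source_files)

def health_check(handled, head_request, rng=random):
    """Pick one random handled file; the mirror is up only if HEAD succeeds.

    `head_request` is a hypothetical callable that returns an HTTP status code.
    Treating any 2xx status as success is an assumption of this sketch.
    """
    path = rng.choice(sorted(handled))
    status = head_request(path)
    return 200 <= status < 300

# The mirror has an extra file; the source has a file the mirror lacks.
mirror = {"a.iso", "b.iso", "extra.txt"}
source = {"a.iso", "b.iso", "new.iso"}
handled = handled_files(mirror, source)

def fake_head(path):
    # Stand-in for a real HTTP HEAD request against the mirror.
    return 200 if path in mirror else 404

print(sorted(handled))
print(health_check(handled, fake_head))
```

As long as the handled set matches what is really on the mirror, every pick succeeds, whichever file is chosen.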
Issue - the theory
The issue lies with that last point: HANDLEDFILES is only refreshed when a mirror scan completes. Every time a mirror is updated, HANDLEDFILES is outdated until the mirror is scanned again. Assuming mirrors are scanned every hour, there's a window of at most one hour during which HANDLEDFILES contains files that might not be on the mirror anymore. If the health-check picks one of those files, the mirror returns 404 and is marked as down. If the health-check picks a file that is still on the mirror, all good, the mirror is up. Assuming the health-check runs every minute, we have a one-hour window during which mirrors might appear "flaky", going up and down every minute.
Issue - in practice
Is it really an issue? Well, it depends on how many files disappear when the repo is updated, compared to the total number of files.
Let's look at the Kali Linux images. To say it in words:
- in the Kali Linux image repo, there are around 175 files
- 50% are weekly images, and we have two weeks of weekly images
- so 25% of the files are for the current weekly image, and 25% are for the last weekly image
Once a week, this repo is updated: a new weekly image is added, and the old weekly image is removed. Meaning: once a week, when the repo is updated, 25% of the files in the repo disappear.
For the health-check, it means that, during a one hour window, it has 1 chance out of 4 to pick a file that is not on the mirror anymore, and to mark the mirror down.
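The arithmetic behind that can be worked through quickly. This is a rough worst-case sketch: it assumes the stale window lasts the full hour, that there is one check per minute, and that each check picks independently and uniformly at random. The variable names are made up for the example.

```python
stale_fraction = 0.25    # share of HANDLEDFILES entries no longer on the mirror
checks_in_window = 60    # one check per minute over a one-hour scan interval

# Average number of checks that hit a removed file during the window.
expected_failed_checks = stale_fraction * checks_in_window

# Probability that at least one check in the window hits a removed file.
p_at_least_one_failure = 1 - (1 - stale_fraction) ** checks_in_window

print(f"expected failed checks: {expected_failed_checks:.0f}")
print(f"P(at least one failed check): {p_at_least_one_failure}")
```

Under these assumptions a mirror fails about 15 of the 60 checks on average, and it is practically certain to be marked down at least once during the window.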
So, once a week, during a one-hour window, the mirrors seem flaky, going up and down from mirrorbits' point of view. We can see it in this graph, which shows around 10 days of data and checks the availability of an image in the repo. We can clearly see the two moments when the repo was updated with a new weekly image, causing mirrors to be marked up/down by mirrorbits.
Mitigation and possible improvements
The easy mitigation for a mirrorbits user is just to reduce the scan interval (e.g. to 30 minutes). It works for Kali Linux images: there are only 175 files in this repo, so scanning is quick and it's OK to reduce the scan interval.
I think this issue could be mitigated in mirrorbits as well. Here are a few ideas:
- easy to implement: add a "404 counter" and mark a mirror down only after it has returned 404 X times in a row. Might want to expose this setting in the config file.
- harder, and maybe not better: limit the health-check to files in a certain location of the repo, and expose this setting in the config file. Implementation-wise, it's awkward, as mirrorbits just picks a random file (efficient). Picking a random file within a directory might be less efficient.
- limit the health-check to one particular file only. Might even make more sense than picking a random file, for some users?
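The first idea, the "404 counter", could look roughly like the sketch below. It's illustrative Python, not a patch for mirrorbits; the class name, the `threshold` parameter and its default of 3 are assumptions, standing in for the "X times in a row" setting.

```python
class FailureCounter:
    """Mark a mirror down only after `threshold` consecutive failed checks."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0
        self.up = True

    def record(self, check_ok):
        """Record one health-check result and return the mirror state."""
        if check_ok:
            # Any successful check resets the counter and brings the mirror up.
            self.consecutive_failures = 0
            self.up = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.up = False
        return self.up

counter = FailureCounter(threshold=3)
results = [True, False, False, True, False, False, False]
states = [counter.record(ok) for ok in results]
print(states)
```

In this run the mirror stays up through the isolated failures and only goes down on the third failure in a row. With 1 chance out of 4 of picking a removed file, three bad picks in a row have probability 0.25^3, about 1.6%, so a counter reduces the flapping but doesn't rule it out entirely.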
elboulangero changed the title from "Health-Check might check for non-existing files, could be improved" to "Health-Check might check for non-existing files, marking mirror down by mistake. Could be improved" on Nov 2, 2023
imo limiting it to a single file is good enough (similar to TraceFileLocation config-wise; even the same file could be used).
Another heuristic would be to check the "newest" file each mirror has according to the last scan, assuming only old files get removed. But that would fail if files constantly get added and removed again.
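A rough sketch of the "newest file" heuristic, in illustrative Python. The `last_scan` mapping (path to modification time), the file names and the timestamps are all made up for the example; how mirrorbits actually stores scan results is not covered here.

```python
def newest_file(last_scan):
    """Return the path with the most recent modification time.

    `last_scan` is a hypothetical mapping of path -> modification time
    (Unix timestamp) as recorded by the last scan of the mirror.
    """
    return max(last_scan, key=last_scan.get)

# Hypothetical scan result: two weekly images and one older release image.
scan = {
    "weekly/image-W43.iso": 1698000000,
    "weekly/image-W44.iso": 1698600000,
    "release/image-2023.3.iso": 1692000000,
}
target = newest_file(scan)
print(target)
```

Here the check would target the most recent weekly image, which is the one least likely to have been rotated out since the last scan.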