@DevDendrite

Hi, I've done some research and found that 134 of the current miners were harmed (567 different responses): the validator marked some of their runs as "completed" while scoring era5, even though there were still lead times left to be scored. The cause is that the validator incorrectly assumes the miner is offline and marks its response in the database with status = "miner_offline" and error_message = "Miner offline during scoring".

I've used this query:

SELECT 
    miner_uid,
    COUNT(*) AS occurrence_count
FROM v_weather_miner_responses
WHERE status = 'miner_offline'
GROUP BY miner_uid;

Basically, this status is inserted by _cleanup_offline_miner_from_run, which is called during era5 scoring whenever the miner is registered and _request_fresh_token returns None.

_request_fresh_token is supposed to return token, full_zarr_url and manifest_content_hash, which are the result of a handshake and a kerchunk request in the query_single_miner function. For some reason this fails from time to time and returns None. That can only happen when:

  • status_code isn't 2xx
    • miner_communication.py:322
    • miner_communication.py:465
    • miner_communication.py:504
  • handshake fails
    • miner_communication.py:432
  • connection fails/times out
    • miner_communication.py:533
    • miner_communication.py:546
  • HTTP Client failed
    • miner_communication.py:571
  • except Exception (unexpected error)
    • miner_communication.py:496
    • miner_communication.py:560
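
Since all of the paths above collapse into the same None, the database row can't tell us which one was hit. Below is a rough sketch of how the failure category could be carried back to the caller instead. Everything in it (FailureReason, TokenOutcome, error_message_for) is a hypothetical name I made up for illustration, not something that exists in the validator code, and the categories are just the five groups from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureReason(Enum):
    """Hypothetical categories, mirroring the five groups listed above."""
    BAD_STATUS = "status_code isn't 2xx"
    HANDSHAKE_FAILED = "handshake failed"
    CONNECTION_ERROR = "connection failed or timed out"
    HTTP_CLIENT_ERROR = "HTTP client failed"
    UNEXPECTED = "unexpected error"


@dataclass
class TokenOutcome:
    """Hypothetical result type: either the three expected values or a failure reason."""
    token: Optional[str] = None
    full_zarr_url: Optional[str] = None
    manifest_content_hash: Optional[str] = None
    failure: Optional[FailureReason] = None
    detail: str = ""

    @property
    def ok(self) -> bool:
        # Success means no failure was recorded and a token is present.
        return self.failure is None and self.token is not None


def error_message_for(outcome: TokenOutcome) -> str:
    """Build a more specific error_message than a generic 'offline' string."""
    if outcome.ok:
        return ""
    reason = outcome.failure.value if outcome.failure else "unknown"
    suffix = f" ({outcome.detail})" if outcome.detail else ""
    return f"Token request failed: {reason}{suffix}"
```

With something like this, a 503 from the miner and a genuine connection timeout would end up with different error_message values, which would make the query above much more useful for diagnosis.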

In the miner logs I'm not seeing any difference between a successfully scored run and a failed one, which makes me think the miner is being incorrectly marked as offline.

Without validator logs I can't investigate this any further. In the meantime I can create a PR that adds retry attempts to this section of code, but I'd still like you to take a closer look at what the actual errors are for those miners, as the fix might be more direct than just a retry block.
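
To make the retry idea concrete, here is a minimal sketch of what I have in mind. It assumes the token request can be wrapped as an async callable that returns either a (token, full_zarr_url, manifest_content_hash) tuple or None. The name request_token_with_retries, the attempt count and the backoff delays are my own placeholders, not values taken from the validator.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

# (token, full_zarr_url, manifest_content_hash)
TokenResult = Tuple[str, str, str]


async def request_token_with_retries(
    request_fn: Callable[[], Awaitable[Optional[TokenResult]]],
    attempts: int = 3,        # assumed default, tune as needed
    base_delay: float = 1.0,  # seconds; doubled after each failed attempt
) -> Optional[TokenResult]:
    """Call request_fn up to `attempts` times and return the first non-None result."""
    for attempt in range(1, attempts + 1):
        try:
            result = await request_fn()
        except Exception:
            # Treat an unexpected exception like a None result and retry.
            result = None
        if result is not None:
            return result
        if attempt < attempts:
            # Exponential backoff between attempts: base, 2*base, 4*base, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    # Only after every attempt has failed would the miner be treated as offline.
    return None
```

The point is that _cleanup_offline_miner_from_run would only run after several consecutive failures, so a single transient error no longer ends the run. It doesn't explain why the request fails in the first place, which is why the validator logs still matter.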
