Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More debug messages when validating internal data structures. #2088

Closed
wants to merge 1 commit into from

Conversation

vdbergh
Copy link
Contributor

@vdbergh vdbergh commented Jun 30, 2024

This is an attempt to get more information about the mysterious bug

https://tests.stockfishchess.org/actions?action=log_message&user=&text=internal

which seems difficult to reproduce and logically impossible to explain.

This is an attempt to get more information about the
mysterious bug

https://tests.stockfishchess.org/actions?action=log_message&user=&text=internal

which seems difficult to reproduce and logically impossible
to explain.
@ppigazzini ppigazzini added bug server server side changes labels Jun 30, 2024
@ppigazzini
Copy link
Collaborator

PROD running with the PR.

@vdbergh
Copy link
Contributor Author

vdbergh commented Jun 30, 2024

I you are happy with this PR then it would be convenient if it were merged into PROD (it just contains some more refined exception handling so it should be harmless). Then I can rebase the PRs I am working on. I occasionally observe the bug in local testing but currently there is not enough information to search for the culprit.

@vdbergh
Copy link
Contributor Author

vdbergh commented Jun 30, 2024

Actually maybe it is better to wait. I think I know what is happening. Somewhere get_run is called with an ObjectId as argument instead of a string.

So I will add some more code that catches this and prints a stack trace so that we can see where the error originates.

@vdbergh
Copy link
Contributor Author

vdbergh commented Jun 30, 2024

Found it I think!

            run_id = request.rundb.new_run(**data)
            run = request.rundb.get_run(run_id)

The run_id returned by new_run is an ObjectId and not a string.

So this seems to be a genuine Fishtest bug discovered by vtjson....

@vdbergh
Copy link
Contributor Author

vdbergh commented Jul 1, 2024

So this seems to be a genuine Fishtest bug discovered by vtjson....

Ah no. It was a bug introduced in PR #1993. Master has r_id=str(r_id) in get_run(). This must have fallen by the wayside. Still... vtjson discovered the bug which otherwise would have gone unnoticed.

@vdbergh
Copy link
Contributor Author

vdbergh commented Jul 1, 2024

Closing this for now. Will reopen if another mysterious validation bug appears.

@vdbergh vdbergh closed this Jul 1, 2024
@vondele
Copy link
Member

vondele commented Jul 6, 2024

@vdbergh :

Jul 04 21:19:40 ns304014 pserve[52240]: The run object 6686f52a7092ade1206f788e does not validate: run['results_info'] is not in the schema
Jul 04 21:22:33 ns304014 pserve[52240]: The run object 6683542295b0d1e881e81a7e does not validate: run['results_info'] is not in the schema
Jul 04 21:24:52 ns304014 pserve[52240]: The run object 66840ea47a1863935cee428c does not validate: run['results_info'] is not in the schema
...
Jul 06 07:24:41 ns304014 pserve[67856]: The run object 6685df2c8c84573fdb764497 does not validate: run['results_info'] is not in the schema
Jul 06 07:28:38 ns304014 pserve[67856]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 07:37:43 ns304014 pserve[67856]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 07:46:57 ns304014 pserve[67856]: The run object 66833b9895b0d1e881e81489 does not validate: run['results_info'] is not in the schema
Jul 06 07:55:58 ns304014 pserve[67856]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 07:58:59 ns304014 pserve[67856]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 08:28:25 ns304014 pserve[67856]: The run object 66833b9895b0d1e881e81489 does not validate: run['results_info'] is not in the schema
Jul 06 08:28:55 ns304014 pserve[67856]: The run object 6683542295b0d1e881e81a7e does not validate: run['results_info'] is not in the schema
Jul 06 09:13:44 ns304014 pserve[67856]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 09:25:29 ns304014 pserve[67856]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 09:57:30 ns304014 pserve[72985]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 10:03:29 ns304014 pserve[72985]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 10:24:17 ns304014 pserve[72985]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 10:33:27 ns304014 pserve[72985]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 11:40:23 ns304014 pserve[75771]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema
Jul 06 12:28:32 ns304014 pserve[75771]: The run object 6640edea70e85e1f2ce321f9 does not validate: run['results_info'] is not in the schema
Jul 06 12:40:29 ns304014 pserve[75771]: The run object 66834db695b0d1e881e816a1 does not validate: run['results_info'] is not in the schema

seems to have started on July 4th

@vondele vondele reopened this Jul 6, 2024
@vdbergh
Copy link
Contributor Author

vdbergh commented Jul 6, 2024

@vondele

Yes this is normal. The results widget is no longer stored in the run, but computed on the fly. However old runs still have it and the validation code notices this. Probably the message in the log should indicate that these are runs with an older version so the error is not really an error but expected.

@ppigazzini said he might write a script to clean the db.

@vondele vondele closed this Jul 6, 2024
@ppigazzini
Copy link
Collaborator

ppigazzini commented Jul 6, 2024

I have already written the script, but on DEV 29 documents cannot be cleaned, I'm investigating what it's wrong.

import time
from fishtest.rundb import RunDb

def remove_unwanted_keys(rundb):
    result = rundb.runs.update_many({}, {"$unset": {"results_info": ""}})
    print(f"Matched {result.matched_count} documents and modified {result.modified_count} documents.")

def count_keys(rundb):
    results_info_count = rundb.runs.count_documents({"results_info": {"$exists": True}})
    print(f"Number of documents with 'results_info': {results_info_count}")

def print_all_results_info(rundb):
    runs_with_results_info = rundb.runs.find({"results_info": {"$exists": True}})
    for run in runs_with_results_info:
        print(run.get("results_info"))

def main():
    rundb = RunDb()
    t0 = time.time()
    remove_unwanted_keys(rundb)
    print(time.time() - t0)
    t0 = time.time()
    count_keys(rundb)
    print(time.time() - t0)
    print_all_results_info(rundb)

if __name__ == "__main__":
    main()

@ppigazzini
Copy link
Collaborator

$ ${VENV}/bin/python3 db_results_info.py
Matched 160132 documents and modified 29 documents.
77.37627863883972
Number of documents with 'results_info': 29
63.11245822906494
{'style': '', 'info': ['LLR: -2.24 (-2.94,2.94) <0.50,2.50>', 'Total: 112112 W: 28258 L: 28214 D: 55640', 'Ptnml(0-2): 61, 12006, 31864, 12078, 47']}
{'style': '', 'info': ['49744/100000 iterations', '99488/200000 games played']}
{'style': '', 'info': ['LLR: -0.66 (-2.94,2.94) <-1.75,0.25>', 'Total: 624544 W: 160432 L: 161172 D: 302940', 'Ptnml(0-2): 1364, 74663, 161073, 73693, 1479']}
{'style': '', 'info': ['24481/30000 iterations', '48962/60000 games played']}
{'style': '', 'info': ['3677/100000 iterations', '7354/200000 games played']}
{'style': '', 'info': ['LLR: 1.14 (-2.94,2.94) <0.50,2.50>', 'Total: 37782 W: 9705 L: 9531 D: 18546', 'Ptnml(0-2): 57, 4083, 10436, 4259, 56']}
{'style': '', 'info': ['LLR: -1.11 (-2.94,2.94) <0.00,2.00>', 'Total: 45664 W: 11756 L: 11787 D: 22121', 'Ptnml(0-2): 129, 5397, 11780, 5428, 98']}
{'style': '', 'info': ['LLR: -0.21 (-2.94,2.94) <0.50,2.50>', 'Total: 16122 W: 4084 L: 4068 D: 7970', 'Ptnml(0-2): 33, 1777, 4421, 1801, 29']}
{'style': '', 'info': ['LLR: 0.32 (-2.94,2.94) <0.00,2.00>', 'Total: 2382 W: 592 L: 563 D: 1227', 'Ptnml(0-2): 0, 236, 690, 265, 0']}
{'style': '', 'info': ['LLR: -0.25 (-2.94,2.94) <0.00,2.00>', 'Total: 2784 W: 730 L: 749 D: 1305', 'Ptnml(0-2): 18, 346, 681, 331, 16']}
{'style': '', 'info': ['LLR: 0.29 (-2.94,2.94) <0.50,2.50>', 'Total: 8610 W: 2208 L: 2166 D: 4236', 'Ptnml(0-2): 7, 952, 2347, 990, 9']}
{'style': '', 'info': ['LLR: 0.69 (-2.94,2.94) <0.00,2.00>', 'Total: 24704 W: 6489 L: 6390 D: 11825', 'Ptnml(0-2): 85, 2936, 6204, 3049, 78']}
{'style': '', 'info': ['LLR: 0.13 (-2.94,2.94) <-1.75,0.25>', 'Total: 8416 W: 2186 L: 2184 D: 4046', 'Ptnml(0-2): 33, 982, 2168, 1000, 25']}
{'style': '', 'info': ['LLR: -0.41 (-2.94,2.94) <0.00,2.00>', 'Total: 896 W: 200 L: 232 D: 464', 'Ptnml(0-2): 0, 112, 256, 80, 0']}
{'style': '', 'info': ['LLR: -0.16 (-2.94,2.94) <0.00,2.00>', 'Total: 634 W: 147 L: 159 D: 328', 'Ptnml(0-2): 0, 75, 179, 63, 0']}
{'style': '', 'info': ['LLR: 0.20 (-2.94,2.94) <0.00,2.00>', 'Total: 3680 W: 969 L: 945 D: 1766', 'Ptnml(0-2): 16, 429, 924, 457, 14']}
{'style': '', 'info': ['LLR: -0.36 (-2.94,2.94) <0.00,2.00>', 'Total: 2016 W: 524 L: 553 D: 939', 'Ptnml(0-2): 6, 241, 547, 204, 10']}
{'style': '', 'info': ['LLR: -2.02 (-2.94,2.94) <0.00,2.00>', 'Total: 16224 W: 4174 L: 4331 D: 7719', 'Ptnml(0-2): 57, 2010, 4124, 1875, 46']}
{'style': '#44EB44', 'info': ['Elo: 41.08 ± 3.2 (95%) LOS: 100.0%', 'Total: 10842 W: 3479 L: 2203 D: 5160', 'Ptnml(0-2): 11, 710, 2733, 1926, 41']}
{'style': '#44EB44', 'info': ['Elo: 37.22 ± 4.2 (95%) LOS: 100.0%', 'Total: 5266 W: 1684 L: 1122 D: 2460', 'Ptnml(0-2): 0, 280, 1515, 834, 4']}
{'style': '', 'info': ['LLR: 0.11 (-2.94,2.94) <0.00,2.00>', 'Total: 64 W: 19 L: 8 D: 37', 'Ptnml(0-2): 0, 4, 13, 15, 0']}
{'style': '', 'info': ['LLR: -1.21 (-2.94,2.94) <-1.75,0.25>', 'Total: 6656 W: 1658 L: 1776 D: 3222', 'Ptnml(0-2): 34, 863, 1638, 773, 20']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}
{'style': '', 'info': ['LLR: 0.00 (-2.94,2.94) <0.00,2.00>', 'Total: 0 W: 0 L: 0 D: 0', 'Ptnml(0-2): 0, 0, 0, 0, 0']}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug server server side changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants