multiple nodes crashed #16397

Open
2 tasks done
deepthiskumar opened this issue Dec 4, 2024 · 0 comments
Labels
bug, mina-node, triage


Preliminary Checks

Description

Multiple synced nodes crashed with the following exception

Dec 04 17:29:11 sp1004031 mina[431176]:   exn: {
Dec 04 17:29:11 sp1004031 mina[431176]:   "sexp": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "monitor.ml.Error",
Dec 04 17:29:11 sp1004031 mina[431176]:     [ "Not_found_s", [ "Map.find_exn: not found", [ "5010000", "1" ] ] ],
Dec 04 17:29:11 sp1004031 mina[431176]:     [
Dec 04 17:29:11 sp1004031 mina[431176]:       "Raised at Base__Map.Tree0.find_exn.if_not_found in file \"src/map.ml\", line 519, characters 6-84",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Base__Map.Accessors.find_exn in file \"src/map.ml\" (inlined), line 1681, characters 4-118",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Network_pool__Map_set.remove_exn in file \"src/lib/network_pool/map_set.ml\", line 12, characters 15-33",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Network_pool__Indexed_pool.remove_applicable_exn in file \"src/lib/network_pool/indexed_pool.ml\", line 325, characters 24-77",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Network_pool__Indexed_pool.revalidate.(fun) in file \"src/lib/network_pool/indexed_pool.ml\", line 795, characters 23-54",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Base__Map.Tree0.fold in file \"src/map.ml\", line 838, characters 46-87",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Base__Map.Tree0.fold in file \"src/map.ml\", line 838, characters 64-86",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Network_pool__Transaction_pool.Make0.Resource_pool.handle_transition_frontier_diff in file \"src/lib/network_pool/transaction_pool.ml\", line 676, characters 8-104",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Base__Result.try_with in file \"src/result.ml\", line 195, characters 9-15",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Caught by monitor coda"
Dec 04 17:29:11 sp1004031 mina[431176]:     ]
Dec 04 17:29:11 sp1004031 mina[431176]:   ],
Dec 04 17:29:11 sp1004031 mina[431176]:   "backtrace": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "Raised at Stdlib__String.index_rec in file \"string.ml\", line 128, characters 19-34",
Dec 04 17:29:11 sp1004031 mina[431176]:     "Called from Sexplib0__Sexp.Printing.index_of_newline in file \"src/sexp.ml\", line 113, characters 13-47"
Dec 04 17:29:11 sp1004031 mina[431176]:   ]
Dec 04 17:29:11 sp1004031 mina[431176]: }
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Fatal] Unhandled top-level exception: $exn
Dec 04 17:29:11 sp1004031 mina[431176]: Generating crash report
Dec 04 17:29:11 sp1004031 mina[431176]:   exn: {
Dec 04 17:29:11 sp1004031 mina[431176]:   "sexp": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "monitor.ml.Error",
Dec 04 17:29:11 sp1004031 mina[431176]:     [
Dec 04 17:29:11 sp1004031 mina[431176]:       "Failure",
Dec 04 17:29:11 sp1004031 mina[431176]:       "sync timing task `processing_transaction_pool_transition_frontier_diffs` failed, exception reported to parent monitor"
Dec 04 17:29:11 sp1004031 mina[431176]:     ],
Dec 04 17:29:11 sp1004031 mina[431176]:     [
Dec 04 17:29:11 sp1004031 mina[431176]:       "Raised at Stdlib.failwith in file \"stdlib.ml\", line 29, characters 17-33",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from O1trace.exec_thread in file \"src/lib/o1trace/o1trace.ml\", line 82, characters 6-27",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Pipe_lib__Strict_pipe.Reader0.Merge.iter_sync.(fun) in file \"src/lib/pipe_lib/strict_pipe.ml\", line 173, characters 57-60",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Pipe_lib__Strict_pipe.Reader0.Merge.iter.read_deferred.(fun) in file \"src/lib/pipe_lib/strict_pipe.ml\", line 162, characters 30-39",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Async_kernel__Deferred0.bind.(fun) in file \"src/deferred0.ml\", line 54, characters 64-69",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Async_kernel__Job_queue.run_job in file \"src/job_queue.ml\" (inlined), line 128, characters 2-5",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 169, characters 6-47",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Caught by monitor coda"
Dec 04 17:29:11 sp1004031 mina[431176]:     ]
Dec 04 17:29:11 sp1004031 mina[431176]:   ],
Dec 04 17:29:11 sp1004031 mina[431176]:   "backtrace": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "Raised at Stdlib__String.index_rec in file \"string.ml\", line 128, characters 19-34",
Dec 04 17:29:11 sp1004031 mina[431176]:     "Called from Sexplib0__Sexp.Printing.index_of_newline in file \"src/sexp.ml\", line 113, characters 13-47"
Dec 04 17:29:11 sp1004031 mina[431176]:   ]
Dec 04 17:29:11 sp1004031 mina[431176]: }
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Info] Rebroadcasting $state_hash
Dec 04 17:29:11 sp1004031 mina[431176]:   state_hash: "3NKmTE9aY3cyXRdNbjFwf1W8e9U98qEk3zkCzKvyAbvb251VUrWb"
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Info] Saw block with state hash $state_hash
Dec 04 17:29:11 sp1004031 mina[431176]:   state_hash: "3NKmTE9aY3cyXRdNbjFwf1W8e9U98qEk3zkCzKvyAbvb251VUrWb"
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Fatal] libp2p_helper process died unexpectedly: "died after receiving sigkill (signal number 9)"
Dec 04 17:29:11 sp1004031 mina[431176]:
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Error] Encountered $error while asking libp2p_helper for peers
Dec 04 17:29:11 sp1004031 mina[431176]:   error: { "string": "libp2p_helper process died before answering" }
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Error] Encountered $error while asking libp2p_helper for peers
Dec 04 17:29:11 sp1004031 mina[431176]:   error: { "string": "libp2p_helper process died before answering" }
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Error] Could not send error report: Node_error_service was not configured
Dec 04 17:29:11 sp1004031 mina[431176]:
Dec 04 17:29:11 sp1004031 mina[431176]: 2024-12-04 16:29:11 UTC [Fatal] Unhandled top-level exception: $exn
Dec 04 17:29:11 sp1004031 mina[431176]: Generating crash report
Dec 04 17:29:11 sp1004031 mina[431176]:   exn: {
Dec 04 17:29:11 sp1004031 mina[431176]:   "sexp": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "monitor.ml.Error",
Dec 04 17:29:11 sp1004031 mina[431176]:     [
Dec 04 17:29:11 sp1004031 mina[431176]:       "exn.ml.Reraised",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Mina_net2 raised an exception",
Dec 04 17:29:11 sp1004031 mina[431176]:       [
Dec 04 17:29:11 sp1004031 mina[431176]:         "monitor.ml.Error",
Dec 04 17:29:11 sp1004031 mina[431176]:         [ "Mina_net2__Libp2p_helper.Libp2p_helper_died_unexpectedly" ],
Dec 04 17:29:11 sp1004031 mina[431176]:         [
Dec 04 17:29:11 sp1004031 mina[431176]:           "Raised at Mina_net2__Libp2p_helper.handle_libp2p_helper_termination.(fun) in file \"src/lib/mina_net2/libp2p_helper.ml\", line 146, characters 8-45",
Dec 04 17:29:11 sp1004031 mina[431176]:           "Called from Async_kernel__Deferred1.M.map.(fun) in file \"src/deferred1.ml\", line 17, characters 40-45",
Dec 04 17:29:11 sp1004031 mina[431176]:           "Called from Async_kernel__Job_queue.run_job in file \"src/job_queue.ml\" (inlined), line 128, characters 2-5",
Dec 04 17:29:11 sp1004031 mina[431176]:           "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 169, characters 6-47",
Dec 04 17:29:11 sp1004031 mina[431176]:           "Caught by monitor at file \"src/lib/gossip_net/libp2p.ml\", line 254, characters 31-31"
Dec 04 17:29:11 sp1004031 mina[431176]:         ]
Dec 04 17:29:11 sp1004031 mina[431176]:       ]
Dec 04 17:29:11 sp1004031 mina[431176]:     ],
Dec 04 17:29:11 sp1004031 mina[431176]:     [
Dec 04 17:29:11 sp1004031 mina[431176]:       "Raised at Base__Exn.reraise in file \"src/exn.ml\", line 59, characters 22-49",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Called from Base__Result.try_with in file \"src/result.ml\", line 195, characters 9-15",
Dec 04 17:29:11 sp1004031 mina[431176]:       "Caught by monitor coda"
Dec 04 17:29:11 sp1004031 mina[431176]:     ]
Dec 04 17:29:11 sp1004031 mina[431176]:   ],
Dec 04 17:29:11 sp1004031 mina[431176]:   "backtrace": [
Dec 04 17:29:11 sp1004031 mina[431176]:     "Raised at Base__Exn.reraise in file \"src/exn.ml\", line 59, characters 22-49",
Dec 04 17:29:11 sp1004031 mina[431176]:     "Called from Base__Result.try_with in file \"src/result.ml\", line 195, characters 9-15"
Dec 04 17:29:11 sp1004031 mina[431176]:   ]
Dec 04 17:29:11 sp1004031 mina[431176]: }

Restarting the nodes seems to have worked, given that block production isn't impacted.

Steps to Reproduce

Still investigating

Expected Result

Nodes shouldn't crash

Actual Result

Crash

Daemon version

This appears to be version-independent, as nodes running different versions have crashed.

How frequently do you see this issue?

Rarely

What is the impact of this issue on your ability to run a node?

High

Status

NA

Additional information

No response

deepthiskumar added the bug, triage, and mina-node labels Dec 4, 2024
georgeee added a commit that referenced this issue Dec 5, 2024
Dropped sequence was returned in reverse order, then concatenated to a
sequence in straight order.

This is part of a fix for issue #16397.
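
The ordering problem described in this commit message can be illustrated with a minimal sketch. It is not the Mina code: `split_at_first_invalid`, the integer "commands", and `earlier_dropped` are hypothetical names chosen for illustration, and plain lists stand in for whatever sequence type the pool actually uses. The sketch only shows how a suffix accumulated with cons comes out reversed, and why it should be reversed back before being appended to a list that is already in queue order.

```ocaml
(* Hypothetical helper: keep the prefix of [queue] that satisfies [keep] and
   drop everything from the first failing element onward. The dropped part is
   accumulated with cons, so it is returned in REVERSE queue order. *)
let split_at_first_invalid keep queue =
  let rec go kept_rev dropped_rev = function
    | [] -> (List.rev kept_rev, dropped_rev)
    | x :: rest -> (
        match dropped_rev with
        | [] when keep x -> go (x :: kept_rev) [] rest
        | _ -> go kept_rev (x :: dropped_rev) rest )
  in
  go [] [] queue

(* Elements 3, 4, 5 are dropped, but come back as [5; 4; 3]. *)
let kept, dropped_rev = split_at_first_invalid (fun n -> n < 3) [ 1; 2; 3; 4; 5 ]

(* Assumed to already be in straight (queue) order. *)
let earlier_dropped = [ 10; 11 ]

(* Mixing the two orders yields an inconsistently ordered result. *)
let buggy = earlier_dropped @ dropped_rev

(* Reversing first keeps the whole result in queue order. *)
let fixed = earlier_dropped @ List.rev dropped_rev
```

Here `buggy` ends in `5; 4; 3` while `fixed` ends in `3; 4; 5`, so any consumer that assumes queue order sees different data in the two cases.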
georgeee added a commit that referenced this issue Dec 5, 2024
1. Rewrite revalidate to enhance readability
2. Fix two similar issues originating from confusion between previous
   variable names `t` and `t'` ("Account no longer has permission to
   send" and "Current account nonce precedes first nonce in queue")
3. Fix the issue #16397 by ensuring removal from `applicable_by_fee` is
   done only for the previous head of queue.
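
Point 3 can be sketched as follows, under stated assumptions. This is a simplified model, not the code in `indexed_pool.ml`: it uses the standard library's `Map` and `Set` instead of the `Base` map seen in the stack trace, integer fees, and string command names, and `insert`, `remove_exn`, `drop_queue_buggy`, and `drop_queue_fixed` are hypothetical. The assumption being modelled is that only the head of a sender's queue is registered in the fee index, so removing any other queued command from it performs a lookup on a missing key and raises.

```ocaml
module Int_map = Map.Make (Int)
module String_set = Set.Make (String)

(* Simplified stand-in for a fee-indexed map of sets. *)
let insert fee cmd index =
  Int_map.update fee
    (function
      | None -> Some (String_set.singleton cmd)
      | Some set -> Some (String_set.add cmd set) )
    index

(* Raises Not_found when [fee] has no entry in the index. *)
let remove_exn fee cmd index =
  let set = String_set.remove cmd (Int_map.find fee index) in
  if String_set.is_empty set then Int_map.remove fee index
  else Int_map.add fee set index

(* Buggy variant: removes every queued command from the index, although only
   the head was ever inserted. *)
let drop_queue_buggy queue index =
  List.fold_left (fun idx (fee, cmd) -> remove_exn fee cmd idx) index queue

(* Fixed variant: removes only the previous head of the queue. *)
let drop_queue_fixed queue index =
  match queue with
  | [] -> index
  | (fee, cmd) :: _ -> remove_exn fee cmd index

(* One sender with two queued commands; only the head is indexed. *)
let queue = [ (30, "a-nonce-0"); (20, "a-nonce-1") ]
let index = insert 30 "a-nonce-0" Int_map.empty

let buggy_raises =
  match drop_queue_buggy queue index with
  | _ -> false
  | exception Not_found -> true

let index_after_fix = drop_queue_fixed queue index
```

In this model the buggy variant raises on the second command, because fee 20 was never inserted, while the fixed variant leaves the index empty without raising.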
georgeee added a commit that referenced this issue Dec 5, 2024
georgeee added a commit that referenced this issue Dec 5, 2024
georgeee added a commit that referenced this issue Dec 5, 2024
Fix the issue #16397 by ensuring removal from `applicable_by_fee` is
done only for the previous head of queue.