bot going berserk, reacting multiple times to same event #274
Comments
It actually happened for the first time on 2 July 2024 (~10:32 UTC), see the reactions to EESSI/software-layer#630 (comment). On 3 July 2024 (~10:12 UTC), it happened again in reaction to EESSI/software-layer#630 (comment), but this time the bot replied dozens of times to the same event (I count 48 in total). I killed my bot instance shortly after to prevent it from causing more trouble. |
I encountered the same issue (or something very similar) on 3 July 2024 (~07:47 UTC) on a local instance of the bot on the HPC system at RUG: Neves-Bot/software-layer#40 (comment). Here, the bot was stuck on a build command and started a bunch of jobs, most of which failed (checksums failed, most likely due to stressed I/O).
Abridged `event_handler.sh` log:
[20240703-T09:47:23] [start]: EESSI bot for software layer started!
[20240703-T09:47:23] [start]: app is listening on port 3000
[20240703-T09:47:23] [start]: logging in to /home1/f115372/bot/eessi_bot_event_handler.log
[20240703-T09:47:37] [handle_issue_comment_event]: Comment in https://api.github.com/repos/Neves-Bot/software-layer/issues/40 (owned by @Neves-P) created by @Neves-P
[20240703-T09:47:37] [handle_issue_comment_event]: Comment in https://api.github.com/repos/Neves-Bot/software-layer/issues/40 (owned by @Neves-P) created by @Neves-P
[20240703-T09:47:37] [handle_issue_comment_event]: comment action 'created' is handled
[20240703-T09:47:37] [handle_issue_comment_event]: comment action 'created' is handled
[20240703-T09:47:37] [handle_issue_comment_event]: Comment in https://api.github.com/repos/Neves-Bot/software-layer/issues/40 (owned by @Neves-P) created by @Neves-P
[20240703-T09:47:37] [handle_issue_comment_event]: Comment in https://api.github.com/repos/Neves-Bot/software-layer/issues/40 (owned by @Neves-P) created by @Neves-P
[20240703-T09:47:37] [handle_issue_comment_event]: comment action 'created' is handled
[20240703-T09:47:37] [handle_issue_comment_event]: comment action 'created' is handled
[20240703-T09:47:37] [handle_issue_comment_event]: account `Neves-P` has permission to send commands to bot
[20240703-T09:47:37] [handle_issue_comment_event]: account `Neves-P` has permission to send commands to bot
[20240703-T09:47:37] [handle_issue_comment_event]: account `Neves-P` has permission to send commands to bot
[20240703-T09:47:37] [handle_issue_comment_event]: account `Neves-P` has permission to send commands to bot
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/intel/icelake'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/intel/icelake'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/intel/icelake'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/intel/icelake'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/amd/zen3'
[20240703-T09:47:37] [handle_issue_comment_event]: comment response: '
- received bot command `build repo:hpc.rug.nl arch:x86_64/intel/icelake` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/intel/icelake`
- received bot command `build repo:hpc.rug.nl arch:x86_64/amd/zen3` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/amd/zen3`'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/amd/zen3'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/amd/zen3'
[20240703-T09:47:37] [handle_issue_comment_event]: comment response: '
- received bot command `build repo:hpc.rug.nl arch:x86_64/intel/icelake` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/intel/icelake`
- received bot command `build repo:hpc.rug.nl arch:x86_64/amd/zen3` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/amd/zen3`'
[20240703-T09:47:37] [handle_issue_comment_event]: comment response: '
- received bot command `build repo:hpc.rug.nl arch:x86_64/intel/icelake` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/intel/icelake`
- received bot command `build repo:hpc.rug.nl arch:x86_64/amd/zen3` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/amd/zen3`'
[20240703-T09:47:37] [handle_issue_comment_event]: found bot command: 'build repo:hpc.rug.nl arch:x86_64/amd/zen3'
[20240703-T09:47:37] [handle_issue_comment_event]: comment response: '
- received bot command `build repo:hpc.rug.nl arch:x86_64/intel/icelake` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/intel/icelake`
- received bot command `build repo:hpc.rug.nl arch:x86_64/amd/zen3` from `Neves-P`
- expanded format: `build repository:hpc.rug.nl architecture:x86_64/amd/zen3`'
[20240703-T09:47:39] [handle_bot_command]: Handling bot command build
[20240703-T09:47:39] [handle_bot_command_build]: repository: 'Neves-Bot/software-layer'
[20240703-T09:47:39] [handle_bot_command]: Handling bot command build
[20240703-T09:47:39] [handle_bot_command_build]: repository: 'Neves-Bot/software-layer'
[20240703-T09:47:39] [handle_bot_command]: Handling bot command build
[20240703-T09:47:39] [handle_bot_command_build]: repository: 'Neves-Bot/software-layer'
[20240703-T09:47:39] [handle_bot_command]: Handling bot command build
[20240703-T09:47:39] [handle_bot_command_build]: repository: 'Neves-Bot/software-layer'
[20240703-T09:47:44] [handle_issue_comment_event]: handling command 'build repository:hpc.rug.nl architecture:x86_64/intel/icelake' resulted in '
- submitted job `11142535`, for details & status see https://github.com/Neves-Bot/software-layer/pull/40#issuecomment-2205322614'
Will edit this comment with more details ASAP.
Edit: correction, this started earlier, with only one repetition at Neves-Bot/software-layer#40 (comment). On the
For reference, a day earlier on a previous PR, I commented with the same command, Neves-Bot/software-layer#38 (comment), which resulted in
The event id
This goes on for a few lines. I've checked the event |
From
Likewise, the event that corresponds to the comment on 3 July was received multiple times:
The good news is that we should be able to make the bot robust against "echoes" of events relatively easily: it should just keep track of the N last events it received, and refuse to react multiple times to the same event?
This should probably be fixed in PyGHee rather than in the EESSI bot implementation though; see
I'm not sure how easy it is to implement though, since those echo events seem to be coming in all at the same time, and incoming events are not processed serially, see also partial log below
|
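The "keep track of the N last events" idea mentioned above could look roughly like the sketch below. It is only an illustration, not the code in PyGHee or the EESSI bot: the class name `RecentEventFilter`, the method `is_duplicate`, and the default of 100 remembered events are all made up for this example. It assumes every event carries a unique id (GitHub sends one in the `X-GitHub-Delivery` header of each webhook delivery) and that an echoed event arrives with the same id. Since the echoes come in at nearly the same time and events are not processed serially, the check and the bookkeeping are done under a lock.

```python
import threading
from collections import OrderedDict


class RecentEventFilter:
    """Remember the ids of the last `max_events` events and flag repeats.

    Hypothetical helper for illustration only; not part of PyGHee.
    """

    def __init__(self, max_events=100):
        self.max_events = max_events
        # OrderedDict keeps insertion order, so the oldest id is evicted first
        self._seen = OrderedDict()
        # events may be handled concurrently, so guard the check-and-record step
        self._lock = threading.Lock()

    def is_duplicate(self, event_id):
        """Return True if event_id was seen recently; otherwise record it and return False."""
        with self._lock:
            if event_id in self._seen:
                return True
            self._seen[event_id] = None
            if len(self._seen) > self.max_events:
                # drop the oldest remembered id to keep memory bounded
                self._seen.popitem(last=False)
            return False
```

An event handler would then call `is_duplicate(event_id)` first and return early (ideally after logging the skip) when it yields `True`. Doing the lookup and the insertion in one locked step matters here: with separate "check" and "record" calls, several echoes arriving together could all pass the check before any of them is recorded.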
@Neves-P I think the changes in boegel/PyGHee#7 will do the trick... If that looks good to you, we can merge it, and test it in our own bot deployments before updating to PyGHee 0.0.4 in the production bots we have running for EESSI (which are currently unaffected because they're not using The latter is known to be far from perfect, so we may need to figure out another solution there anyway, see also probot/smee.io#137 |
I've resumed my bot on Deucalion, it's running on top of the changes in boegel/PyGHee#7, to test... |
Looks cool! I'll try it out on Hábrók too |
Resumed bot on Hábrók with PyGHee installed from boegel/PyGHee#7 and so far so good: Neves-Bot/software-layer#42. |
Recently (on 3 July 2024), the bot I had running on Deucalion was reacting multiple times to the same event, resulting in a series of identical comments, see for example EESSI/software-layer#630 (comment).
@Neves-P mentioned that he saw a similar problem with a test instance of the bot he was playing with, in a totally different context (at RUG), so it seems like a general problem, not specific to the setup I did on Deucalion (which had been working fine for days already before this happened).
To me, it seems that smee.io is actually the culprit here, and that it was echoing events coming from GitHub multiple times to the smee channel used by the bot. That would also explain why the production bots running in AWS & Azure were not affected, because they're using a custom smee instance (smee.nessi.no).
We should: