The RSS bot started skipping entries again #844
I noticed it for entries on the following feed about 13 hours ago if you would like to investigate a specific case: |
The bot also failed to post new entries from here: https://gitlab.com/bkil/secuchart/-/commits/master.atom The Matrix room was recently upgraded, in case this is of any relevance. I made sure to unfollow the feed and remove the integration before upgrading the old room, and then re-added the integration and feed to the new room: |
Still happens as of the past few days, including today. We have temporarily hooked up another RSS bot in the room to compare against so it is more visible. |
Hey, we do have some sample feeds we monitor for issues, but of course they don't provide an overall picture. Do you have feed URLs for the ones that are failing? Even if we miss an item, we would retry it on the next fetch unless we have already marked it as handled. |
Ah you posted above https://bkil-bot.github.io/osm-rss-juggler/changeset-discuss.xml, got it. |
Yes, that sometimes skips. Today it was from this one: And specifically these two entries got lost 6 hours ago: |
Could it be that Hookshot tries to submit the feed comment to the given Matrix room but, depending on load, the Matrix HS API NACKs it, while Hookshot still marks the notification as delivered and clears it from its queue? |
This is my current theory, yes. We clear it from the queue at the point where it is parsed from the feed, but we don't retry if it fails to be delivered. Essentially, it is a victim of the lack of #725. |
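The failure mode described above can be sketched roughly as follows. This is a minimal illustration and not Hookshot's actual code: `FeedTracker`, `send`, and `DeliveryError` are hypothetical names, and the sketch only assumes that a GUID is recorded as handled after the send to Matrix succeeds, rather than at parse time, so that a rejected entry is retried on the next fetch.

```python
class DeliveryError(Exception):
    """Hypothetical error raised when the homeserver rejects a message."""


class FeedTracker:
    """Tracks which feed GUIDs have been delivered (illustrative only)."""

    def __init__(self, send):
        self.send = send      # hypothetical callable that posts one entry to the room
        self.handled = set()  # GUIDs that were delivered successfully

    def process(self, guids):
        """Try to deliver every unhandled GUID; return the ones that failed."""
        failed = []
        for guid in guids:
            if guid in self.handled:
                continue
            try:
                self.send(guid)
            except DeliveryError:
                # Not marked as handled, so the next fetch retries it.
                failed.append(guid)
                continue
            # Only mark the entry as handled once delivery has succeeded.
            self.handled.add(guid)
        return failed
```

Calling `process` again with the same GUIDs on the next fetch would then re-attempt only the entries that previously failed, and the returned list could be logged to confirm the suspected root cause.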
A few other ones were missing from the same feed a few days ago. One that we also caught some days ago was lack of a new commit notification from a low activity gitlab.com repo. We also missed github.com releases in the past. |
Would it be possible to log such failures so the suspected root cause could be confirmed to be the same, and not caused by another issue? |
Ah yes:
|
You know the source of the entries better than I do, but it looks like a different problem. As if it failed to ingest the feed itself. |
This is definitely matrix-side. The We already have metrics on this and for integrations.ems.host it's failing a small % of the time. I expect #891 will improve this. |
@bkil Could you let me know if this gets better or worse? I've deployed some experimental patches to the integrations server and would like to know if we're still seeing issues. EDIT: A glance at the logs suggests the main culprits have been dealt with. |
Many of us noticed that the RSS bot started re-posting old feed entries (possibly ones that it skipped previously). Is this an expected side effect? |
@bkil Yep, there was a mistaken cache eviction. That should be fixed now. Can you let me know if either that happens again OR a feed entry is missed? |
@Half-Shot From what I can see, this feed https://github.com/iSoron/uhabits/releases.atom has not updated yet: the last post in the room is 2.1.3 (Aug 28, '23) and 2.2.0 never appeared. |
@mahdi1234 I think you won't see previously skipped entries retroactively. Please watch out for skipped new entries from this point in time forward. |
Well my comment was based on |
Yes, for absolute clarity, the issue is sadly that hookshot stores a feed entry as checked and then fails to send the message to Matrix. It won't recover those events. This should now be rare. |
@Half-Shot Could you please look into the logs again for me about a lost entry? And added by this git commit 6 hours ago: Because this feed rarely changes (and was empty for some time up to this point), the other RSS bot also just noticed it 2 hours ago. |
Hm, it claims to have handled one feed item from that feed, but it might have been hit by #806? When we connect a feed for the first time, we store all the GUIDs that exist in the cache and then handle any new ones from that point. Did you subscribe to the feed when it was still entirely empty? EDIT: From reading the code, this is likely tangled up with #806-ish. The first non-empty read from a feed is ignored, so if you have no items in your feed upon subscribing, and then you add one item, it's still ignored. The subsequent reads should be okay. Since it's a bit subtle, I'm tracking it as #893. |
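To make the empty-feed case above concrete, here is a small model of the behaviour as described in this comment. It is a sketch under stated assumptions, not Hookshot's code: `SeedingReader` and its `seed_on_first_nonempty` flag are hypothetical, and the flag only switches between "the first non-empty read is treated as the initial seed" (the reported behaviour) and "the very first read is the seed, even if empty".

```python
class SeedingReader:
    """Illustrative model of first-read seeding, not Hookshot's code."""

    def __init__(self, seed_on_first_nonempty):
        # True models the described behaviour: seeding waits for a non-empty read.
        self.seed_on_first_nonempty = seed_on_first_nonempty
        self.seeded = False
        self.seen = set()

    def read(self, guids):
        """Return the GUIDs that would be posted for this fetch."""
        if not self.seeded:
            if guids or not self.seed_on_first_nonempty:
                self.seeded = True
            # Everything present during seeding is stored and never posted.
            self.seen.update(guids)
            return []
        new = [g for g in guids if g not in self.seen]
        self.seen.update(new)
        return new
```

With `seed_on_first_nonempty=True`, a subscription to an empty feed followed by a fetch containing one item posts nothing, because that item is swallowed by the seeding step. With the flag set to `False`, the same sequence posts the item.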
No, we have been subscribed to this feed for months, and the bot had most recently forwarded a post from it on the 6th of February. It's just that due to the dynamic nature of such a wiki feed, it can go blank for days, and then get non-blank for a day or two at a time as long as the edit horizon contains hits to show. So this might be a different variant or special recurring case of #893. |
New information. One entry from two feeds each got lost that were published by the following commit 2.5h ago: bkil-bot/osm-rss-juggler@758e71e Feed URLs and GUIDs: https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml https://bkil-bot.github.io/osm-rss-juggler/mastodon-openstreetmap.xml |
All five added entries from this feed got lost 22 hours ago: Feed URL: GUIDs:
Could you perhaps have a look? |
The following entry was lost 1 hour ago: Feed URL: https://bkil-bot.github.io/osm-rss-juggler/changeset-discuss.xml GUID:
|
The following entry was lost 1 hour ago: Feed URL: https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml GUID:
Note that due to its instability, most people I know with the knowledge to debug such issues have stopped using hookshot over the past months and replaced it with other solutions. Hence it will probably only be me contributing reports from now on. |
Another report. These feeds have a new entry every Sunday morning. The 4 entries of February 18th were all posted. The 8 entries of February 25th and March 3rd were all skipped. The bot says "Successful fetch" and no error is posted in the room, even when the fetch is expected to discover a new entry. https://www.weeklyosm.eu/en/feed |
Another report, just to let you know this is still an issue. I tooted a new toot to Mastodon, feed URL https://some.hacklab.fi/@tampere.rss The feed says successful fetch, but nothing was posted to the room. |
Still working very unreliably. I tooted two toots with #sslnr hashtag. First toot was never relayed to Matrix, the second one was. Feed URL: https://sauna.social/tags/sslnr.rss Has there been any progress on the issue, or should we migrate to for example t2bot's RSS bot? |
I can confirm the same issue. The last entry I got from https://xkcd.com/rss.xml is from April 26th, while there have been at least five more posts since then. Not only that, the bot reports several successful checks today. No posts though. |
I only noticed it by accident, because I knew that something should have appeared as a message. Could you perhaps hook up some kind of automated monitoring of this?
As a continuation of #778
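One possible shape for the automated monitoring asked for above is a periodic comparison of the GUIDs currently in a feed against the GUIDs that were actually posted to the room. The sketch below is only an assumption about how such a check might look. `find_skipped` and its parameters are hypothetical, and gathering the three inputs (from the feed, the room history, and the state at subscription time) is left out.

```python
def find_skipped(feed_guids, posted_guids, known_before):
    """Return GUIDs that appeared after subscription but were never posted.

    feed_guids:   GUIDs currently present in the feed, in feed order
    posted_guids: GUIDs of entries the bot posted to the room
    known_before: GUIDs that already existed when the feed was subscribed
    """
    posted = set(posted_guids)
    baseline = set(known_before)
    return [g for g in feed_guids if g not in posted and g not in baseline]
```

A non-empty result would indicate skipped entries, which could then be reported with the feed URL and GUIDs in the same way as the manual reports in this thread.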