Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The RSS bot started skipping entries again #844

Open
bkil opened this issue Nov 15, 2023 · 31 comments
Open

The RSS bot started skipping entries again #844

bkil opened this issue Nov 15, 2023 · 31 comments
Assignees
Labels
S-Major Severely degrades major functionality or product features, with no satisfactory workaround T-Defect Bugs, crashes, hangs, vulnerabilities, or other reported problems.

Comments

@bkil
Copy link

bkil commented Nov 15, 2023

I noticed it only accidentally after I know that something should have appeared as a message. Could you perhaps hook up some kind of automated monitoring of this?

As a continuation of #778

@bkil
Copy link
Author

bkil commented Nov 15, 2023

I noticed it for entries on the following feed about 13 hours ago if you would like to investigate a specific case:

@bkil
Copy link
Author

bkil commented Nov 16, 2023

The bot also failed to post new entries from here:

https://gitlab.com/bkil/secuchart/-/commits/master.atom

The matrix room was recently upgraded if this is of any relevance. I made sure to unfollow the feed and remove the integration before upgrading the old room and then readd the integration and feed to the new room:

https://matrix.to/#/#secuchart:matrix.org

@bkil
Copy link
Author

bkil commented Feb 6, 2024

Still happens as of the past few days, including today. We have temporarily hooked up another RSS bot in the room to compare against so it is more visible.

@Half-Shot
Copy link
Contributor

Half-Shot commented Feb 6, 2024

Hey, we do have some sample feeds we monitor for issues, but of course it doesn't provide an overall picture. Do you have feed urls for the ones that are failing? Even if we miss an item, we would retry it on next fetch unless we have already marked it as handled.

@Half-Shot
Copy link
Contributor

Ah you posted above https://bkil-bot.github.io/osm-rss-juggler/changeset-discuss.xml, got it.

@bkil
Copy link
Author

bkil commented Feb 6, 2024

Yes, that sometimes skips. Today it was from this one:
https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml

And specifically these two entries got lost 6 hours ago:
bkil-bot/osm-rss-juggler@c341e29#diff-7b943466d8c0bf9f148be2a11b68c00cb75f48091a82cd67636ff05ad3f0cd06R6

@bkil
Copy link
Author

bkil commented Feb 6, 2024

Could it happen that it tries to submit the feed comment to the given matrix room but depending on load, the matrix HS API NACK's it, while Hookshot marks the notification as delivered, clearing it from its queue?

@Half-Shot
Copy link
Contributor

This is my current theory yes. We clear it from the queue at the point where it is parsed from the feed, but we don't retry if it fails to be delivered. Essentially a victim of the lack of #725

@bkil
Copy link
Author

bkil commented Feb 6, 2024

A few other ones were missing from the same feed a few days ago. One that we also caught some days ago was lack of a new commit notification from a low activity gitlab.com repo. We also missed github.com releases in the past.

@bkil
Copy link
Author

bkil commented Feb 6, 2024

Would it be possible to log such failures so the suspected root cause could be confirmed to be the same, and not caused by another issue?

@Half-Shot
Copy link
Contributor

Ah yes:

	
2024-02-04 12:33:42.182	
WARN 12:33:42:181 [Bridge] Connection FeedConnection https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml failed to handle feed.entry: Error: ESOCKETTIMEDOUT
	
	
2024-02-04 11:28:20.753	
WARN 11:28:20:753 [Bridge] Connection FeedConnection https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml failed to handle feed.entry: Error: ESOCKETTIMEDOUT
	
	
2024-02-03 06:32:37.183	
WARN 06:32:37:183 [Bridge] Connection FeedConnection https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml failed to handle feed.entry: Error: ESOCKETTIMEDOUT

@bkil
Copy link
Author

bkil commented Feb 6, 2024

You know the source of the entries better than I do, but it looks like a different problem. As if it failed to ingest the feed itself.

@Half-Shot
Copy link
Contributor

This is definitely matrix-side. The Connection FeedConnection...failed to handle feed.entry error means we've managed to parse out the feed already but failed to handle a feed.entry event, which is what is emitted from the parser. A.k.a we very much can narrow down the problem to a Bridge <-> HS failure.

We already have metrics on this and for integrations.ems.host it's failing a small % of the time. I expect #891 will improve this.

@Half-Shot
Copy link
Contributor

Half-Shot commented Feb 6, 2024

@bkil Could you let me know if this gets better or worse, I've deployed some experimental patches to the integrations server and would like to know if we're still seeing issues?

EDIT: A glance at the logs suggest the main culprits have been dealt with.

@bkil
Copy link
Author

bkil commented Feb 7, 2024

Many of us noticed that the RSS bot started reproducing old feed entries from earlier (possibly ones that it skipped previously) Is this an expected side effect?

@Half-Shot
Copy link
Contributor

@bkil Yep, there was a mistaken cache eviction. That should be fixed, can you let me know if either that happens again OR there is a missed feed?

@Half-Shot Half-Shot added T-Defect Bugs, crashes, hangs, vulnerabilities, or other reported problems. S-Major Severely degrades major functionality or product features, with no satisfactory workaround labels Feb 8, 2024
@mahdi1234
Copy link

@Half-Shot what I can see this feed https://github.com/iSoron/uhabits/releases.atom did not update yet, last post in room is 2.1.3 (Aug 28, '23) and 2.2.0 never appeared.

@bkil
Copy link
Author

bkil commented Feb 8, 2024

@mahdi1234 I think you won't see previously skipped entries retroactively. Please watch out for skipped new entries from this point in time forward.

@mahdi1234
Copy link

Well my comment was based on OR there is a missed feed to me it looks like in all rooms I've seen it takes current full feed and re-post. But I might be wrong :)

@Half-Shot
Copy link
Contributor

Yes, for absolute clarity the issue is sadly that hookshot stores a feed as checked, and then fails to send the message to Matrix. It won't recover those events. This should now be rare.

@bkil
Copy link
Author

bkil commented Feb 9, 2024

@Half-Shot Could you please look into the logs again for me about a lost entry?
It was under this path:
https://bkil-bot.github.io/osm-rss-juggler/wiki.xml

And added by this git commit 6 hours ago:
bkil-bot/osm-rss-juggler@4cf05c2

Because this feed rarely changes (and was empty for some time up to this point), the other RSS bot also just noticed it 2 hours ago.

@Half-Shot
Copy link
Contributor

Half-Shot commented Feb 12, 2024

Hm, it claims to have handled one feed item from that feed, but it might have been hit by #806? When we connect a feed for the first time, we store all the guids that exist in cache and then handle any new ones from that point.

Did you subscribe to the feed when it was still entirely empty?

EDIT: From reading the code this is likely tangled up with #806-ish. The first non-empty read from a feed is ignored, so if you have no items in you feed upon subscribe, and then you add one item, then it's still ignored. The subsequent reads should be okay. Since it's a bit suble, I'm tracking as #893.

@bkil
Copy link
Author

bkil commented Feb 12, 2024

No, we have been subscribed to this feed for months, and the bot had most recently forwarded a post from it on the 6th of February.
bkil-bot/osm-rss-juggler@d151dd8

It's just that due to the dynamic nature of such a wiki feed, it can go blank for days, and then get non-blank for a day or two at a time as long as the edit horizon contains hits to show. So this might be a different variant or special recurring case of #893.

@bkil
Copy link
Author

bkil commented Feb 12, 2024

New information. One entry from two feeds each got lost that were published by the following commit 2.5h ago:

bkil-bot/osm-rss-juggler@758e71e

Feed URLs and GUIDs:

https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml
GUID https://www.openstreetmap.org/note/4057931#c8972088

https://bkil-bot.github.io/osm-rss-juggler/mastodon-openstreetmap.xml
GUID https://en.osm.town/@openstreetmap/111919104056767098

@bkil
Copy link
Author

bkil commented Feb 26, 2024

All five added entries from this feed got lost 22 hours ago:

bkil-bot/osm-rss-juggler@bb3aa92#diff-7b943466d8c0bf9f148be2a11b68c00cb75f48091a82cd67636ff05ad3f0cd06R6

Feed URL:

GUIDs:

  • https://www.openstreetmap.org/note/4127077#c9010944
  • https://www.openstreetmap.org/note/4127075#c9010941
  • https://www.openstreetmap.org/note/4127074#c9010938
  • https://www.openstreetmap.org/note/4127069#c9010924
  • https://www.openstreetmap.org/note/4127068#c9010921

Could you perhaps have a look?

@bkil
Copy link
Author

bkil commented Feb 27, 2024

The following entry was lost 1 hour ago:

bkil-bot/osm-rss-juggler@5f70b45#diff-ec44ff024548ccff9e78ec7ebb5a61b78b6eef11b15ad0ad2ba020a46f666163R6

Feed URL:

https://bkil-bot.github.io/osm-rss-juggler/changeset-discuss.xml

GUID:

https://www.openstreetmap.org/changeset/147962763#comment_1

@bkil
Copy link
Author

bkil commented Mar 1, 2024

The following entry was lost 1 hour ago:

bkil-bot/osm-rss-juggler@6ae88ae#diff-7b943466d8c0bf9f148be2a11b68c00cb75f48091a82cd67636ff05ad3f0cd06R6

Feed URL:

https://bkil-bot.github.io/osm-rss-juggler/osm-notes.xml

GUID:

https://www.openstreetmap.org/note/4048599#c9028313

Note that due to its instability, most people I know with knowledge of how to debug such issues had stopped using hookshot in the past months and replaced it with other solutions. Hence it will probably only be me contributing reports from now on.

@opk12
Copy link

opk12 commented Mar 4, 2024

Another report. These feeds have a new entry every Sunday morning.

The 4 entries of February 18th were all posted. The 8 entries of February 25th and March 3rd were all skipped. The bot says "Successful fetch" and no error is posted in the room, when the feed is expected to discover an entry.

https://www.weeklyosm.eu/en/feed
https://www.weeklyosm.eu/fr/feed
https://www.weeklyosm.eu/es/feed
https://weeklyosm.eu/it/feed

@vranki
Copy link

vranki commented Mar 25, 2024

Another report just to let know this is still an issue.

Tooted a new toot to Mastodon, feed URL https://some.hacklab.fi/@tampere.rss

Feeds says successful fetch but nothing posted to room.

@vranki
Copy link

vranki commented Apr 5, 2024

Still working very unreliably. I tooted two toots with #sslnr hashtag. First toot was never relayed to Matrix, the second one was. Feed URL: https://sauna.social/tags/sslnr.rss

Has there been any progress on the issue, or should we migrate to for example t2bot's RSS bot?

@jonnyrobbie
Copy link

jonnyrobbie commented May 10, 2024

I can confirm the same issue. The last feed I got from https://xkcd.com/rss.xml is on April 26th, while there has been at least five more posts since then. Not only that, the bot reports several successful checks today. No posts though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-Major Severely degrades major functionality or product features, with no satisfactory workaround T-Defect Bugs, crashes, hangs, vulnerabilities, or other reported problems.
Projects
None yet
Development

No branches or pull requests

6 participants