Scraper rules don't work with theguardian.com #2688

Open
advert665 opened this issue Jun 11, 2024 · 0 comments

The Guardian publishes only summaries in their RSS feeds, so I want to use the scraper rules to load the full content from the corresponding webpage. However, when I use a selector that matches the desired content on the webpage, the content won't load.

For instance, using div#maincontent or p.dcr-iy9ec7 does not change the resulting article in Miniflux for the following feed, even though both selectors match elements in the linked pages: https://www.theguardian.com/theguardian/mainsection/topstories/rss
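One way to narrow this down is to check whether the selectors match the raw HTML as served, rather than the DOM shown in the browser's inspector after scripts have run. The snippet below is only a rough sketch under assumptions: it assumes the scraper rule is applied to HTML fetched without executing JavaScript, it handles only the simple `tag`, `tag#id`, and `tag.class` selector forms used above, and the sample markup is invented for illustration rather than copied from a Guardian page.

```python
from html.parser import HTMLParser


class SimpleSelectorMatcher(HTMLParser):
    """Counts start tags matching a `tag`, `tag#id`, or `tag.class` selector."""

    def __init__(self, selector: str):
        super().__init__()
        self.tag, self.wanted_id, self.wanted_class = selector, None, None
        if "#" in selector:
            self.tag, self.wanted_id = selector.split("#", 1)
        elif "." in selector:
            self.tag, self.wanted_class = selector.split(".", 1)
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.tag and tag != self.tag:
            return
        if self.wanted_id is not None and attributes.get("id") != self.wanted_id:
            return
        classes = (attributes.get("class") or "").split()
        if self.wanted_class is not None and self.wanted_class not in classes:
            return
        self.matches += 1


def count_matches(html: str, selector: str) -> int:
    """Return how many elements in `html` match the simple selector."""
    parser = SimpleSelectorMatcher(selector)
    parser.feed(html)
    return parser.matches


# Hypothetical stand-in for the HTML a non-browser client receives.
# In practice this string would come from fetching the article URL directly.
sample_html = '<div id="maincontent"><p class="dcr-iy9ec7 other">Text</p></div>'

for selector in ("div#maincontent", "p.dcr-iy9ec7", "picture"):
    print(selector, count_matches(sample_html, selector))
```

If a selector reports zero matches against the fetched HTML but matches in the browser, the element is probably added or renamed client-side, which would explain why the rule has no effect.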

Similarly, using picture to extract the cartoons from https://www.theguardian.com/profile/martinrowson/rss (with or without the add_dynamic_image rule) fails to load anything in Miniflux.

Other RSS apps like Lire are able to load the full articles, so it's not specifically a Guardian issue. Am I doing something wrong, or is this a Miniflux limitation? Thanks!

Screenshot 2024-06-11 102143
