Reduce recursivity level for sitemap fetcher #5

Open
pypt opened this issue Nov 27, 2018 · 1 comment
Labels
bug Something isn't working

Comments

@pypt
Contributor

pypt commented Nov 27, 2018

A recursion limit of 10 levels is probably too much for the sitemap fetcher; in this run it is already fetching a level 8 sitemap:

2018-11-26 13:11:19,139 INFO mediawords.util.sitemap.helpers [162086/MainThread]: Fetching URL https://www.juiceplus.com/fr/fr/franchise/sitemap.xml...
2018-11-26 13:11:19,428 INFO mediawords.util.sitemap.fetchers [162086/MainThread]: Parsing sitemap from URL https://www.juiceplus.com/fr/fr/franchise/sitemap.xml...
2018-11-26 13:11:19,508 INFO mediawords.util.sitemap.fetchers [162086/MainThread]: Fetching level 8 sitemap from https://www.juiceplus.com/il/en/franchise/sitemap.xml...
2018-11-26 13:11:19,508 INFO mediawords.util.sitemap.helpers [162086/MainThread]: Fetching URL https://www.juiceplus.com/il/en/franchise/sitemap.xml...
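
For illustration, a lower limit could look roughly like the sketch below. This is not the actual `mediawords.util.sitemap` code: `collect_page_urls`, the injected `fetch` callable, and the value `MAX_SITEMAP_DEPTH = 3` are all hypothetical names and numbers picked for the example. It also assumes that part of the depth comes from sitemap indexes that link back to each other, so it skips URLs it has already visited.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional, Set

# Hypothetical limit, chosen only for this sketch.
MAX_SITEMAP_DEPTH = 3

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(
    url: str,
    fetch: Callable[[str], str],
    level: int = 0,
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Collect page URLs from a sitemap, following index files.

    `fetch` is an assumed helper that returns the XML text for a URL.
    Recursion stops below MAX_SITEMAP_DEPTH and at already visited URLs.
    """
    seen = set() if seen is None else seen
    if level > MAX_SITEMAP_DEPTH or url in seen:
        return []
    seen.add(url)

    root = ET.fromstring(fetch(url))
    locs = [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    ]

    # A <sitemapindex> points at further sitemaps; a <urlset> lists pages.
    if root.tag == SITEMAP_NS + "sitemapindex":
        pages: List[str] = []
        for child_url in locs:
            pages.extend(collect_page_urls(child_url, fetch, level + 1, seen))
        return pages
    return locs
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP client, and the `seen` set means a pair of indexes that reference each other is fetched only once regardless of the depth limit.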
pypt added the enhancement and bug labels, then removed the enhancement label, on Nov 27, 2018
@nubonics

No. The purpose of a sitemap is to show every single page on the website, so lowering the depth would result in an invalid sitemap extraction. I completely disagree that this is a bug.
