When using the downloader middleware and the request is not found in the archive, request the live resource instead. Add a setting (or something similar) that we can use to control this behaviour.
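To make the proposal concrete, here is a rough sketch of the decision I have in mind. It is plain Python rather than the actual middleware: the class name, the `ARCHIVE_FALLBACK_TO_LIVE` setting name, the `ArchiveMiss` exception and the dict-based archive are all hypothetical stand-ins. It only mirrors the Scrapy convention that a downloader middleware's `process_request` returns a response to short-circuit the download and returns `None` to let the request continue to the live downloader.

```python
from typing import Dict, Optional


class ArchiveMiss(Exception):
    """Hypothetical error: URL is not archived and live fallback is off."""


class ArchiveMiddlewareSketch:
    # Hypothetical setting name, used for illustration only.
    FALLBACK_SETTING = "ARCHIVE_FALLBACK_TO_LIVE"

    def __init__(self, archive: Dict[str, bytes], settings: Dict[str, bool]):
        # `archive` stands in for a WACZ index lookup: URL -> response body.
        self.archive = archive
        # Default to the strict behaviour (no live requests).
        self.fallback_to_live = bool(settings.get(self.FALLBACK_SETTING, False))

    def process_request(self, url: str) -> Optional[bytes]:
        body = self.archive.get(url)
        if body is not None:
            # Found in the archive: serve the archived copy.
            return body
        if self.fallback_to_live:
            # Not archived: returning None lets the request go out live.
            return None
        # Not archived and fallback disabled: refuse the request.
        raise ArchiveMiss(url)
```

With the setting enabled, an archived URL yields the stored body and an unknown URL yields `None` (so it would be fetched live); with the setting left off, an unknown URL raises instead. Defaulting to off keeps archive-only crawls from hitting the network by accident.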
leewesleyv changed the title from "When using the downloader middleware and the request is not found, request the live resource" to "Support fetching live resources in downloader middleware" on Oct 22, 2024
Great idea. I would say it is fine to leave this until after the package has been published.
When you want to crawl the resulting WACZ (containing the new resources), you probably want to crawl it together with the other WACZ (containing the older resources). And if the old WACZ was itself crawled as an 'update' to a previous one, you need to specify all of them when crawling.
I think creating a WACZ manifest could help with this, so you can reference one file to re-crawl. Its specification is a work-in-progress, but a tool like replayweb.page already supports it afaik - see webrecorder/specs#112 for the spec in progress.