Ideally we want to be able to optionally invoke the spider to run against the most recent archive by passing a CLI option or environment variable (picked up in settings).
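A minimal sketch of how such a value might be picked up in a project's settings module, assuming a hypothetical environment variable and setting both named `SPIDER_ARCHIVE` (neither name comes from this project):

```python
import os


def archive_setting(environ=os.environ):
    """Return the archive selector from the environment, or None if unset.

    SPIDER_ARCHIVE is a hypothetical variable name used for illustration.
    """
    value = environ.get("SPIDER_ARCHIVE", "").strip()
    return value or None


# In settings.py, the result could be exposed as a regular setting:
SPIDER_ARCHIVE = archive_setting()
```

Because Scrapy lets any setting be overridden per run with `-s NAME=VALUE`, something like `scrapy crawl <spider> -s SPIDER_ARCHIVE=latest` would cover the command-line case without a dedicated option.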
Since the spider knows where to locate object storage, it is relatively easy for it to figure this out. If a system talking to Scrapy needs to figure this out by itself, it needs to know container storage details.
As an extension of this feature, one could also provide a date or timestamp to locate the last archive before that moment (this may depend on the configured storage path, so it could be tricky).
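As a rough illustration of the selection logic, here is a sketch that picks the newest archive, optionally restricted to those before a given moment. It assumes, purely for the example, that each archive key embeds a UTC timestamp such as `20250121T103000Z`; the real storage path layout may differ, which is exactly the tricky part mentioned above. The function names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Hypothetical naming scheme: each archive key contains a UTC timestamp
# like "20250121T103000Z". Real storage paths may be laid out differently.
_TIMESTAMP = re.compile(r"(\d{8}T\d{6})Z")


def archive_time(key):
    """Extract the timestamp embedded in an archive key, or None if absent."""
    match = _TIMESTAMP.search(key)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def latest_archive(keys, before=None):
    """Return the newest archive key, optionally only those strictly before `before`."""
    dated = [(archive_time(key), key) for key in keys]
    candidates = [
        (when, key)
        for when, key in dated
        if when is not None and (before is None or when < before)
    ]
    if not candidates:
        return None
    return max(candidates)[1]
```

Calling `latest_archive(keys)` would give the most recent archive, while passing a timezone-aware `before` narrows the search to earlier ones. Keys without a recognizable timestamp are skipped.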
wvengen changed the title from "Provide a CLI option or setting to run a spider against a specific archive" to "Provide a setting to run a spider against a specific archive" on Jan 21, 2025