Currently, when a robots.txt is retrieved there are three possible outcomes:
- Fully disallow the site (e.g. after a 403 "Forbidden" HTTP response)
- Fully allow the site (e.g. after a 404 "Not Found" response)
- Conditionally allow, based on the result of parsing the response.
Sometimes, however, it would seem more sensible to back off to a previously cached response, when that option is available. Examples where this behaviour would be desirable include (a sketch of the decision logic follows the list):
- Rate-limiting responses, such as 420 "Enhance Your Calm", 429 "Too Many Requests", 509 "Bandwidth Limit Exceeded", and 598/599 network timeout errors
- Cache-control responses: 304 "Not Modified"
- Temporary errors: 408 "Request Timeout"
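To make the idea concrete, here is a minimal sketch of how that fallback could fit around the existing three outcomes. The status-code set, function names, and return values are placeholders for illustration, not part of any existing API:

```python
# Status codes after which falling back to a previously cached robots.txt
# seems preferable to fully allowing or fully disallowing the site.
# (Hypothetical set, matching the examples listed above.)
FALL_BACK_TO_CACHE = {
    304,  # Not Modified
    408,  # Request Timeout
    420,  # Enhance Your Calm
    429,  # Too Many Requests
    509,  # Bandwidth Limit Exceeded
    598,  # Network read timeout error (unofficial)
    599,  # Network connect timeout error (unofficial)
}


def resolve_robots(status, body, cached=None):
    """Decide which robots.txt rules to apply after a fetch.

    `status` is the HTTP status code, `body` is the response text, and
    `cached` is a previously parsed rule set (or None if nothing is cached).
    """
    if status in FALL_BACK_TO_CACHE and cached is not None:
        return cached               # proposed behaviour: reuse the last good rules
    if status == 403:
        return "DISALLOW_ALL"       # current behaviour: fully disallow
    if status == 404:
        return "ALLOW_ALL"          # current behaviour: fully allow
    if 200 <= status < 300:
        return parse_rules(body)    # current behaviour: conditionally allow
    return "DISALLOW_ALL"           # conservative default for anything else


def parse_rules(body):
    """Stand-in for the real robots.txt parser."""
    return body
```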
This would be nice to have, but it's probably not very important. I imagine these conditions don't occur very often.