
Caching

4poc edited this page Feb 17, 2011 · 3 revisions

Feedability caches all retrieved pages and the extracted articles so that the full-text feed can be served to clients faster. The default cache directory is ./cache; it is specified with the settings key cache.path. Inside it, a subdirectory named after the domain of the original URL is created. The base name of each cache file is the SHA-1 hash of the item URL that was parsed from the feed. The following cache files are kept:
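The naming scheme described above can be sketched roughly as follows. This is a minimal illustration, not Feedability's actual code: the helper names `cacheBasePath` and `cacheFiles` are hypothetical, and it assumes that the domain directory is the URL's hostname and that the SHA-1 hash is hex-encoded.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper: base path (without extension) of the cache files
// for one feed item.
function cacheBasePath(itemUrl, cacheDir = './cache') {
  // Assumption: the directory is named after the hostname of the URL.
  const domain = new URL(itemUrl).hostname;
  // Assumption: the SHA-1 hash of the item URL is hex-encoded.
  const hash = crypto.createHash('sha1').update(itemUrl).digest('hex');
  return path.join(cacheDir, domain, hash);
}

// Hypothetical helper: the three cache files kept for one item.
function cacheFiles(itemUrl, cacheDir) {
  const base = cacheBasePath(itemUrl, cacheDir);
  return { json: base + '.json', raw: base + '.raw', rdby: base + '.rdby' };
}

console.log(cacheFiles('http://example.com/news/131/from/atom'));
```

Keeping one directory per domain makes it easy to drop the cache for a single site by deleting that directory.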

.json

Stores meta information about the article page: the original URL that was used within the feed (orig_url) and the article URL (url). The two are not necessarily the same, for example when the feed links through feedproxy. The remaining fields are currently unused. Example:

{
  "url": "http://example.com/news/131",
  "orig_url": "http://example.com/news/131/from/atom",
  "domain": "example.com",
  "length": 123456,
  "date": "Mon Feb 14 2011 01:53:09 GMT+0100 (CET)"
}
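Reading this metadata back could look roughly like the sketch below. The helper name `readCacheMeta` and the returned property names are hypothetical, and the sketch assumes the .json file contains standard JSON with the fields from the example.

```javascript
const fs = require('fs');

// Hypothetical helper: load the metadata stored next to a cached article.
// basePath is the cache file path without its extension.
// Assumption: the .json file contains standard JSON.
function readCacheMeta(basePath) {
  const meta = JSON.parse(fs.readFileSync(basePath + '.json', 'utf8'));
  // orig_url is the URL used within the feed; url is the article URL.
  return { feedUrl: meta.orig_url, articleUrl: meta.url, domain: meta.domain };
}
```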

.raw

Stores the raw article page exactly as it was retrieved from the server, without any manipulation. This is particularly useful when only readability or your jQuery selector filters are changed or updated.

.rdby

Stores the article text that readability has extracted.
