Atom/RSS feed paging #152

jameysharp · 2018-06-27T19:33:35Z

RFC 5005, "Feed Paging and Archiving", is a standard published in 2007 for Atom and RSS. I'd like to see Granary support paging of those output types using this standard. Ideally, it would also be able to consume paged feeds and convey the paging information in all its output formats.

There are three major sections in this standard, not counting introductory and supplemental material.

Section 2, "Complete Feeds", just adds an empty <fh:complete/> tag to indicate that the contents of this feed document represent the complete history. If you can tell that the data you've consumed from your upstream source is complete (there are no earlier or later pages for this query) then you should add this tag to the generated RSS or Atom feed.
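To make section 2 concrete, here is a minimal sketch of adding that tag to an Atom document with Python's standard library. The function name `mark_complete` is hypothetical and this is not Granary's actual code; the `fh` namespace URI is the one RFC 5005 defines.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
FH_NS = "http://purl.org/syndication/history/1.0"  # feed history namespace from RFC 5005

# Keep readable prefixes when serializing.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("fh", FH_NS)


def mark_complete(atom_xml: str) -> str:
    """Return the Atom document with an <fh:complete/> child on the feed.

    Hypothetical helper: only call it when the upstream data is known to
    have no earlier or later pages.
    """
    feed = ET.fromstring(atom_xml)
    tag = f"{{{FH_NS}}}complete"
    if feed.find(tag) is None:  # avoid adding the marker twice
        feed.insert(0, ET.Element(tag))
    return ET.tostring(feed, encoding="unicode")
```

The same marker would go inside `<channel>` for RSS output, with the `fh` namespace declared on the root element.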

Section 3, "Paged Feeds", is useful when you don't know how many entries the query could return, or if there could be an infinite sequence of results. I haven't looked much at this section because for my use cases I've only cared about collections where I want to fetch all pages, where section 4 is more efficient. But section 3 lets you provide a simple cursor interface to clients, which I think is a good fit for what you're doing. Ideally, you'd also support consuming paged feeds and exposing the upstream cursor somehow in the various output formats, but it sounds like that's a longer-term project?
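As a rough illustration of the section 3 cursor interface, the sketch below adds the paging link relations (`first`, `previous`, `next`, `last`) to an Atom feed. The helper name `add_paging_links` and the example URLs are assumptions for illustration; how Granary would derive the URLs from an upstream cursor is not something I've worked out.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)

PAGING_RELS = ("first", "previous", "next", "last")


def add_paging_links(atom_xml: str, links: dict) -> str:
    """Add <link rel="..."> paging elements to an Atom feed.

    `links` maps a paging rel to a URL, e.g. {"next": "https://..."}.
    Hypothetical helper; rels that are missing or empty are skipped.
    """
    feed = ET.fromstring(atom_xml)
    entry_tag = f"{{{ATOM_NS}}}entry"
    children = list(feed)
    # Insert before the first entry so feed metadata stays ahead of entries.
    pos = next((i for i, c in enumerate(children) if c.tag == entry_tag),
               len(children))
    for rel in PAGING_RELS:
        href = links.get(rel)
        if href:
            link = ET.Element(f"{{{ATOM_NS}}}link", {"rel": rel, "href": href})
            feed.insert(pos, link)
            pos += 1
    return ET.tostring(feed, encoding="unicode")
```

A caller that only knows the next cursor would pass just `{"next": url}`; a client keeps following `next` until a page has no such link.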

Section 4, "Archived Feeds", is semantically kind of a combination of sections 2 and 3. It indicates that if you fetch all the pages of the feed, then you will have the complete history of the feed. But there are some details specified for efficiency that I think make this section complicated for Granary. The archived feed page served at a particular URL may be treated by clients as if it has a far-future Expires header, so if old entries are inserted, deleted, or edited, then the URL needs to be changed before clients are guaranteed to pick it up. Also, the same entry may appear in multiple feed documents, in which case only the copy from the most recent page is supposed to be used.
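For the consuming side of section 4, a client might reconstruct the feed roughly like this: start at the subscription document, follow `prev-archive` links backward, and keep only the first copy of each entry it sees, since that copy comes from the most recent page. This is a simplified sketch of the rule as I described it above, not the RFC's full duplicate-resolution logic, and `collect_entries` and the `fetch` callable are hypothetical names.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def collect_entries(start_url, fetch):
    """Walk an archived Atom feed from newest page to oldest.

    `fetch` is any callable mapping a URL to that page's XML text
    (hypothetical; a real client would do HTTP and caching here).
    Returns a dict of entry id -> entry element.
    """
    entries = {}
    visited = set()
    url = start_url
    while url and url not in visited:  # guard against link loops
        visited.add(url)
        feed = ET.fromstring(fetch(url))
        for entry in feed.findall(f"{ATOM}entry"):
            entry_id = entry.findtext(f"{ATOM}id")
            if entry_id is None:
                continue
            # Pages are visited newest first, so the first copy wins.
            entries.setdefault(entry_id, entry)
        url = next(
            (link.get("href") for link in feed.findall(f"{ATOM}link")
             if link.get("rel") == "prev-archive"),
            None,
        )
    return entries
```

Because archive pages are expected to be stable, a real client could cache each fetched archive URL indefinitely and only re-fetch the subscription document.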

So it's not clear to me that Granary can do anything with section 4 archived feeds except pass them through when converting between RSS and Atom, or something like that. But maybe there's some API you consume that turns out to be a good fit for that paging model, I don't know.

I'm guessing that sections 2 and 3 are easy to implement, though, and I'd love to see that happen!


snarfed · 2018-06-28T14:43:38Z

thanks for filing, and for all the details! and great to meet you! this definitely makes sense. i'd happily merge a PR for this, or maybe even implement it myself when it bubbles up my todo list. :P
