Return only fragment of page #63

dissolve · 2017-05-13T18:30:32Z

I had the idea of parsing a page and pulling out only a specific comment, and wondered how that would work (assuming the comment isn't posted from somewhere else). The idea is to give the parser a URL that includes a fragment, and the resulting items would contain only what is found at the element with that id and below.

I would have to look at how exactly this would work; it would likely still need the whole page for rels, the base URL, and such.

aaronpk · 2017-05-24T14:38:14Z

I chose to do this as part of my Microformats-consuming code, XRay, rather than at the parser level. XRay first parses the HTML document to extract the node matching the fragment, then passes that node's HTML to the parser.

dissolve · 2017-05-24T14:56:56Z

Doesn't this break things like tags? Or do you just include the header?

aaronpk · 2017-05-24T15:03:41Z

Not sure what you mean by "things like tags". Here's what it does: https://github.com/aaronpk/XRay/blob/master/lib/XRay/Formats/HTML.php#L82

Basically, if a fragment is included, it runs $doc->saveHTML on that element and replaces the HTML it fetched with just the HTML of the element that has that ID.
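
For readers who want to see the shape of this approach, here is a minimal sketch in Python rather than XRay's PHP. It is not XRay's code: the names `FragmentExtractor` and `extract_fragment` are hypothetical, it assumes reasonably well-nested HTML, and falling back to the full document when no element matches is an assumption of this sketch, not something stated in the thread.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class FragmentExtractor(HTMLParser):
    """Collects the outer HTML of the first element whose id matches."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0      # 0 means "not inside the matched element"
        self.parts = []     # raw pieces of the matched element
        self.done = False   # True once the matched element is closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            if dict(attrs).get("id") != self.target_id:
                return
            self.parts.append(self.get_starttag_text())
            if tag in VOID_ELEMENTS:
                self.done = True
            else:
                self.depth = 1
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_ELEMENTS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: copy it without changing depth.
        if self.done:
            return
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())
        elif dict(attrs).get("id") == self.target_id:
            self.parts.append(self.get_starttag_text())
            self.done = True

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_ELEMENTS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        if self.depth > 0:
            self.parts.append(f"<!--{data}-->")


def extract_fragment(html, url):
    """Return the HTML of the element named by the URL fragment.

    Assumption of this sketch: with no fragment, or no matching id,
    the full document is returned unchanged.
    """
    _, fragment = urldefrag(url)
    if not fragment:
        return html
    parser = FragmentExtractor(fragment)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.done else html
```

The string returned by `extract_fragment` is what would then be handed to the Microformats parser in place of the fetched page.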

dissolve · 2017-05-24T15:20:32Z

lol... well then, GitHub processes this as HTML... things like <base> tags
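
To make the <base> concern concrete, here is a small Python sketch of how relative URL resolution changes once the document head is dropped. The helper `resolve` and the example URLs are hypothetical, and the thread does not say whether XRay handles <base> separately.

```python
from urllib.parse import urljoin


def resolve(href, page_url, base_href=None):
    """Resolve href against <base href> if one is known, else against the page URL."""
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


page_url = "https://example.com/blog/post"

# Full document: the parser sees <base href="https://cdn.example.com/site/">.
with_base = resolve("tags/indieweb", page_url,
                    base_href="https://cdn.example.com/site/")

# Extracted fragment only: the <base> element in <head> is gone.
without_base = resolve("tags/indieweb", page_url)

print(with_base)     # https://cdn.example.com/site/tags/indieweb
print(without_base)  # https://example.com/blog/tags/indieweb
```

The same relative link resolves to two different URLs, which is why extracting only the fragment can change results unless the base URL is carried along separately.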

dissolve added the enhancement label May 13, 2017
