Return only fragment of page #63

Open
dissolve opened this issue May 13, 2017 · 4 comments

Comments

@dissolve
Collaborator

I had the idea of parsing a page and pulling out only a specific comment, and wondered how that would work (assuming the comment isn't posted from somewhere else). The idea would be to pass a URL that has a fragment, and the resulting items would contain only what is found in the element with that id and below.

I would have to look at how exactly this would work; it would likely still need the whole page for rels, base, and such.

@aaronpk
Member

aaronpk commented May 24, 2017

I chose to do this as part of my Microformats consuming code, XRay, rather than at the parser level. XRay first parses the HTML document to extract the node at the matching fragment, then it passes that HTML to the parser.

@dissolve
Collaborator Author

Doesn't this break things like tags? Or do you just include the header?

@aaronpk
Member

aaronpk commented May 24, 2017

Not sure what you mean by "things like tags". Here's what it does: https://github.com/aaronpk/XRay/blob/master/lib/XRay/Formats/HTML.php#L82

Basically, if the URL includes a fragment, it runs `$doc->saveHTML` on the element with that ID and replaces the HTML that it fetched with just that element's HTML.
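
For readers who want to try the same idea outside XRay, here is a rough sketch of the fragment-extraction step in Python using only the standard library. It is not XRay's code: the names `FragmentExtractor` and `extract_fragment` are made up for illustration, it assumes reasonably well-nested markup (an unclosed `<p>` or `<li>` would confuse the depth counter), it drops HTML comments, and falling back to the whole page when the id is not found is an assumption rather than documented XRay behavior.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

# Elements that never take a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class FragmentExtractor(HTMLParser):
    """Collects the markup of the first element whose id matches (hypothetical helper)."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the matched element
        self.parts = []     # source pieces of the matched element
        self.done = False   # True once the matched element has been closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            if dict(attrs).get("id") != self.target_id:
                return
            self.parts.append(self.get_starttag_text())
            if tag in VOID_TAGS:
                self.done = True
            else:
                self.depth = 1
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: copy it, depth is unchanged.
        if self.done:
            return
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())
        elif dict(attrs).get("id") == self.target_id:
            self.parts.append(self.get_starttag_text())
            self.done = True

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def extract_fragment(html, url):
    """Return the markup of the element named by the URL fragment.

    Assumption: with no fragment, or no matching id, the whole page is returned.
    """
    fragment = urldefrag(url).fragment
    if not fragment:
        return html
    parser = FragmentExtractor(fragment)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.done else html
```

The returned string is what would then be handed to the Microformats parser in place of the full document. Note that anything outside the matched element, including the contents of `<head>`, is gone at that point.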

@dissolve
Collaborator Author

dissolve commented May 24, 2017

lol... well then, GitHub processes this as HTML... things like `<base>` tags
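
To make the `<base>` concern concrete, here is a small Python sketch (standard library only) of how a relative URL resolves differently depending on whether the parser still sees the document's `<base>` element. `BaseHrefFinder` and `effective_base` are hypothetical names used only for this illustration, and the URLs are made up; this is not how any particular Microformats parser implements base URL handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseHrefFinder(HTMLParser):
    """Records the href of the first <base> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.href is None:
            self.href = dict(attrs).get("href")


def effective_base(html, page_url):
    """Return the URL that relative links in `html` should be resolved against."""
    finder = BaseHrefFinder()
    finder.feed(html)
    finder.close()
    # A <base href> may itself be relative, so resolve it against the page URL.
    return urljoin(page_url, finder.href) if finder.href else page_url


page_url = "https://example.com/post"
full_page = ('<html><head><base href="https://cdn.example.com/assets/"></head>'
             '<body><div id="c1"><img src="photo.jpg"></div></body></html>')
fragment_only = '<div id="c1"><img src="photo.jpg"></div>'

# With the whole page, photo.jpg resolves against the <base> element.
with_base = urljoin(effective_base(full_page, page_url), "photo.jpg")
# With only the extracted fragment, the <base> is lost and the page URL is used.
without_base = urljoin(effective_base(fragment_only, page_url), "photo.jpg")
```

One possible way around this, under the same assumptions, is to compute the effective base from the full page first and pass it to the parser as the base URL when parsing only the fragment.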
