Hey,
first off, thanks a bunch for making this project available. It's exactly what I needed for a project of mine.
There doesn't seem to be a lot of development going on, but maybe this helps somebody with problems similar to mine.
The parser seems to have issues when an element contains an attribute without a value, or an attribute with an unquoted value (both of which are valid HTML, AFAIK).
Examples:
Missing attribute value:
from htmldom import htmldom

dom = htmldom.HtmlDom()
dom.createDom("<div><p foo class='bar'>hello world</p><p>bye</p></div>")
dom.find("p.bar")  # returns an empty list
Unquoted attribute value:
dom.createDom("<div><p foo=1 class='bar'>hello world</p><p>bye</p></div>")
dom.find("p.bar") # returns an empty list
For my use-case I am currently working around this by retrieving the HTML source using requests, string-replacing the known offending attribute with an empty string and then feeding the result into createDom().
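In case it helps anyone, here is a rough sketch of that kind of pre-processing step. It uses a regular expression rather than a plain string replace, so it also covers the unquoted-value case. The helper name `strip_attribute` is my own invention and not part of htmldom, and the approach is deliberately naive: it does not parse the HTML, so it would also remove a matching word from ordinary text content.

```python
import re


def strip_attribute(html: str, name: str) -> str:
    """Remove attribute `name` from the markup, whether it is bare (foo),
    unquoted (foo=1) or quoted (foo="1" / foo='1').

    Hypothetical helper, not part of htmldom. It works on the raw string,
    so a matching word in text content would be removed as well.
    """
    pattern = re.compile(
        r"\s+" + re.escape(name)
        # optional value: double-quoted, single-quoted, or unquoted
        + r"(?:\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+))?"
        # the attribute must end at whitespace, '/' or '>'
        + r"(?=[\s/>])"
    )
    return pattern.sub("", html)


bare = "<div><p foo class='bar'>hello world</p><p>bye</p></div>"
unquoted = "<div><p foo=1 class='bar'>hello world</p><p>bye</p></div>"

print(strip_attribute(bare, "foo"))
print(strip_attribute(unquoted, "foo"))
# both print: <div><p class='bar'>hello world</p><p>bye</p></div>

# The cleaned string can then be passed to createDom() as in the
# examples above, e.g.:
#   dom.createDom(strip_attribute(html_source, "foo"))
```

This only works when the offending attribute names are known in advance, which happens to be true for my use case.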