This could be handled either by altering traverse_text_fragments to get the tag's local name (using etree.QName), or adding a duplicate of each tag to the NEWLINE_TAGS set that has {http://www.w3.org/1999/xhtml} prepended.
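A minimal sketch of both options, assuming lxml is the parser in use. The `NEWLINE_TAGS` set below is a small stand-in rather than the library's real set, and `local_name` and `with_xhtml_namespace` are hypothetical helper names used only for illustration.

```python
from lxml import etree

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative stand-in; the library's real set contains more tags.
NEWLINE_TAGS = frozenset(["br", "li", "p"])


def local_name(tag):
    """Option 1: strip the namespace, e.g. '{ns}p' -> 'p'.

    Comments and processing instructions have a non-string tag,
    so those are passed over by returning None.
    """
    if not isinstance(tag, str):
        return None
    return etree.QName(tag).localname


def with_xhtml_namespace(tags):
    """Option 2: extend a tag set with XHTML-namespaced duplicates."""
    namespaced = frozenset("{%s}%s" % (XHTML_NS, t) for t in tags)
    return frozenset(tags) | namespaced


root = etree.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>one</p><p>two</p></body></html>"
)
p = root[0][0]  # first <p> inside <body>

print(p.tag)                                     # {http://www.w3.org/1999/xhtml}p
print(p.tag in NEWLINE_TAGS)                     # False: the namespaced tag does not match
print(local_name(p.tag) in NEWLINE_TAGS)         # True: option 1
print(p.tag in with_xhtml_namespace(NEWLINE_TAGS))  # True: option 2
```

Option 1 handles any namespace at the cost of a `QName` lookup per element. Option 2 keeps the per-element check a plain set lookup, but covers only the XHTML namespace.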
After the failure of `extract_text` in #24, I tried `etree_to_text`. I got through that without encountering an exception, but `guess_layout` doesn't work: no newlines are added after those tags. I think it's because `element.tag` includes the tag's XML namespace, so it doesn't match the namespaceless `NEWLINE_TAGS` and `DOUBLE_NEWLINE_TAGS`.

Test: