Missing DOCTYPE support #16

Open
msva opened this issue Dec 29, 2019 · 0 comments
msva commented Dec 29, 2019

Hi there!
I've run into an issue: slaxdom fails to build a DOM for a document that contains a <!DOCTYPE> declaration.
The W3C XML spec allows a document type declaration in the prolog: https://www.w3.org/TR/xml/#NT-doctypedecl

Minimal test-case:

sd=require"slaxdom"
z=sd:dom("<!DOCTYPE><a></a>")

The error would be:

/usr/share/lua/5.1/slaxdom.lua:34: Document has non-whitespace text at root: '<!DOCTYPE>'
Stack traceback:
  At =[C]:-1 (in global error)
  At @/usr/share/lua/5.1/slaxdom.lua:34 (in field text)
    0031:               end,
    0032:               text = function(value,cdata)
    0033:                       -- documents may only have text node children that are whitespace: https://www.w3.org/TR/xml/#NT-Misc
    0034:                       if current.type=='document' and not value:find('^%s+$') then error(("Document has non-whitespace text at root: '%s'"):format(value:gsub('[\r\n\t]',{['\r']='\\r', ['\n']='\\n', ['\t']='\\t'}))) end
    0035:                       push(current.kids,{type='text',name='#text',cdata=cdata and true or nil,value=value,parent=rich and current or nil})
    0036:               end,
    0037:               comment = function(value)
  At @/usr/share/lua/5.1/slaxml.lua:87 (in upvalue finishText)
    0084:                               text = gsub(text,'%s+$','')
    0085:                               if #text==0 then text=nil end
    0086:                       end
    0087:                       if text then self._call.text(unescape(text),false) end
    0088:               end
    0089:       end
    0090:
  At @/usr/share/lua/5.1/slaxml.lua:125 (in local startElement)
    0122:               if first then
    0123:                       currentElement[2] = nil -- reset the nsURI, since this table is re-used
    0124:                       currentElement[3] = nil -- reset the nsPrefix, since this table is re-used
    0125:                       finishText()
    0126:                       pos = last+1
    0127:                       first,last,match2 = find(xml, '^:([%a_][%w_.-]*)', pos )
    0128:                       if first then
  At @/usr/share/lua/5.1/slaxml.lua:239 (in method parse)
    0236:       while pos<#xml do
    0237:               if state=="text" then
    0238:                       if not (findPI() or findComment() or findCDATA() or findElementClose()) then
    0239:                               if startElement() then
    0240:                                       state = "attributes"
    0241:                               else
    0242:                                       first, last = find( xml, '^[^<]+', pos )
  At @/usr/share/lua/5.1/slaxdom.lua:44 (in method dom)
    0041:                       push(current.kids,{type='pi',name=name,value=value,parent=rich and current or nil})
    0042:               end
    0043:       }
    0044:       builder:parse(xml,opts)
    0045:       return doc
    0046: end
    0047:
  At stdin#22:1 (in  ?)
    0001: z=sd:dom("<!DOCTYPE><a></a>")

For now, I'm working around this by converting the doctype to a comment and "uncommenting" it again right before serialization, but it would be nice if it worked out of the box :)
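
For anyone hitting the same error, the workaround above can be sketched roughly as follows. This is only a minimal illustration: `hide_doctype`, `restore_doctype`, and the `slaxdom_doctype:` marker are hypothetical names, not part of SLAXML, and the sketch assumes a doctype with no internal subset (`[ ... ]`), no `>` inside quoted identifiers, and no `--` in its body (which would be illegal inside an XML comment).

```lua
-- Hypothetical helpers (not part of SLAXML): wrap the DOCTYPE in a comment
-- before parsing, and turn that comment back into a DOCTYPE afterwards.

-- Arbitrary marker, assumed not to appear in any real comment of the document.
-- It deliberately contains no Lua pattern magic characters.
local MARK = "slaxdom_doctype:"

-- Replace the first "<!DOCTYPE ...>" with "<!--slaxdom_doctype: ...-->".
-- A doctype with an internal subset ("[") is not matched and is left untouched.
local function hide_doctype(xml)
  return (xml:gsub("<!DOCTYPE([^>%[]*)>", function(body)
    return "<!--" .. MARK .. body .. "-->"
  end, 1))
end

-- Reverse the substitution on the serialized output.
local function restore_doctype(xml)
  return (xml:gsub("<!%-%-" .. MARK .. "(.-)%-%->", function(body)
    return "<!DOCTYPE" .. body .. ">"
  end, 1))
end

-- Round trip on the test case from this issue, without involving the parser.
local hidden = hide_doctype("<!DOCTYPE><a></a>")
print(hidden)
print(restore_doctype(hidden))
```

In practice the hidden string would be passed to `sd:dom(...)` in place of the original, and `restore_doctype` applied to whatever the serializer produces, provided the serializer writes comments back out unchanged.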
