html-to-epub3 should output XHTML files with <!DOCTYPE html> #77

Open
bertfrees opened this issue Mar 6, 2024 · 4 comments

@bertfrees
Member

Issue reported by Tom McCartney:

When running ePubCheck, I've been getting a recurring error in every project: the DocType is reported as invalid and the actual content file as "not well formed". My source content has a full XHTML Transitional DocType, complete with name and DTD reference, and I use the html-to-epub3 conversion to create the actual ePub content. When I run ePubCheck, it's not happy with the "Transitional" DocType. After reading up on DocTypes again (I haven't looked at the specs on that particular item in quite a while), I see that for HTML 5 the valid DocType is simply <!DOCTYPE html>, which looked completely wrong to me without the name and DTD. I see now that it's valid, and I'll try to figure out how to get my XSLT to output an HTML 5 DocType. But it seems odd to me that html-to-epub3 allows the Transitional XHTML, and it looks like the ePub spec simply specifies valid XHTML, while ePubCheck explicitly wants HTML 5. I would argue that one or the other should change so that both give the same answer. That's part of the question I have here.
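As a stopgap until the XSLT is adjusted, the DocType swap described above could be done as a post-processing step on the serialized XHTML. The following is only a minimal sketch under stated assumptions: the function name `set_html5_doctype` is hypothetical, it is not part of Pipeline or ePubCheck, and it assumes the existing declaration has no internal subset.

```python
import re

# Matches a DOCTYPE declaration such as the XHTML 1.0 Transitional one.
# Assumption: the declaration has no internal subset ([...]) and no ">" inside it.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^>]*>", re.IGNORECASE)

# Matches an XML declaration at the very start of the document.
XML_DECL_RE = re.compile(r"^\s*<\?xml[^>]*\?>")


def set_html5_doctype(xhtml: str) -> str:
    """Hypothetical helper: return the document with its DocType set to <!DOCTYPE html>."""
    html5 = "<!DOCTYPE html>"
    if DOCTYPE_RE.search(xhtml):
        # Replace only the first declaration found.
        return DOCTYPE_RE.sub(html5, xhtml, count=1)
    decl = XML_DECL_RE.match(xhtml)
    if decl:
        # No DocType present: insert one right after the XML declaration.
        return xhtml[:decl.end()] + "\n" + html5 + xhtml[decl.end():]
    # Neither DocType nor XML declaration: put the DocType first.
    return html5 + "\n" + xhtml


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>t</title></head><body/></html>"
    )
    print(set_html5_doctype(sample))
```

This works on the text of the file rather than on a parsed tree, so the rest of the markup is left untouched; whether the result passes ePubCheck still depends on the content itself being valid.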

@bertfrees bertfrees changed the title html-to-epub3 should output XHTML files with `<!DOCTYPE html>' html-to-epub3 should output XHTML files with <!DOCTYPE html> Mar 6, 2024
@GrayWolfMT

This may be a Windows-related issue. Looking at the output of both the html-to-epub3 and epub3-to-epub3 jobs used to convert from HTML through to ePub with audio, I am seeing a couple of "Error" messages at the end of the output, but the resulting ePub is successfully produced. The error is "SetDoctype: px:set-doctype failed to read from ..." followed by a full path to a file (I'll include screenshots from each step).

I was able to check and the files actually exist in the location used by the ePub Enhancement script, so the path is correct. I was unable to find the folder referenced by the HTML to ePub script, since the job was removed after processing.

[Screenshot: EpubEnhancementMsg]
[Screenshot: HtmlToEpubMsg]

@bertfrees
Member Author

@GrayWolfMT It looks like it might indeed be a Windows issue. I'm going to test it on a VM. If it isn't too much work for you, could you send me the full log of the conversion in the meantime?

@bertfrees
Member Author

I can't reproduce the "SetDoctype" issue, perhaps because I'm not using the same input file and job options as you are.

But anyway, I don't think the fact that the doctype is not set to <!DOCTYPE html> is specific to Windows after all. It seems this just isn't something that Pipeline does at the moment.

@bertfrees bertfrees self-assigned this Mar 28, 2024
@bertfrees bertfrees added this to the v1.14.18 milestone Mar 28, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Pipeline 2 | April 2024 release Apr 3, 2024
@bertfrees bertfrees reopened this Apr 3, 2024
@bertfrees bertfrees reopened this Apr 12, 2024
@bertfrees
Member Author

I looked too fast. It seems we do something, except not in a place I expected: 7870868.
