Please add https://wizards.com sites for MTG stories #1300
Comments
I'm not quite sure what you're asking for.
Apologies, my explanation was rather confusing. "Edit chapter URLs" is the view I use to collect chapter lists to convert to EPUB.

Questions: could we have a plain box for pasting chapter URLs, and the titles read according to the filter template into editable fields in the chapter list? Many thanks for any advice or help!
I have started implementation here: https://github.com/Darthagnon/web2epub-tidy-script/blob/master/MagicWizardsParser.js
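Broadly it would follow the shape of the existing parsers; a minimal sketch of that skeleton, assuming the usual WebToEpub conventions (the hostname and CSS selectors below are placeholders, not the real wizards.com markup):

```js
"use strict";

// Hypothetical skeleton, modelled on existing WebToEpub parsers.
// "example.wizards.com" and all selectors below are placeholder guesses.
parserFactory.register("example.wizards.com", () => new MagicWizardsParser());

class MagicWizardsParser extends Parser {
    constructor() {
        super();
    }

    // Collect the chapter links from the Table of Contents page.
    async getChapterUrls(dom) {
        let menu = dom.querySelector(".article-list");   // assumed selector
        return util.hyperlinksToChapterList(menu);
    }

    // Return the element holding the story text for a single chapter.
    findContent(dom) {
        return dom.querySelector("article");             // assumed selector
    }
}
```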
Yes: https://github.com/dteviot/WebToEpub/blob/ExperimentalTabMode/plugin/js/parsers/NoblemtlParser.js, although it's not obvious how it works. An example is:

```js
findCoverImageUrl(dom) {
    return util.getFirstImgSrc(dom, ".thumbook, .sertothumb");
}
```

This looks for an image using two CSS selectors, ".thumbook" and ".sertothumb", and picks the first it finds. As the two sites have different layouts, only one will succeed.

An alternative way to handle multiple sites: the "dom" parameter holds the URL of the page in dom.baseURI. You could extract the hostname from that URL and switch the logic based on it, as sketched below.

That said, WebToEpub is supposed to check the URL of each page and then select the appropriate parser, even if the Table of Contents is a mixture of sites. So you might not need a combined parser; just write one for each site.
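A minimal sketch of that hostname switch; both selectors here are assumptions for illustration, not the real wizards.com or archive.org markup:

```js
// Hypothetical sketch: choose a CSS selector from the page's hostname.
// Both selectors are placeholders, not the real site markup.
findCoverImageUrl(dom) {
    let hostname = new URL(dom.baseURI).hostname;
    let selector = (hostname === "web.archive.org")
        ? ".archived-header img"     // assumed: Internet Archive copies
        : ".article-header img";     // assumed: the live wizards.com site
    return util.getFirstImgSrc(dom, selector);
}
```

Either branch still goes through util.getFirstImgSrc, so the rest of the parser is unaffected by which site the page came from.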
Fixed in #1500
I'm starting work on an archival project to convert Magic: the Gathering web fiction to EPUB (here and on my HDD), as it is slowly disappearing from the website amid slapdash updates. WebToEpub is the best tool for the job, and I have been using it successfully with the default parser and the following:
The story is spread across four different URL/article structures on the website, half of which survive only on the Internet Archive. Different chapters can exist under different structures, and the TOCs (if they exist) are not comprehensive.
My workflow currently involves gathering the Archive.org links for as many chapters as possible under one site structure, since mixing and matching Includes and Excludes doesn't seem to work very well (or maybe I should just use more commas), testing, then editing the chapter list and pasting in the links to what I actually want to download, e.g.
I am starting work on the parser, but was wondering: is there a way for it to target different sites, ignore the TOC, and request only a manual list? My workflow would be improved if there were just a box for URLs and it could extract the titles from that, rather than having to write an HTML chapter list with

```html
<a href="">Title here</a>
```

Is there a way to force this with a new parser?
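For illustration, this is the step I'd like the extension to absorb, as a rough sketch. buildChapterList is a hypothetical helper, and cross-origin fetches would likely need to run from the extension's own context:

```js
// Hypothetical sketch: turn a plain list of URLs into the
// <a href="...">Title</a> chapter list that "Edit chapter URLs" expects.
async function buildChapterList(urls) {
    let links = [];
    for (let url of urls) {
        let response = await fetch(url);
        let html = await response.text();
        let doc = new DOMParser().parseFromString(html, "text/html");
        // Fall back to the raw URL when the page has no usable <title>.
        let title = doc.querySelector("title")?.textContent.trim() || url;
        links.push(`<a href="${url}">${title}</a>`);
    }
    return links.join("\n");
}
```

A box in the UI doing exactly this, with the titles left editable afterwards, is the feature being asked about above.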