Add parser for mtgstory.com #1500
For some reason, .gitignore was set to ignore additions to the parsers folder, so I have commented out that line. |
@Darthagnon Can you fix the ESLint errors? (Just push to your branch; I think the merge request should update automatically.) |
Replacing findCoverImageUrl(dom) {
// Try to find an image inside the '.swiper-slide' or inside an 'article'
let imgElement = dom.querySelector(".swiper-slide img, article img");
// If an image is found, return its 'src' attribute
if (imgElement) {
return imgElement.getAttribute("src");
// Check if the URL starts with '//' (protocol-relative URL)
if (imgSrc && imgSrc.startsWith("//")) {
// Add 'https:' to the start of the URL
imgSrc = "https:" + imgSrc;
}
}
// Fallback if no image was found
return null;
} with findCoverImageUrl(dom) {
return util.getFirstImgSrc(dom, ".swiper-slide img, article img");
} will fix your problems. |
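For context, the replaced code returns the raw `src` before it ever reaches the protocol-relative check, and that check refers to an `imgSrc` variable that is never declared. If the `https:` prefixing is still wanted, a minimal sketch of it as a standalone helper might look like the following. `normalizeImgSrc` is a hypothetical name, not part of WebToEpub, and nothing here says how `util.getFirstImgSrc` itself treats such URLs.

```javascript
// Hypothetical helper (not a WebToEpub API): turn a protocol-relative
// image URL such as "//example.com/a.png" into an absolute https URL.
function normalizeImgSrc(src) {
    // Treat a missing or empty src as "no image found"
    if (!src) {
        return null;
    }
    // Protocol-relative URLs start with "//"; prefix them with "https:"
    return src.startsWith("//") ? "https:" + src : src;
}
```

Any other `src` value is returned unchanged.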
You can just remove that line. |
Commented out line 5 //parserFactory.register("mtglore.com", () => new MagicWizardsParser()); should be removed. This if (authorPattern.test(href)) {
return true;
} else {
return false;
} should be return authorPattern.test(href)); I'm not convinced that if (window.location.hostname.includes("web.archive.org")) does what you think it does. |
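One possible reading of the doubt about `window.location.hostname` (an assumption, since the reviewer does not spell it out): inside a browser extension, `window.location` may describe the extension's own page rather than the page being parsed. A sketch of an alternative is to test the URL of the page being parsed directly. `isWebArchiveUrl` is a hypothetical helper name, and how the parser obtains that URL is left open here.

```javascript
// Hypothetical helper: decide whether a given page URL is served by
// web.archive.org, without consulting window.location.
function isWebArchiveUrl(url) {
    try {
        // The URL constructor throws on strings it cannot parse
        return new URL(url).hostname === "web.archive.org";
    } catch (e) {
        return false;
    }
}
```

Comparing the parsed hostname exactly, rather than using `includes`, avoids matching URLs that merely contain the text "web.archive.org" somewhere in their path or query.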
@Darthagnon Maybe you can change .gitignore to ignore the new files if someone does |
This change breaks the parser, results in it being unable to pick up any chapters.
I'm not too sure what to do about the spacing/lint errors in Some test pages:
I believe the archive.org logic may be needed to account for slight variations in the article selectors over time, but I will keep testing. |
Hmmm... definitely WIP, I need to do some more work on it. |
@Darthagnon The spacing error message comes from |
Improves compatibility with 2016 version of site
Also add TODO to JS. 2024 site and pre-2018 site work and are priority, as they cover all modern stories and older lost chapters. (Ancient MTG articles from pre-2014 not accounted for yet)
Ongoing improvements mean the script now deals quite well with both the 2023-2024 version and 2014-2018 version of the website (v0.72, chapter titles now generalised and correctly selected). |
return authorPattern.test(href)); D'oh! Copy/paste mistake on my part. Should only be one closing bracket. i.e. return authorPattern.test(href); |
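As a self-contained illustration of the corrected one-liner: `RegExp.prototype.test` already returns a boolean, so the `if`/`else` wrapper adds nothing. The pattern and the function name `isAuthorLink` below are hypothetical stand-ins, because the thread does not show the parser's real `authorPattern`.

```javascript
// Hypothetical pattern for illustration only; the real authorPattern
// in MagicWizardsParser.js is not shown in this thread.
const authorPattern = /\/author\//;

function isAuthorLink(href) {
    // test() returns true or false, so it can be returned directly
    return authorPattern.test(href);
}
```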
@Gamebreaker No idea where Everything search for |
packed.js is created when the build runs and creates the WebToEpub extension. As you're not running the build, you won't see this file on your machine. I think the lines with the indentation problem are these: WebToEpub/plugin/js/parsers/MagicWizardsParser.js Lines 60 to 62 in fd8c87f
Should be titleElement = link.closest("article")?.querySelector(selector) ||
link.closest(".article-item")?.querySelector(selector) ||
link.closest(".details")?.querySelector(selector); The line following a line ending with a || should be indented 4 more spaces. |
@Darthagnon |
I'm wrong, @gamebeaker is correct. In my defense, it was hard to see the highlighted rows in his screenshot. WebToEpub/plugin/js/parsers/MagicWizardsParser.js Lines 27 to 33 in fd8c87f
@Darthagnon Here is how you can run lint for the first time; you need npm. (Video: 2024-09-22.22-37-23.mp4) |
WIP. Add parser for mtgstory.com (redirects to https://magic.wizards.com/en/story). Seems to work on most versions of the website (e.g. the current live version and the archive.org version from 2-3 years ago; untested on archive.org versions from 10 years ago). Still missing fallback support for mtglore.com.
Based on MagicWizardsParser.js v0.6 from https://github.com/Darthagnon/web2epub-tidy-script