Capturing unneeded elements #42
Comments
Basically none of them worked for me. Here is the link: https://www.scribblehub.com/read/7681-the-mage-emperor/chapter/7683/ and this is the epub.
OK, it seems the bug only exists in the Firefox extension. Chrome produced the epub the way it's supposed to.
A way to actually select the HTML element to capture would be nice. It would be especially useful when a single HTML element spans multiple web pages, in which case it's not possible to select all the text at once. Go to 24symbols.com and try a free book, for example. The save page option captures a chapter almost perfectly, save for an unwanted footer at the end of each chapter (which is still great, because the chapter is actually inside an iframe and the footers can easily be removed afterwards). But the save selection method fails spectacularly in this case (the chapter spans multiple pages even though the entire chapter gets loaded on each page). On a side note, I'd be really grateful if you could answer this question: how is 24symbols preventing us from accessing the page source of the books' web pages? (What it gives is a completely different page source.) OK, here is a web page I saved from 24symbols (with the SingleFile plugin). The book was just something I used as a guinea pig; I still have no idea what it's about!
So far this plugin usually captures some unneeded elements for me. For example, it works terribly with scribblehub. If there is even one comment in the comment section, the main content is completely ignored and only the comments are captured. When that happens, the epub starts with the string "Error: Parse Error:".
Even when the main content is captured, some unneeded elements are captured both before and after the necessary content. It would be nice if we could specify which elements are captured and which are not, preferably by using XPath expressions for the needed elements.
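To make the request concrete, here is a rough sketch of what I mean, written in Python rather than as extension code. Everything in it is an assumption for illustration: the function name `select_content`, the sample markup, and the element ids and classes are made up and are not what scribblehub or the plugin actually use. It also relies on the standard library's limited XPath subset and on well-formed markup; a real page would need a proper HTML parser.

```python
import xml.etree.ElementTree as ET


def select_content(page, keep_xpath, drop_xpaths=()):
    """Return the text of each element matching keep_xpath,
    after removing descendants that match any of drop_xpaths."""
    root = ET.fromstring(page)
    results = []
    for kept in root.findall(keep_xpath):
        # ElementTree has no parent pointers, so build a child -> parent map.
        parents = {child: parent for parent in kept.iter() for child in parent}
        for xpath in drop_xpaths:
            for unwanted in kept.findall(xpath):
                parents[unwanted].remove(unwanted)
        # Collapse runs of whitespace in the remaining text.
        results.append(" ".join("".join(kept.itertext()).split()))
    return results


# Hypothetical page layout: navigation, chapter with a footer, comments.
PAGE = """
<html><body>
  <div id="nav">Previous | Next</div>
  <div id="chapter">
    <p>Chapter text.</p>
    <div class="footer">Site footer</div>
  </div>
  <div id="comments"><p>First comment</p></div>
</body></html>
"""

print(select_content(PAGE, ".//div[@id='chapter']", [".//div[@class='footer']"]))
```

With one "keep" expression and a list of "drop" expressions, the navigation and comments are never captured, and the footer inside the chapter is stripped before the text is collected.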