Capturing unneeded elements #42

Anangaya · 2020-11-09T20:28:23Z

So far, this plugin usually captures some unneeded elements for me. For example, it works terribly with Scribble Hub: if there is even one comment in the comment section, the main content is completely ignored and only the comments are captured. When that happens, the EPUB starts with the string "Error: Parse Error:".

Even when the main content is captured, some unneeded elements are captured both before and after it. It would be nice if we could specify which elements are captured and which are not, preferably with an XPath expression for the needed elements.
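To make the request concrete, here is a rough sketch of include/exclude selection by XPath. It is not how the plugin works: it uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and needs well-formed markup, whereas a browser extension would more likely use the DOM's `document.evaluate`. The page markup, the `capture` helper, and the ids and class names (`chapter`, `comments`, `footer`) are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page. The ids and class names are invented
# for this example and do not match any real site.
PAGE = """
<html>
  <body>
    <div id="header">Site navigation</div>
    <div id="chapter">
      <p>First paragraph of the chapter.</p>
      <p>Second paragraph of the chapter.</p>
      <div class="footer">Unwanted footer</div>
    </div>
    <div id="comments">
      <p>A reader comment.</p>
    </div>
  </body>
</html>
"""


def capture(page, include_xpath, exclude_xpath=None):
    """Return the text of every element matching include_xpath,
    skipping any descendants that match exclude_xpath."""
    root = ET.fromstring(page.strip())
    captured = []
    for element in root.findall(include_xpath):
        if exclude_xpath:
            excluded = set(element.findall(exclude_xpath))
            # ElementTree has no parent pointers, so walk every node
            # and detach the children that were marked as excluded.
            for parent in list(element.iter()):
                for child in list(parent):
                    if child in excluded:
                        parent.remove(child)
        # Collapse runs of whitespace left over from the markup layout.
        captured.append(" ".join("".join(element.itertext()).split()))
    return captured


if __name__ == "__main__":
    # Keep only the chapter, and drop the footer inside it.
    print(capture(PAGE, ".//div[@id='chapter']", ".//div[@class='footer']"))
```

With an include expression like this, the comment section is never touched, and the exclude expression removes the footer inside the chapter.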

alexadam · 2020-11-12T15:29:13Z

I can't reproduce it. Please send the link that's causing problems.

Anangaya · 2020-11-13T02:09:04Z

Basically none of them worked for me. Here is a link:

https://www.scribblehub.com/read/7681-the-mage-emperor/chapter/7683/

This is the EPUB:

The Mage Emperor - Chapter 1 – My sexy childhood friend returned. And… Moved into our house! Scribble Hub.zip

Anangaya · 2020-11-13T02:15:57Z

OK, it seems the bug only exists in the Firefox extension. Chrome produced the EPUB like it's supposed to.

Anangaya · 2020-11-23T20:01:46Z

A way to actually select the HTML element to capture would be nice. It would be great in cases where a single HTML element spans multiple webpages, so that it's not possible to select all the text at once. Go to 24symbols.com and try a free book, for example. The save page option can capture a chapter almost perfectly, save for an unwanted footer at the end of each chapter (which is still great, because the chapter is actually inside an iframe and the footers can be removed easily afterwards). But the save selection method fails spectacularly in this case (the chapter spans multiple pages, even though the entire chapter gets loaded in each page).

On a side note, I'd be really grateful if you could answer this question: how is 24symbols preventing us from accessing the page source of the books' webpages? (What it gives is a completely different page source.)

OK, here is a webpage I saved from 24symbols (with the SingleFile plugin):

Aftershock - A Stone Braide Chronicles Story by Bonnie S. Calhoun - Read book online (11_24_2020 12_25_01 PM).zip

The book was just something I used as a guinea pig; I still don't have any idea what it's about!
The page source can be viewed from this file, which is not the case when I try it directly on the site.
The entire chapter is there in the page source, but only part of it is visible on the webpage, so it's impossible to select it all with the Save Selection option.

alexadam self-assigned this Nov 12, 2020

alexadam added bug enhancement labels Nov 12, 2020
