
#1454 xenforo batch post dl #1524

Merged

Conversation

Kiradien
Collaborator

New Manual parser for xenforo forums; downloads all posts across multiple pages - resolves ticket #1454

Created Xenforo Batch Post Parser.

New Manual parser for xenforo forums; downloads all posts across multiple pages

Update XenforoBatchParser.js

Removed excess comments
@Kiradien Kiradien force-pushed the #1454-Xenforo_Batch_Post_DL branch from de4bf40 to 492f6d6 Compare September 28, 2024 15:07
@Kiradien Kiradien merged commit ad728f3 into dteviot:ExperimentalTabMode Sep 28, 2024
1 check passed
@dteviot
Owner

dteviot commented Sep 28, 2024

@Kiradien

You don't need to do this:

        let chapters = [...Array(pageCount).keys()].map(index => { 
            var link = dom.createElement("a");
            link.setAttribute("href", `${baseURI}${pagingUriComponent}${index}`); //ChangeMe!
            link.appendChild(dom.createTextNode(`Page ${index + 1}`));
            return link;
        });
        return chapters.map(a => util.hyperLinkToChapter(a));

Something like this should work

        return [...Array(pageCount).keys()].map(index => ({ 
            sourceUrl: `${baseURI}${pagingUriComponent}${index}`,
            title: `Page ${index + 1}`
        }));

Also, should the //ChangeMe be there?

Have a look at splitContentIntoEpubItems(). This is used to convert single web page from the source into multiple chapters in the Epub.

@Kiradien
Collaborator Author

Kiradien commented Sep 29, 2024

Also, should the //ChangeMe be there?

No, no it really shouldn't. That was supposed to be a reminder for me to go back to that section and basically dig around to figure out exactly what you suggested as a cleaner solution, and I completely forgot about it. I'll make a new pull request and leave it open for suggestions/improvements for a day or so. Thank you for the much cleaner code.

As for splitContentIntoEpubItems() - I'll look into it once I've got a bit more willpower to revisit this - though admittedly, the current version may need to stand as the superior option in the long term. One of the threads I used to test this (and one of the main reasons I wanted to get this working) had over 700 pages of posts at 25 posts per page - well over 10,000 'chapters' when converted that way... As you can imagine, with that many very short 'chapters', things got somewhat weird.

@gamebeaker
Collaborator

@dteviot Can't the 10000 limit be changed to 99999?
I think the limit only has something to do with the naming scheme.

@dteviot
Owner

dteviot commented Sep 29, 2024

@gamebeaker
99999 is definitely too big. An epub is a zip file, and the (old) zip spec is maximum of 64k files in a zip. I think I've seen something in jszip that sets maximum to 16k. Also, have you ever tried to use the ToC for an epub with 10k chapters?
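For context, the 64k ceiling comes from the classic (non-Zip64) zip format, whose end-of-central-directory record stores the entry count in a 16-bit field, capping an archive at 65,535 entries. A defensive cap can be sketched as follows (the names `ZIP_MAX_ENTRIES`, `NON_CHAPTER_OVERHEAD`, and `clampChapterCount` are illustrative, not the project's actual constants):

```javascript
// Sketch of why a chapter cap well below 64k is sensible for an epub.
// An epub is a zip; the classic zip spec's entry count is a 16-bit field.
const ZIP_MAX_ENTRIES = 0xFFFF;   // 65535, hard limit without Zip64
const NON_CHAPTER_OVERHEAD = 50;  // assumed allowance for OPF, NCX, CSS, images

// Clamp a requested chapter count to both a practical cap (default 10,000)
// and the hard zip-format limit minus non-chapter files.
function clampChapterCount(requested, cap = 10000) {
    const hardLimit = ZIP_MAX_ENTRIES - NON_CHAPTER_OVERHEAD;
    return Math.min(requested, cap, hardLimit);
}
```

So a 700-page thread at 25 posts per page (17,500 potential 'chapters') would still be clamped to the practical cap, and even an absurd request never exceeds the format's hard limit.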

@gamebeaker
Collaborator

gamebeaker commented Sep 29, 2024

@dteviot No, I haven't tried; I thought that 10k was completely arbitrary.
I think the largest book I created had ~4k chapters.

@Kiradien
Collaborator Author

Before playing around with this, I thought 10k was arbitrary as well; issues cropped up with actually rendering the epub - it took over a minute to open the file - but the real nightmare was debugging. I'm pretty sure something went wrong in the javascript arrays somewhere along the line, as the chapters came out in the wrong order with some repeats...

I admit it might've been the code, but the outcome was pretty random/screwy when the exact same code appeared to work fine for fewer pages. Normally debugging isn't an issue, but it's a bit of a nightmare to debug an object containing >10k chapters. I'll just say that erasing everything and reverting to this version was extremely satisfying.
