Torrent feature #35

Open
evrial opened this issue Jan 20, 2024 · 11 comments

Comments

@evrial

evrial commented Jan 20, 2024

A torrent feature would be dope, if only to avoid hammering the single server and to be more responsible.
I think placing .torrent files in the target location would be enough to keep this simple.

@DocDrydenn
Collaborator

Agreed.

We did discuss this before and should probably think about it again since a lot of these ZIMs are pretty big in size... torrents are probably a better way to download these.

@jojo2357
Most, if not all, torrent clients utilize a "watch" folder of some kind. Providing the script that path to download the '.torrent' file to would work to initiate the download. One downside to this would be that it probably wouldn't be feasible for this script to monitor the download of the actual torrent.

Another issue would be dealing with the completed download of the torrent. The moving of the downloaded ZIM would probably have to be put on the end user to deal with... i.e. most torrent clients allow scripts to be run when a download completes.
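To make the "run a script when the download completes" idea concrete, here is a minimal sketch of such a hook. It is not part of kzu: the function name `move_completed_zim` and the `ZIM_DIR` variable are made up for illustration. The `TR_TORRENT_DIR` and `TR_TORRENT_NAME` variables are what Transmission exports to its "done" script; other clients hand over the path differently, so that last part is an assumption to adapt.

```shell
#!/usr/bin/env bash
# Hypothetical post-download hook: move a finished ZIM into the library dir.
# Usage: move_completed_zim <downloaded-file> <zim-dir>
move_completed_zim() {
    local src="$1" zim_dir="$2"

    # Only act on ZIM files; ignore anything else the client hands over.
    case "$src" in
        *.zim) ;;
        *) echo "skipping non-ZIM file: $src" >&2; return 1 ;;
    esac

    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }

    # Create the target dir if needed, then move the file into it.
    mkdir -p "$zim_dir" && mv -- "$src" "$zim_dir/"
}

# Transmission-specific entry point (assumption: the hook is registered as
# Transmission's "script-torrent-done" script). ZIM_DIR is a hypothetical
# setting with an arbitrary default.
if [ -n "${TR_TORRENT_DIR:-}" ] && [ -n "${TR_TORRENT_NAME:-}" ]; then
    move_completed_zim "$TR_TORRENT_DIR/$TR_TORRENT_NAME" "${ZIM_DIR:-$HOME/zims}"
fi
```

With something like this in place, the updater script itself would not need to monitor the torrent; the client takes over once the '.torrent' file lands in its watch folder.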

@jojo2357
Owner

It would be quite easy to just place .torrent files in the zim dir, but then you lose a lot of the functionality of the script itself.

I just don't know how to do torrents automatically. There are so many clients, so the only thing that I could do would be to add the .torrent to a watch folder. But then how does the script run from there? It needs to know when the dl is complete in order to purge and/or calculate the checksums.

Perhaps a torrent option (as above) would work well in tandem with the min and max size options? So you can dl the small ones via https and then rerun and get the larger ones via torrent?

If you can demonstrate how to download a file via torrent from the command line that is mostly client-agnostic, then I would like to proceed that route. Otherwise it will be a semi-automated inserting of torrent files into a designated folder.

@evrial
Author

evrial commented Jan 20, 2024

I guess people who use torrents are smart enough, so just download the torrent files and exit.

@metametapod

metametapod commented May 28, 2024

@jojo2357
Owner

Not sure what you mean. The wiki data is 3 years out of date https://en.wikipedia-on-ipfs.org/wiki/#distribution-footer

@Jaifroid
Collaborator

Most torrent downloaders have SHA-based verification built in, so I think it would be enough to download the torrent file, as an option, say, for archives larger than X GB (2 GB?). The mirrorbrain software makes it very easy to get a torrent file (or a magnet link, but the former is most useful).

Probably easiest is that if the script is in "torrent" mode, it would never delete the original ZIM file, and would only download the latest torrent file for the update.

However, there's quite a lot of extra logic required to do this, so it might not be straightforward, and could end up getting messy.
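As a rough illustration of the size cutoff, the decision itself could be as small as the sketch below. The function name `choose_method` is hypothetical, and the 2 GB default only mirrors the figure floated above; it is not an existing kzu option. Wiring the result into the other flags is the part that would take the extra logic.

```shell
# Pick a transfer method from a file size in bytes and a threshold in GB.
# Hypothetical helper; the default threshold of 2 GB is an assumption.
choose_method() {
    local size_bytes="$1" threshold_gb="${2:-2}"
    local threshold_bytes=$(( threshold_gb * 1024 * 1024 * 1024 ))

    # Strictly larger than the threshold goes via torrent, the rest via https.
    if [ "$size_bytes" -gt "$threshold_bytes" ]; then
        echo torrent
    else
        echo https
    fi
}
```

For example, `choose_method 3000000000` would select the torrent path, while a file of a few hundred megabytes would stay on https.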

@evrial
Author

evrial commented May 29, 2024

> However, there's quite a lot of extra logic required to do this, so it might not be straightforward, and could end up getting messy.

Why? You go to https://download.kiwix.org/zim/wikipedia/, select any URL, add .torrent to the URL, and you're done.
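In script form, that step might look like the sketch below. Both function names are hypothetical and not taken from kzu; the only assumption about the server is the one stated above, that appending `.torrent` to a ZIM URL yields its torrent file. `fetch_torrent` uses wget's `--directory-prefix` option to drop the file into whatever folder the torrent client watches.

```shell
# Derive the .torrent URL for a ZIM URL by appending ".torrent".
torrent_url_for() {
    case "$1" in
        *.zim) printf '%s.torrent\n' "$1" ;;
        *) echo "not a ZIM URL: $1" >&2; return 1 ;;
    esac
}

# Download the .torrent file into a watch folder (hypothetical helper).
# Usage: fetch_torrent <zim-url> <watch-dir>
fetch_torrent() {
    local url
    url=$(torrent_url_for "$1") || return 1
    wget --directory-prefix="$2" "$url"
}
```

Calling `fetch_torrent <zim-url> <watch-dir>` would then be the whole "semi-automatic" mode: the script exits and the torrent client does the rest.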

@metametapod

> Not sure what you mean. The wiki data is 3 years out of date https://en.wikipedia-on-ipfs.org/wiki/#distribution-footer

Never mind, sorry about that. It doesn't look like there's a recent dataset.

This isn't client-agnostic but it may be more straightforward to implement support for aria2c, which allows downloading from both protocols at once:

aria2c "https://download.kiwix.org/zim/wikipedia/Wikipedia_en_computer_maxi_2024-05.zim.torrent" "https://download.kiwix.org/zim/wikipedia/Wikipedia_en_computer_maxi_2024-05.zim" --follow-torrent=true

@jojo2357
Owner

> Why? You go to https://download.kiwix.org/zim/wikipedia/ select any url and add .torrent to url and done.

I think what was implied is the logic to make that fit into the existing script. You are right to say that getting the torrent is very easy, but as previously mentioned, the idea is that the torrenting would happen in the user's preferred software and not thru kzu.

Some features afforded by using a torrent just do not make sense for a CLI app, like pausing and resuming a download on-demand, or

@Jaifroid is saying to add a new flag, which would need to interact with the -x flag, and then also need to integrate with all of the other options since we are using a totally different pathway.

I would be interested in feedback on using aria2c as @metametapod mentioned. I am not in any rush to stop using wget as it is a tried and true utility, but feedback is always appreciated.

@evrial
Author

evrial commented May 30, 2024

A semi-automatic flag feature would be sufficient. Simply download the .torrent instead of the .zim; why look for a more complex solution?

@Jaifroid
Collaborator

@jojo2357 Thanks for the clarification, and that's indeed what I meant. @evrial that's also what I meant -- simply download .torrent instead of ZIM if a flag is set (and probably for ZIMs larger than a specified size).


5 participants