zimdump fails on long URLs #213
I propose to:

Care must be taken when truncating directories: two long directory names may truncate to the same "short name", so the first must become "long_directory~1" and the second must get a different one ("long_directory~2").
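A minimal sketch of how such collision-aware truncation could work. This is only an illustration of the scheme described above, not the zimdump implementation; the 255-byte limit and the helper name are assumptions:

```python
NAME_MAX = 255  # typical per-component limit on Linux filesystems (assumption)

def shorten(name: str, used: set, limit: int = NAME_MAX) -> str:
    """Truncate a single path component and disambiguate collisions with ~N."""
    if len(name.encode("utf-8")) <= limit and name not in used:
        used.add(name)
        return name
    counter = 1
    while True:
        suffix = f"~{counter}"
        # Leave room for the suffix within the byte limit.
        short = name.encode("utf-8")[: limit - len(suffix)].decode("utf-8", "ignore") + suffix
        if short not in used:
            used.add(short)
            return short
        counter += 1

used = set()
print(shorten("long_directory" * 30, used)[-3:])  # ...~1
print(shorten("long_directory" * 30, used)[-3:])  # ...~2, same prefix but distinct name
```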
Is there an example zim file available that has this problem?
Please go ahead with this one, which is 3 GB: https://www.transfernow.net/dl/20231009UhHnE3Sy
We should probably also consider, at the same time, ignoring or replacing all characters that are not allowed on (or are interpreted differently by) the target filesystem; this is causing many files not to be dumped.
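For illustration, one possible sanitization pass. The character set below is the Windows-reserved one; which characters actually need replacing depends on the target filesystem, and the function name is made up for this sketch:

```python
import re

# Characters reserved on Windows (plus control characters); '/' is reserved everywhere.
_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def sanitize_component(name: str, replacement: str = "_") -> str:
    """Replace characters that the target filesystem cannot store in a file name."""
    return _RESERVED.sub(replacement, name)

print(sanitize_component('page?id=1&title="x"'))  # page_id=1&title=_x_
```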
Building a very small ZIM with many "strange" ZIM paths is probably the way to go; it is quite easy to do with python-libzim or python-scraperlib, and it would make testing the change on many filesystems much easier.
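As a rough sketch of what such a test fixture could look like with the python-libzim writer API (the exact API may differ between versions; the paths and content below are invented for illustration):

```python
from libzim.writer import Creator, Hint, Item, StringProvider

class StaticItem(Item):
    """An in-memory entry with an arbitrary (possibly very long or odd) path."""
    def __init__(self, path, content):
        super().__init__()
        self._path = path
        self._content = content
    def get_path(self):
        return self._path
    def get_title(self):
        return self._path[:50]
    def get_mimetype(self):
        return "text/html"
    def get_contentprovider(self):
        return StringProvider(self._content)
    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}

strange_paths = [
    "a" * 300,                      # longer than the usual 255-byte component limit
    "dir/" + "b" * 300 + "/page",   # long middle component
    'quotes"and<angle>brackets',    # characters invalid on some filesystems
]

with Creator("strange-paths.zim") as creator:
    for path in strange_paths:
        creator.add_item(StaticItem(path, f"<html><body>{path}</body></html>"))
```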
Indeed, but I'd like to mention that filesystem limitations are all properly documented. The solution should be designed with those limitations in mind, as testing on various filesystems is cumbersome.
Here's the tail of the output of `zimdump dump --dir /data/mqc somezim.zim`. Touching that file fails as well.
Very long URLs seem like a common use case, and I believe this calls for a design change in the way they are written to disk.
Might be related to #190
Note: this is zimdump 2.1.0
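For context, the failure described here matches the per-component file-name limit on most Linux filesystems (NAME_MAX, usually 255 bytes). A quick way to see the limit (Linux-only; the numbers are typical rather than guaranteed, and unrelated to the reporter's exact setup):

```python
import errno
import os

# Each path component is limited to NAME_MAX bytes, so a dumped file named
# after a very long URL cannot be created at all.
print(os.pathconf(".", "PC_NAME_MAX"))  # typically 255

try:
    open("x" * 300, "w")
except OSError as e:
    print(e.errno == errno.ENAMETOOLONG)  # True: "File name too long"
```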