Exception in load_list loading due to non-UTF8 filenames #62

Open
petoetje opened this issue Sep 13, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@petoetje

petoetje commented Sep 13, 2020

I get (replaced my real name with XXX)
2020-09-13 21:26:27,302: Exception in load_list loading </home/christo/.rclonesyncwd/[email protected]__Path2_NEW>:
<'utf-8' codec can't decode byte 0xe9 in position 5908: invalid continuation byte>
Line # 9922: 2689437 2020-09-12 23:43:27.922891265 photo/Goele/Goele_2015_06_28/Goele_2015_06_28 (97).JPG

2020-09-13 21:26:27,303: ERROR Failed loading current Path2 list file </home/christo/.rclonesyncwd/[email protected]__Path2_NEW> -

This is with the version from trunk.

@cjnaz
Owner

cjnaz commented Sep 13, 2020

Please run rclonesync with --verbose and post the output.
Are you trying to sync to your email address? Is that a configured remote in rclone? Does rclone lsl <that remote> show you your files?

<'utf-8' codec can't decode byte 0xe9 in position 5908: invalid continuation byte>
Line # 9922: 2689437 2020-09-12 23:43:27.922891265 photo/Goele/Goele_2015_06_28/Goele_2015_06_28 (97).JPG

This looks like a locale problem. rclonesync is hard-coded to UTF-8. What is the locale of your system?
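For illustration, the error text in the log is what Python raises when bytes that are not valid UTF-8 are decoded strictly. The sketch below assumes the file name was written in a single-byte encoding such as Latin-1, where é is the single byte 0xE9. That is an assumption, not something confirmed in this thread. The sample name is taken from the rclone NOTICE quoted further down in this issue.

```python
# Assumption: the name was stored as Latin-1, so "é" is the lone byte 0xE9.
raw = b"bloemkool velout\xe9 met gorgonzola.doc"

try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # 0xE9 announces a 3-byte UTF-8 sequence, but the next byte (a space)
    # is not a continuation byte, so strict decoding fails here.
    error_message = str(exc)

print(error_message)

# The same text encoded as UTF-8 (é becomes the two bytes 0xC3 0xA9) decodes cleanly.
good = "bloemkool velouté met gorgonzola.doc".encode("utf-8")
decoded = good.decode("utf-8")
print(decoded)
```

A UTF-8 locale does not prevent this: on Linux, file names are stored as raw bytes, so names created under an older encoding keep their original bytes.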

@petoetje
Author

  • rclone has remote "google". Local site = directory named after email address.
  • rclone lsl google: does show files and does not crash
  • locale is set to UTF-8
  • output ends with
    2020/09/14 09:32:53 NOTICE: Local file system at /home/christo/[email protected]: Replacing invalid UTF-8 characters in "storage/arcade/c64/arnold.c64.org/pub/games/w/Where_in_Time_is_Carmen_Sandiego.Br\xa2derbund.Mirage.zip"
    2020/09/14 09:33:12 NOTICE: Local file system at /home/christo/[email protected]: Replacing invalid UTF-8 characters in "extrastorage/cl/ludo/bloemkool velout\xe9 met gorgonzola.doc"
    2020-09-14 09:34:29,908: Exception in load_list loading </home/christo/.rclonesyncwd/[email protected]__Path2_NEW>:
    <'utf-8' codec can't decode byte 0xe9 in position 1454: invalid continuation byte>
    Line # 2380: 34218 2020-09-12 23:10:54.988105162 extrastorage/cl/NULL

    2020-09-14 09:34:29,908: ERROR Failed loading current Path2 list file </home/christo/.rclonesyncwd/[email protected]__Path2_NEW> -
    2020-09-14 09:34:30,042: Lock file removed: </tmp/[email protected]_>
    2020-09-14 09:34:30,042: ***** Error Abort. Try running rclonesync again. *****

@cjnaz
Owner

cjnaz commented Sep 14, 2020

rclone is finding and replacing invalid UTF-8 characters. I assume that the created LSL file therefore has modified file names that differ from the actual files on disk. However, when rclonesync tries to load the LSL file it also errors on invalid UTF-8 characters, so maybe rclone isn't actually changing the characters and the resulting LSL file is not valid UTF-8.

Are there a lot of these errors, or just the two?

If you are willing, please upload the entire console output from a rclonesync google: [email protected] --verbose --verbose --rc-verbose --rc-verbose and with --first-sync I assume, and whatever other switches you are using. Upload to https://drive.google.com/drive/folders/1FuHvtoezlesiK4btn0Jr8yhi4VQQ1xOr?usp=sharing. I will delete the files once received. Even better, create a directory with a few of the problem files and rclonesync just that folder (with the verbose switches), then upload that console log, the created LSL _Path1 and _Path2 files, and the directory itself.

rclonesync cannot just ignore files that are problems without breaking the integrity of the sync. Your options include

@petoetje
Author

  • I do not have permission to upload to your shared Google Drive folder.
  • Meanwhile I understand the problem. Indeed I seem to have files with names that are not properly UTF-8 encoded. These files also give problems with e.g. tab completion.
  • I prepared a debug directory, including a gzip of a directory with a bad file. Since there seems to be no personal info I attached it to this entry.
    debug.tar.gz
  • I believe there is not much that you can do. I will have to write a script that lists the bad file names, and then fix them manually.
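A script of the kind described in the last bullet could look roughly like the sketch below. It is a minimal example, not part of rclonesync, and the function name `find_non_utf8_names` is made up for illustration. It walks a tree using bytes paths, so the names come back exactly as stored on disk, and reports every file or directory name that fails strict UTF-8 decoding.

```python
import os


def find_non_utf8_names(root):
    """Return the byte paths under root whose file or directory name is not valid UTF-8."""
    bad = []
    # A bytes path makes os.walk yield raw bytes names, with no decoding applied.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(os.path.join(dirpath, name))
    return bad


def printable(path_bytes):
    """Render a byte path for display, showing undecodable bytes as \\xNN escapes."""
    return path_bytes.decode("utf-8", errors="backslashreplace")
```

Calling `find_non_utf8_names` on the local sync directory and printing each result through `printable` gives a list of names to fix by hand. Renaming is deliberately left manual here, since the original encoding of each name has to be guessed.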

@cjnaz
Owner

cjnaz commented Sep 14, 2020

Thanks. In a future release I may be able to gracefully and safely ignore invalid filenames with just a warning message. Probably with a switch to enable this behavior.

I'll leave this issue open.
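As a rough sketch of what "ignore invalid filenames with a warning" might involve at the list-loading step: the code below is hypothetical and is not rclonesync's implementation. The function name `load_lsl_lines` and the `skip_invalid` switch are invented for illustration, and the line format (size, date, time, path) is inferred from the log lines quoted in this issue. It reads the listing as bytes and decodes line by line, so a single bad name does not abort the whole load.

```python
import io


def load_lsl_lines(stream, skip_invalid=False):
    """Parse an `rclone lsl`-style listing from a binary stream.

    Returns (entries, skipped). entries maps path -> (size, timestamp text);
    skipped holds (line number, raw bytes) for lines that are not valid UTF-8.
    """
    entries = {}
    skipped = []
    for lineno, raw in enumerate(stream, start=1):
        try:
            line = raw.decode("utf-8").rstrip("\r\n")
        except UnicodeDecodeError:
            if not skip_invalid:
                raise  # strict mode: fail on the first bad line
            skipped.append((lineno, raw))
            continue
        if not line.strip():
            continue
        # The path comes last and may contain spaces, so split at most 3 times.
        size, date, time, path = line.split(None, 3)
        entries[path] = (int(size), f"{date} {time}")
    return entries, skipped


# First line is from the log in this issue; the second is a made-up line with a bad byte.
sample = (
    b"2689437 2020-09-12 23:43:27.922891265 photo/Goele/Goele_2015_06_28/Goele_2015_06_28 (97).JPG\n"
    b"34218 2020-09-12 23:10:54.988105162 extrastorage/cl/velout\xe9.doc\n"
)
entries, skipped = load_lsl_lines(io.BytesIO(sample), skip_invalid=True)
for lineno, raw in skipped:
    print(f"WARNING: skipping line {lineno} with invalid UTF-8: {raw!r}")
```

Skipping a line only at load time is not enough on its own. As noted earlier in the thread, ignoring files can break the integrity of the sync, so any skipped path would also have to be excluded consistently on both sides.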

@cjnaz cjnaz changed the title Exception in load_list loading Exception in load_list loading due to non-UTF8 filenames Sep 17, 2020
@cjnaz cjnaz added the enhancement New feature or request label Sep 24, 2020
@ivandeex ivandeex mentioned this issue May 7, 2021