Henryk Paluch edited this page Apr 13, 2023 · 4 revisions

Linux exFAT traps

If you plan to use exFAT on Linux, beware of a critical issue:

You can happily create a filename containing invalid Unicode, but the FUSE exFAT driver (fuse-exfat) will then crash every time you access it.

Even worse, fsck.exfat does not detect the problem, so it cannot help.

A similar bug has been reported at:

Debugging

I have a LUKS partition with exFAT for backups, so I first map that partition as a device using:

cryptsetup luksOpen /dev/sdd1 myluks
# this creates device /dev/mapper/myluks

Now I can use regular fsck.exfat to test for problems:

fsck.exfat -rv /dev/mapper/myluks 

exfatprogs version : 1.0.4
volume label [WD_LUKS]
sector size:  512.00 B
cluster size: 128.00 KB
volume size:  1023.99 GB
/dev/mapper/myluks: clean. directories 3149, files 19715

As you can see, fsck reports the filesystem as clean. Now we will mount it read-only and in debug mode:

mount.exfat-fuse -d -o ro /dev/mapper/myluks /mnt/test/
# it will stay attached to terminal

In another terminal I will just run find to scan all directories:

find /mnt/test/
# suddenly there will be errors:
find: '/mnt/test/some_path': Software caused connection abort
find: '/mnt/test/some_path': Transport endpoint is not connected
...
find: failed to read file names from file system at or below ‘/mnt/test/’: Transport endpoint is not connected

On the terminal running mount.exfat-fuse we will see:

LOOKUP /some_path
getattr /some_path
   NODEID: 443
   unique: 4478, success, outsize: 144
unique: 4480, opcode: OPENDIR (27), nodeid: 443, insize: 48, pid: 6552
   unique: 4480, success, outsize: 32
unique: 4482, opcode: READDIR (28), nodeid: 443, insize: 80, pid: 6552
readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.
Aborted (core dumped)

To try again we first have to force-unmount the filesystem:

umount -f /mnt/test

Related packages:

$ rpm -qf /usr/sbin/fsck.exfat 

exfatprogs-1.0.4-150300.3.6.1.x86_64

$ rpm -qf /sbin/mount.exfat-fuse 
fuse-exfat-1.3.0-bp154.1.20.x86_64

Now we have to install debuginfo packages using:

zypper --plus-content debug in exfatprogs-debuginfo fuse-exfat-debuginfo libfuse2-debuginfo

Also install GDB:

zypper in gdb

This time we will run mount.exfat-fuse again, but under GDB:

gdb /sbin/mount.exfat-fuse

(gdb) run -d -o ro /dev/mapper/myluks /mnt/test/

Again in another terminal invoke:

find /mnt/test/

Now you will find that GDB has caught the abort() call (so find just hangs and does not report an error yet):

readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.

Program received signal SIGABRT, Aborted.
0x00007ffff7b95c6b in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install libfuse2-debuginfo-2.9.7-3.3.1.x86_64

# we can try backtrace:
(gdb) bt
#0  0x00007ffff7b95c6b in raise () from /lib64/libc.so.6
#1  0x00007ffff7b97305 in abort () from /lib64/libc.so.6
#2  0x0000555555403e89 in exfat_bug (format=format@entry=0x5555554096a8 "failed to convert name to UTF-8")
    at log.c:58
#3  0x0000555555408054 in exfat_get_name (node=node@entry=0x5555558addf0, 
    buffer=buffer@entry=0x7fffffffda00 "(Precko") at utils.c:53
#4  0x0000555555402363 in fuse_exfat_readdir (path=<optimized out>, buffer=0x55555560e4b0, 
    filler=0x7ffff7d6dc60 <fill_dir>, offset=<optimized out>, fi=<optimized out>) at main.c:131
#5  0x00007ffff7d73287 in fuse_fs_readdir (fs=0x55555560f010, 
    path=0x55555562f660 "/PATH/dane/obecne", buf=0x55555560e4b0, 
    filler=0x7ffff7d6dc60 <fill_dir>, off=0, fi=0x7fffffffddd0) at fuse.c:2009
#6  0x00007ffff7d73448 in readdir_fill (fi=0x7fffffffddd0, dh=0x55555560e4b0, off=0, size=4096, ino=443, 
    req=0x55555562f9b0, f=0x55555560eeb0) at fuse.c:3467
#7  fuse_lib_readdir (req=0x55555562f9b0, ino=443, size=4096, off=0, llfi=<optimized out>) at fuse.c:3493
#8  0x00007ffff7d79f72 in do_readdir (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>)
    at fuse_lowlevel.c:1390
#9  0x00007ffff7d7b101 in fuse_ll_process_buf (data=0x55555560f1a0, buf=0x7fffffffe050, ch=<optimized out>)
    at fuse_lowlevel.c:2443
#10 0x00007ffff7d779af in fuse_session_loop (se=se@entry=0x55555560e700) at fuse_loop.c:40
#11 0x00007ffff7d6fcd8 in fuse_loop (f=f@entry=0x55555560eeb0) at fuse.c:4322
#12 0x00007ffff7d8024c in fuse_main_common (argc=argc@entry=5, argv=argv@entry=0x7fffffffe1d0, 
    op=op@entry=0x55555560c1e0 <fuse_exfat_ops>, op_size=op_size@entry=360, user_data=user_data@entry=0x0, 
    compat=compat@entry=25) at helper.c:371
--Type <RET> for more, q to quit, c to continue without paging--
#13 0x00007ffff7d802fe in fuse_main_real_compat25 (argc=argc@entry=5, argv=argv@entry=0x7fffffffe1d0, 
    op=op@entry=0x55555560c1e0 <fuse_exfat_ops>, op_size=op_size@entry=360) at helper.c:476
#14 0x0000555555401d43 in fuse_exfat_main (mount_point=0x7fffffffe711 "/mnt/test/", 
    mount_options=0x55555560e330 "allow_other,big_writes,blkdev,default_permissions,debug,fsname=/dev/mapper/myluks,ro,blksize=4096") at main.c:511
#15 main (argc=<optimized out>, argv=<optimized out>) at main.c:603

We can already see the problematic path in frame #5:

#5  0x00007ffff7d73287 in fuse_fs_readdir (fs=0x55555560f010, 
    path=0x55555562f660 "/PATH/dane/obecne", buf=0x55555560e4b0, 

To see exactly which filename caused the crash we have to look into frame 3:

(gdb) frame level 3
#3  0x0000555555408054 in exfat_get_name (node=node@entry=0x5555558addf0, 
    buffer=buffer@entry=0x7fffffffda00 "(Precko") at utils.c:53
53	in utils.c

Let's first try a simple command:

(gdb) print node->name
$13 = {{__u16 = 40}, {__u16 = 80}, {__u16 = 114}, {__u16 = 55981}, {__u16 = 121}, {__u16 = 32}, {
    __u16 = 102}, {__u16 = 121}, {__u16 = 122}, {__u16 = 105}, {__u16 = 99}, {__u16 = 107}, {
...

To make it more readable (for later analysis) we can print them as 16-bit hex numbers:

(gdb) x/16xh node->name
0x5555558ade48:	0x0028	0x0050	0x0072	0xdaad	0x0079	0x0020	0x0066	0x0079
0x5555558ade58:	0x007a	0x0069	0x0063	0x006b	0xd842	0xdff3	0x006f	0x0062

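We can clearly see that there are 3 ASCII characters (0x0028, 0x0050 and 0x0072) followed by the weird code unit 0xdaad. That value is a UTF-16 high surrogate (range 0xd800-0xdbff), which is only valid when it is immediately followed by a low surrogate (range 0xdc00-0xdfff). Here it is followed by plain 0x0079, so the sequence is illegal. By contrast, the later pair 0xd842 0xdff3 is a well-formed surrogate pair.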
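The same check can be done mechanically. The following is only a small Python sketch (not part of fuse-exfat): it takes the 16 code units copied from the x/16xh dump above and reports every surrogate that lacks its partner. The function names are made up for this illustration.

```python
# UTF-16 code units copied from the `x/16xh node->name` dump above.
UNITS = [
    0x0028, 0x0050, 0x0072, 0xDAAD, 0x0079, 0x0020, 0x0066, 0x0079,
    0x007A, 0x0069, 0x0063, 0x006B, 0xD842, 0xDFF3, 0x006F, 0x0062,
]


def is_high_surrogate(unit: int) -> bool:
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit: int) -> bool:
    return 0xDC00 <= unit <= 0xDFFF


def find_unpaired_surrogates(units):
    """Return (index, code unit) for every surrogate without its partner."""
    bad = []
    i = 0
    while i < len(units):
        unit = units[i]
        if is_high_surrogate(unit):
            if i + 1 < len(units) and is_low_surrogate(units[i + 1]):
                i += 2  # well-formed pair: skip both halves
                continue
            bad.append((i, unit))  # high surrogate with no low one after it
        elif is_low_surrogate(unit):
            bad.append((i, unit))  # low surrogate with no high one before it
        i += 1
    return bad


for index, unit in find_unpaired_surrogates(UNITS):
    print(f"unpaired surrogate 0x{unit:04x} at index {index}")
```

For this dump it reports only 0xdaad at index 3; the pair 0xd842 0xdff3 passes as valid.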

To print the name as a Unicode string (even though it is malformed) we can use the trick from https://stackoverflow.com/questions/39141801/how-to-print-unicode-string-in-gdb-when-debugging-in-windows

(gdb) x/sh node->name
0x5555558ade48:	u"(Pr\xdaady fyzick.....poradcu).pdf"

(Unprintable mess replaced with dots)

If you want to see the layout of the node structure, you can use another trick from https://stackoverflow.com/questions/1768620/how-do-i-show-what-fields-a-struct-has-in-gdb

(gdb) ptype /o node
type = const struct exfat_node {
/*      0      |       8 */    struct exfat_node *parent;
/*      8      |       8 */    struct exfat_node *child;
/*     16      |       8 */    struct exfat_node *next;
/*     24      |       8 */    struct exfat_node *prev;
/*     32      |       4 */    int references;
/*     36      |       4 */    uint32_t fptr_index;
/*     40      |       4 */    cluster_t fptr_cluster;
/* XXX  4-byte hole      */
.....
/*     64      |       8 */    uint64_t size;
/*     72      |       8 */    time_t mtime;
/*     80      |       8 */    time_t atime;
/*     88      |     512 */    le16_t name[256];

So now we know which filename caused the crash of fuse-exfat. The question is how to fix it. (Some people on the Internet advise using a Windows machine, but there is currently no good LUKS-compatible driver for Windows; all known projects have been abandoned.)

To look at the source we can run:

zypper si fuse-exfat
rpmbuild -bp /usr/src/packages/SPECS/fuse-exfat.spec
less /usr/src/packages/BUILD/fuse-exfat-1.3.0/libexfat/utils.c

Important sources:

void exfat_get_name(const struct exfat_node* node,
                char buffer[EXFAT_UTF8_NAME_BUFFER_MAX])
{
        if (utf16_to_utf8(buffer, node->name, EXFAT_UTF8_NAME_BUFFER_MAX,
                                EXFAT_NAME_MAX) != 0)
                exfat_bug("failed to convert name to UTF-8");
}

Finally in /usr/src/packages/BUILD/fuse-exfat-1.3.0/libexfat/utf.c we can see:

int utf16_to_utf8(char* output, const le16_t* input, size_t outsize,
		size_t insize)
{
	const le16_t* inp = input;
	char* outp = output;
	wchar_t wc;

	while (inp - input < insize)
	{
		inp = utf16_to_wchar(inp, &wc, insize - (inp - input));
		if (inp == NULL)
		{
			exfat_error("illegal UTF-16 sequence");
			return -EILSEQ;
		}
// ...
     }
// ...
}

// and also
static const le16_t* utf16_to_wchar(const le16_t* input, wchar_t* wc,
		size_t insize)
{
	if ((le16_to_cpu(input[0]) & 0xfc00) == 0xd800)
	{
		if (insize < 2 || (le16_to_cpu(input[1]) & 0xfc00) != 0xdc00)
			return NULL;
		*wc = ((wchar_t) (le16_to_cpu(input[0]) & 0x3ff) << 10);
		*wc |= (le16_to_cpu(input[1]) & 0x3ff);
		*wc += 0x10000;
		return input + 2;
	}
	else
	{
		*wc = le16_to_cpu(*input);
		return input + 1;
	}
}

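So a high surrogate that is not followed by a low surrogate makes utf16_to_wchar() return NULL, utf16_to_utf8() then returns -EILSEQ, and exfat_get_name() reacts by calling exfat_bug(), which aborts the whole process (frames #0 to #3 in the backtrace). There is no code path that handles this case gracefully...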
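To make the logic easier to play with, here is a rough Python transcription of utf16_to_wchar() together with one possible lenient decoder. This is only a sketch of what graceful handling could look like, not the fuse-exfat code and not an upstream fix. The names utf16_to_code_point and decode_lenient, the substitution with U+FFFD, and stopping at a zero code unit are all my assumptions.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def utf16_to_code_point(units, pos):
    """Python mirror of utf16_to_wchar(): (code_point, next_pos) or None."""
    first = units[pos]
    if (first & 0xFC00) == 0xD800:  # high surrogate
        if pos + 1 >= len(units) or (units[pos + 1] & 0xFC00) != 0xDC00:
            return None  # same failure case as the C code
        cp = ((first & 0x3FF) << 10) | (units[pos + 1] & 0x3FF)
        return cp + 0x10000, pos + 2
    return first, pos + 1  # everything else is passed through unchanged


def decode_lenient(units):
    """Decode code units, replacing illegal ones instead of giving up."""
    out = []
    pos = 0
    # Assumption: a zero code unit terminates the name.
    while pos < len(units) and units[pos] != 0:
        step = utf16_to_code_point(units, pos)
        if step is None:
            out.append(chr(REPLACEMENT))  # lone high surrogate
            pos += 1
            continue
        cp, pos = step
        if 0xDC00 <= cp <= 0xDFFF:
            cp = REPLACEMENT  # lone low surrogate, which the else branch accepts
        out.append(chr(cp))
    return "".join(out)


print(decode_lenient([0x0028, 0x0050, 0x0072, 0xDAAD, 0x0079]))
```

With such a policy the directory listing would show a replacement character in the name instead of taking down the whole mount. Whether the driver should rename, hide or replace such entries is a separate design question.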

How to reproduce

Tested on this environment:

$ cat /etc/SUSE-brand 

openSUSE
VERSION = 15.4

Install these packages:

$ sudo zypper in exfatprogs fuse-exfat libfuse2

Tested versions:

$ rpm -q exfatprogs fuse-exfat libfuse2

exfatprogs-1.0.4-150300.3.6.1.x86_64
fuse-exfat-1.3.0-bp154.1.20.x86_64
libfuse2-2.9.7-3.3.1.x86_64

How to reproduce:

# 1st terminal:
sudo mkdir -p /mnt/test
rm -f ~/exfat.img
cd
truncate -s 128M ~/exfat.img
/usr/sbin/mkfs.exfat -L EXFAT_BUG ~/exfat.img
loop_dev=`sudo /sbin/losetup -f --show ~/exfat.img`
sudo /usr/sbin/mount.exfat -d $loop_dev /mnt/test

# will print something like:
FUSE exfat 1.3.0
FUSE library version: 2.9.7
nullpath_ok: 0
...

Now open another terminal session and run:

# 2nd terminal:
sudo mkdir /mnt/test/bad`echo -ne '\xed\xaa\xad'`
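Why these three bytes? They are the UTF-8 style 3-byte encoding of the code point U+DAAD, the same lone high surrogate seen in the GDB dump. A short Python check (just an illustration, not needed for the reproduction itself):

```python
RAW = b"\xed\xaa\xad"  # bytes appended to the directory name by `echo -ne`

# Strict UTF-8 rejects the sequence, because it encodes a surrogate.
try:
    RAW.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The "surrogatepass" error handler decodes it anyway.
code_point = ord(RAW.decode("utf-8", "surrogatepass"))

# The same value, unpacked by hand from the 3-byte pattern
# 1110xxxx 10xxxxxx 10xxxxxx.
by_hand = ((RAW[0] & 0x0F) << 12) | ((RAW[1] & 0x3F) << 6) | (RAW[2] & 0x3F)

print(f"strict UTF-8 valid: {strict_ok}")
print(f"code point: U+{code_point:04X}, by hand: U+{by_hand:04X}")
```

So the name is not valid UTF-8 in the strict sense, yet it still ends up on disk as the UTF-16 code unit 0xdaad.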

Now unmount the exFAT filesystem and mount it again to force the filenames to be reloaded from disk:

# 2nd terminal:
sudo umount /mnt/test

Back on 1st terminal mount that filesystem again:

# 1st terminal:
sudo /usr/sbin/mount.exfat -d $loop_dev /mnt/test

And on 2nd terminal:

# 2nd terminal:
ls -l /mnt/test/

ls: reading directory '/mnt/test/': Software caused connection abort

Oops, a fatal error: on the 1st terminal you can see that the process was aborted with:

readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.
Aborted

Tip: before remounting you need to force-unmount:

# 1st terminal:
sudo umount -f /mnt/test

Now we are in serious trouble: every time some process accesses that filename, fuse-exfat will crash, leaving the mounted filesystem stuck and inaccessible...
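Since fsck.exfat does not help afterwards, one precaution is to check the source tree before copying it to exFAT. Below is a minimal Python sketch under the assumption that the source filesystem stores names as raw bytes (as ext4 or XFS do). The function names are hypothetical. Do not run it on the affected exFAT mount itself, because reading the directory there is exactly what triggers the crash.

```python
import os


def is_valid_utf8_name(name: bytes) -> bool:
    """True if the name decodes as strict UTF-8 (surrogates are rejected)."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_invalid_names(root: bytes):
    """Yield full paths (as bytes) whose last component is not valid UTF-8."""
    # Passing bytes to os.walk() makes it return bytes names, untouched.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8_name(name):
                yield os.path.join(dirpath, name)
```

Usage could look like `for path in find_invalid_names(b"/path/to/source"): print(path)`; any path printed should be renamed before the copy.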

Zip archive with weird encoding

Recently I got a ZIP file with a really weird filename encoding.

Using this command:

unzip -l archive.zip | od -c

The output revealed this 8-bit encoding:

0001140   3   7             203   e   s   t   n   e   p   r   o   h   l

We know that octal 0203 (decimal 131, hexadecimal 0x83) should be Ccaron (i.e. Č). I was unable to find any encoding in the iconv -l output that fits this definition.

Thanks to my early involvement in Linux (I started with MCC around 1993 and later switched to RedHat 4.0 around 1996) I recalled the cstocs utility, written by a good Czech guy:

131 Ccaron

So it is the Cork encoding, originally used for TeX fonts.
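Python has no built-in Cork codec as far as I know, so a conversion needs its own table. Here is a deliberately tiny sketch: only the entry 0x83 comes from the text above, the rest of the table has to be filled in from a Cork (T1) chart. Treating bytes below 0x80 as plain ASCII is also an assumption that only holds for ordinary letters and digits.

```python
# Partial Cork (TeX T1) table. Only 0x83 is confirmed by the text above;
# add further entries from a Cork encoding chart as needed.
CORK_TO_UNICODE = {
    0x83: "\u010c",  # Ccaron
}


def decode_cork(raw: bytes) -> str:
    """Decode bytes using the partial table; unknown bytes become U+FFFD."""
    out = []
    for byte in raw:
        if byte < 0x80:
            out.append(chr(byte))  # assumption: 7-bit bytes are ASCII
        else:
            out.append(CORK_TO_UNICODE.get(byte, "\ufffd"))
    return "".join(out)


# Bytes taken from the `od -c` output above (octal 203 is 0x83).
print(decode_cork(b"\x83estneprohl"))
```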

So to convert it you can download it yourself:

Please note that there is no easy way to unpack a ZIP while re-encoding its filenames (short of writing such a program yourself). However, now we at least know which encoding it uses. Some pointers:

Maybe modifying this Python script would help :-)
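As a starting point for such a program, here is a hedged sketch using Python's standard zipfile module. When the UTF-8 flag (bit 0x800) is not set on an entry, zipfile decodes the stored name as cp437 by default, so encoding it back to cp437 recovers the original bytes, which can then be decoded with our own table. The table is again partial (only 0x83 is confirmed above) and the function names are made up for this example.

```python
import zipfile

# Partial Cork table; only 0x83 is confirmed by the text above.
CORK_TO_UNICODE = {0x83: "\u010c"}  # Ccaron


def recode_name(name: str) -> str:
    """Undo zipfile's cp437 decoding and decode the raw bytes as Cork."""
    raw = name.encode("cp437")  # cp437 covers all 256 byte values
    return "".join(
        chr(b) if b < 0x80 else CORK_TO_UNICODE.get(b, "\ufffd") for b in raw
    )


def list_recoded_names(zip_file):
    """Return entry names, recoding only those not flagged as UTF-8."""
    names = []
    with zipfile.ZipFile(zip_file) as zf:
        for info in zf.infolist():
            if info.flag_bits & 0x800:  # name is already UTF-8
                names.append(info.filename)
            else:
                names.append(recode_name(info.filename))
    return names
```

To actually unpack, one could read each entry with ZipFile.open() and write it out under the recoded name, but that part is left out here.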

Resources
