Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.10.0-rc1: imapd SIGABRTs inside printf_positional() when compiled with optimizations turned on (anything greater than -O0) #4972

Closed
mosvald opened this issue Jul 12, 2024 · 9 comments · Fixed by #4980
Assignees

Comments

@mosvald
Copy link
Contributor

mosvald commented Jul 12, 2024

This doesn't happen with 3.8.3. Happens on Fedora 39-41. When compiled with -O0, it does not crash.

Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f40d6ea7613 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#2  0x00007f40d6e4ed0e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f40d6e36942 in __GI_abort () at abort.c:79
#4  0x00007f40d6e377a7 in __libc_message_impl (fmt=fmt@entry=0x7f40d6fc1fea "%s") at ../sysdeps/posix/libc_fatal.c:132
#5  0x00007f40d6e9a809 in __GI___libc_fatal (message=<optimized out>) at ../sysdeps/posix/libc_fatal.c:141
#6  0x00007f40d6e717ba in printf_positional (buf=buf@entry=0x7ffd796f1380, format=format@entry=0x55e4b3edde98 "%2$li", readonly_format=readonly_format@entry=0, ap=ap@entry=0x7ffd796f1510, 
    ap_savep=ap_savep@entry=0x7ffd796f0f28, nspecs_done=nspecs_done@entry=0, lead_str_end=<optimized out>, work_buffer=<optimized out>, save_errno=<optimized out>, grouping=<optimized out>, 
    thousands_sep=0x7f40d6fbeb15 "", mode_flags=<optimized out>) at vfprintf-internal.c:1270
#7  0x00007f40d6e7366a in __printf_buffer (buf=buf@entry=0x7ffd796f1380, format=0x55e4b3edde98 "%2$li", ap=0x7ffd796f1510, mode_flags=2) at vfprintf-internal.c:1041
#8  0x00007f40d6e99a73 in __vsnprintf_internal (string=<optimized out>, maxlen=maxlen@entry=2048, format=format@entry=0x55e4b3edde98 "%2$li", args=<optimized out>, mode_flags=mode_flags@entry=2)
    at vsnprintf.c:96
#9  0x00007f40d6f37022 in ___vsnprintf_chk (s=<optimized out>, maxlen=maxlen@entry=2048, flag=flag@entry=2, slen=slen@entry=18446744073709551615, format=format@entry=0x55e4b3edde98 "%2$li", ap=<optimized out>)
    at vsnprintf_chk.c:34
#10 0x00007f40d7760a00 in vsnprintf (__s=<optimized out>, __n=2048, __fmt=0x55e4b3edde98 "%2$li", __ap=<optimized out>) at /usr/include/bits/stdio2.h:100
#11 buf_vprintf (buf=buf@entry=0x7ffd796f14b0, fmt=fmt@entry=0x55e4b3edde98 "%2$li", args=args@entry=0x7ffd796f1510) at lib/util.c:1350
#12 0x00007f40d7804f5c in prot_vprintf (s=0x55e4e3aee530, fmt=0x55e4b3edde98 "%2$li", pvar=pvar@entry=0x7ffd796f1510) at lib/prot.c:1300
#13 0x00007f40d780507a in prot_printf (s=<optimized out>, fmt=fmt@entry=0x55e4b3edde98 "%2$li") at lib/prot.c:1288
#14 0x000055e4b3ea193b in capa_response (flags=flags@entry=3) at imap/imapd.c:3735
#15 0x000055e4b3ec2f7c in cmd_capability (tag=0x55e4e3af30f0 "1") at imap/imapd.c:3751
#16 cmdloop () at imap/imapd.c:1650
#17 0x000055e4b3ec605f in service_main (argc=<optimized out>, argv=<optimized out>, envp=envp@entry=0x7ffd796f40a8) at imap/imapd.c:1209
#18 0x000055e4b3e9d6fc in main (argc=<optimized out>, argv=<optimized out>, envp=0x7ffd796f40a8) at master/service.c:647
(gdb)

(gdb) f 6
#6  0x00007f40d6e717ba in printf_positional (buf=buf@entry=0x7ffd796f1380, format=format@entry=0x55e4b3edde98 "%2$li", readonly_format=readonly_format@entry=0, ap=ap@entry=0x7ffd796f1510, 
    ap_savep=ap_savep@entry=0x7ffd796f0f28, nspecs_done=nspecs_done@entry=0, lead_str_end=<optimized out>, work_buffer=<optimized out>, save_errno=<optimized out>, grouping=<optimized out>, 
    thousands_sep=0x7f40d6fbeb15 "", mode_flags=<optimized out>) at vfprintf-internal.c:1270
1270            __libc_fatal ("*** invalid %N$ use detected ***\n");
(gdb)

(gdb) f 14
#14 0x000055d9f089f93b in capa_response (flags=flags@entry=3) at imap/imapd.c:3735
3735                    prot_printf(imapd_out, valfmt, s, i64);
(gdb) list
3730                }
3731
3732                do {
3733                    prot_putc(' ', imapd_out);
3734                    prot_puts(imapd_out, capa);
3735                    prot_printf(imapd_out, valfmt, s, i64);
3736
3737                } while ((mask & CAPA_MULTI) && (++n < num) && (s = *(++strp)));
3738            }
3739        }
(gdb)

(gdb) f 15
#15 0x000055d9f08c0f7c in cmd_capability (tag=0x55da0ced30f0 "1") at imap/imapd.c:3751
3751        capa_response(CAPA_PREAUTH|CAPA_POSTAUTH);
(gdb)
Build info:
   shared:             yes
   static:             no
   cflags:             -O2  -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer  -DPCRE2POSIX_SHARED
   cxxflags:           -O2  -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer 
   libs:                -lgssapi_krb5 -lkrb5 -lk5crypto -lcom_err -lkrb5support -lresolv -lresolv  -lresolv   -L/usr/lib64/ -lmariadb  -lpq -lpcre2-posix -lpcre2-8 -lz -lxml2 -lical -licalss -licalvcal  -lsqlite3
   ldflags:            -Wl,-z,relro -Wl,--as-needed  -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes 
   libm:               -lm
   unit tests (cunit): yes
   xxd:                /usr/bin/xxd
# rpm -q gcc glibc
gcc-14.1.1-7.fc41.x86_64
glibc-2.39.9000-30.fc41.x86_64
#

Steps to reproduce:

After building with -02, just connecting with cyradm leads to the crash:

# cyradm -u cyrus -w '' -a plain localhost

I was trying to find out what changes would be needed for the 3.10.0 version today:

https://src.fedoraproject.org/fork/mosvald/rpms/cyrus-imapd/c/785dbc4c9f42b5385b8e43e43b33c124146d2869

and hit these crashes. All the Cassandane tests were crashing for me. Tried to turn off LTO, and compile it on f39-f41, but with the same result. I haven't managed to dig deeper yet. Maybe it rings a bell for you.

@elliefm
Copy link
Contributor

elliefm commented Jul 14, 2024

Cyrus doesn't support compiling with LTO, though I don't remember why off the top of my head. But if it still crashes when compiled without LTO, that's quite interesting. I'll try to reproduce the crash myself and see what shakes out.

@elliefm elliefm self-assigned this Jul 14, 2024
@elliefm
Copy link
Contributor

elliefm commented Jul 15, 2024

The top of the call stack looks a lot like #1981, though the bottom is different. I wonder if we have a bad format string in the IMAP capabilities table.

If you turn off FORTIFY_SOURCE but leave optimisations on, does the problem go away?

@elliefm
Copy link
Contributor

elliefm commented Jul 15, 2024

This is not fully tested yet but might get you moving forward:

diff --git a/imap/imapd.c b/imap/imapd.c
index 18cc596ed..69b4f7e16 100644
--- a/imap/imapd.c
+++ b/imap/imapd.c
@@ -398,7 +398,7 @@ static struct capa_struct base_capabilities[] = {
     { "ACL",                   CAPA_POSTAUTH,           { 0 } }, /* RFC 4314 */
     { "ANNOTATE-EXPERIMENT-1", CAPA_POSTAUTH,           { 0 } }, /* RFC 5257 */
     { "APPENDLIMIT=",          CAPA_POSTAUTH|CAPA_VALUE,         /* RFC 7889 */
-      { .value = { "%2$" PRIi64, .i64p = &maxmsgsize } }      },
+      { .value = { "%1$s%2$" PRIi64, .i64p = &maxmsgsize } }      },
     { "AUTH=",                 CAPA_OMNIAUTH|CAPA_COMPLEX,       /* RFC 9051 */
       { .complex = &capa_auth }                               },
     { "BINARY",                CAPA_POSTAUTH,           { 0 } }, /* RFC 3516 */

@elliefm
Copy link
Contributor

elliefm commented Jul 15, 2024

For me, with -O3, FORTIFY_SOURCE, and this patch, one Cassandane test fails in a way that I don't understand yet, and ~50 or so tests fail because of problems examining syslog.

If LD_PRELOAD isn't inherited by child processes under FORTIFY_SOURCE, which seems plausible, that would explain the syslog problem. Cassandane tries to detect when its syslog replacement isn't available, but in this case it thinks it's available when it's not. I have an idea of how to make the detection more robust, though it's a distraction rather than a real problem...

@mosvald
Copy link
Contributor Author

mosvald commented Jul 16, 2024

Thanks for looking into it so quickly!

Cyrus doesn't support compiling with LTO, though I don't remember why off the top of my head. But if it still crashes when compiled without LTO, that's quite interesting. I'll try to reproduce the crash myself and see what shakes out.

Yes, there were compilation errors regarding 'attribute((always_inline, optimize("-O3")))' #4679.

The top of the call stack looks a lot like #1981, though the bottom is different. I wonder if we have a bad format string in the IMAP capabilities table.

If you turn off FORTIFY_SOURCE but leave optimisations on, does the problem go away?

Indeed, with '%define _fortify_level 0' it doesn't end up with __libc_fatal().

This is not fully tested yet but might get you moving forward:

diff --git a/imap/imapd.c b/imap/imapd.c
index 18cc596ed..69b4f7e16 100644
--- a/imap/imapd.c
+++ b/imap/imapd.c
@@ -398,7 +398,7 @@ static struct capa_struct base_capabilities[] = {
     { "ACL",                   CAPA_POSTAUTH,           { 0 } }, /* RFC 4314 */
     { "ANNOTATE-EXPERIMENT-1", CAPA_POSTAUTH,           { 0 } }, /* RFC 5257 */
     { "APPENDLIMIT=",          CAPA_POSTAUTH|CAPA_VALUE,         /* RFC 7889 */
-      { .value = { "%2$" PRIi64, .i64p = &maxmsgsize } }      },
+      { .value = { "%1$s%2$" PRIi64, .i64p = &maxmsgsize } }      },
     { "AUTH=",                 CAPA_OMNIAUTH|CAPA_COMPLEX,       /* RFC 9051 */
       { .complex = &capa_auth }                               },
     { "BINARY",                CAPA_POSTAUTH,           { 0 } }, /* RFC 3516 */

Also after applying the patch, it doesn't crash.

For me, with -O3, FORTIFY_SOURCE, and this patch, one Cassandane test fails in a way that I don't understand yet, and ~50 or so tests fail because of problems examining syslog.

I am not sure why yet but for me the tests otherwise fail due to a missing filename in sun_path when it tries to remove a socket file, but this looks unrelated (it worked on 3.8.3 though).

[408805] =====> Instance::_check_syslog[1517] Cassandane::Instance=HASH(0x5563f2a5bad0) syslog error: Jul 16 04:45:59 snip.redhat.com 084302022E/imap[413038]: IOERROR: unlink failed: pathname=</tmp/rpmbuild/BUILD/cyrus-imapd-3.10.0-build/cyrus-imapd-3.10.0-rc1/cassandane/work/084302022E/conf/socket/> syserror=<Is a directory> file=<imap/idlemsg.c> line=<130> func=<idle_init_sock>

If LD_PRELOAD isn't inherited by child processes under FORTIFY_SOURCE, which seems plausible, that would explain the syslog problem. Cassandane tries to detect when its syslog replacement isn't available, but in this case it thinks it's available when it's not. I have an idea of how to make the detection more robust, though it's a distraction rather than a real problem...

It could be. If started as root, only elements without slash get read from LD_PRELOAD and searched only in standard paths.

I might be hitting similar issues even on 3.8.3. I have some tests disabled where it looked related to parsing results from syslog, I may open an issue for it in the future when I get to it, otherwise, I have to patch the syslog routines so they call only fake_vsyslog() not to spam system logs on our CI test machines:

https://src.fedoraproject.org/rpms/cyrus-imapd/blob/rawhide/f/patch-cassandane-no-syslog

@elliefm
Copy link
Contributor

elliefm commented Jul 16, 2024

I think I understand the syslog problem. It's not that FORTIFY_SOURCE affects LD_PRELOAD, it's that it affects syslog(). When FORTIFY_SOURCE is in play, syslog() becomes an extern inline function that calls into the version with the fortify checks. The calls from Cyrus to syslog() get inlined when optimisations are enabled, and so the LD_PRELOAD'd replacement never gets invoked.

The Cassandane utils don't inherit Cyrus's build settings, so they get built without FORTIFY_SOURCE. That means that when syslog_probe runs to check whether syslog replacement is available, it calls the LD_PRELOAD'd replacement, which works, so it marks syslog replacement as available. And then the tests that look for details in syslog fail because they expect to find things that aren't there.

From this discussion it looks like I can probably fix this by adding __vsyslog_chk() to the symbols the LD_PRELOAD module replaces. I'll try this out shortly.

I have to patch the syslog routines so they call only fake_vsyslog() not to spam system logs on our CI test machines

Rather than patching code, you should be able to configure syslog on your test machines to send LOG_LOCAL6 to /dev/null, or its own file, or whatever is convenient.

@elliefm
Copy link
Contributor

elliefm commented Jul 17, 2024

Can't just intercept __vsyslog_chk() because it's a hidden symbol, damn. And everything that calls it is inlined so can't be intercepted either.

If the Cassandane utils are built with FORTIFY_SOURCE then the probe fails and the syslog replacement is correctly detected as unavailable, and tests stop looking for nonexistent syslog messages. But compiling them like that is fiddly and prone to being forgotten or not synchronised with the Cyrus build. Next plan will be having Cyrus itself report when it was was built with FORTIFY_SOURCE (via cyr_buildinfo), and disabling the syslog checks in Cassandane when it was...

@mosvald
Copy link
Contributor Author

mosvald commented Jul 17, 2024

Can't just intercept __vsyslog_chk() because it's a hidden symbol, damn. And everything that calls it is inlined so can't be intercepted either.

from what I see __vsyslog_chk is not hidden:

mosvald@fedora:~$ readelf --dyn-sym -W /lib64/libc.so.6  | grep __vsyslog_chk
   441: 0000000000114280    26 FUNC    GLOBAL DEFAULT   16 __vsyslog_chk@@GLIBC_2.4
mosvald@fedora:~$

so you could do the same as for the openlog/vsyslog/closelog, i.e. export __vsyslog_chk and dlsym(RTLD_NEXT, "__vsyslog_chk")

From this discussion it looks like I can probably fix this by adding __vsyslog_chk() to the symbols the LD_PRELOAD module replaces. I'll try this out shortly.

Not sure why I didn't open an issue for it when I hit the same while writing the patch to turn off syslog logging that __vsyslog_chk() got called instead of syslog() for me leading to failed tests because syslog file was 0 in size. I ended up reading the same StackOverflow thread. :D

@elliefm
Copy link
Contributor

elliefm commented Jul 18, 2024

That's exactly what I did, and it didn't work. I spotted this in the copy of syslog.c that the SO thread linked to, which seemed to explain it, so I didn't dig further:

https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/misc/syslog.c#L319

It might be that with your build options, libc_hidden_def does nothing, and with mine it hides the symbol? Though the code I'm looking at is old and maybe it's not even the same anymore. But either way, I couldn't persuade it to replace __vsyslog_chk in my environment, which means that approach can't be relied on.

The state that I'm at now is I have a set of commits that fix the Cyrus sources so that:

  • it compiles correctly with -Wall -Wextra -Werror -O3 -D_FORTIFY_SOURCE (among others)
  • imapd doesn't SIGABRT from printf_positional
  • Cyrus reports when it was compiled with source fortification
  • Cassandane detects when Cyrus was compiled with source fortification and doesn't expect to be able to read syslog messages in this case

I can't get a clean Cassandane run out of it though. A bunch of random tests fail each run due to Cassandane's SMTP thingy not exiting all its processes fast enough. The SMTP thingy is known to be a mess with a lot of issues, but this one is new. It's perl, so I I don't see why source fortification of Cyrus would affect it. I'm not convinced it's worth my time to chase that right now, since we already know the SMTP thingy needs a rewrite anyway, so I'll tidy up my current patches into a PR and get this moved forward.

elliefm added a commit that referenced this issue Jul 24, 2024
printf(3):
> There may be no gaps in the numbers of arguments specified using '$';
> for example, if  arguments  1  and  3 are specified, argument 2 must
> also be specified somewhere in the format string.

The bad format string caused a SIGABRT during capabilities response when
Cyrus was compiled with glibc source fortification enabled.

Fixes #4972
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants