growVector, copyAsPlain may formally cause undefined behaviour with 0-length vectors #6819

Open
aitap opened this issue Feb 17, 2025 · 0 comments · May be fixed by #6820

aitap commented Feb 17, 2025

Found while working on a custom Debian testing-based container (using C compiler: ‘Debian clang version 19.1.7 (1+b1)’) with Clang sanitizers enabled for #6746:

When the contents of a zero-length vector are accessed using INTEGER(...), REAL(...), or another accessor, R may return an invalid pointer (0x1). Per the C standard, passing an invalid pointer to memcpy() is undefined behaviour even when n is 0, although in practice nothing breaks (memcpy sees n=0 and doesn't dereference the pointer).

If nothing breaks, what's the risk? One of the CRAN special checks (clang-UBSAN or 0len) might pick this up too. A far-fetched but possible scenario: a compiler might optimize away a chunk of code that it deems to cause undefined behaviour.
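
One way to sidestep the problem is to skip the copy entirely when there is nothing to copy, so the possibly invalid pointer never reaches memcpy(). The following is only a minimal sketch under that assumption: `copy_elems` is a hypothetical helper written in plain C for illustration, not a function from data.table or R's API, and not necessarily what the linked pull request does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of data.table or R): copy n elements of
 * elem_size bytes each from src to dst, but never call memcpy() when
 * n == 0, because the pointers may be invalid for zero-length vectors. */
static void copy_elems(void *dst, const void *src, size_t n, size_t elem_size)
{
    if (n == 0)
        return;                      /* nothing to copy; pointers are left untouched */
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * elem_size); /* both pointers are valid for n > 0 */
}
```

In growVector or copyAsPlain the same effect could presumably be had by checking the length once before the switch over the vector type, rather than wrapping every memcpy() call.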

Running test id 173.1

dogroups.c:541:39: runtime error: load of misaligned address 0x000000000001 for type 'int *', which requires 4 byte alignment
0x000000000001: note: pointer points here
<memory cannot be printed>
    #0 0x7f2efbaec116 in growVector /work/data.table.Rcheck/00_pkg_src/data.table/src/dogroups.c
    #1 0x7f2efbae90d5 in dogroups /work/data.table.Rcheck/00_pkg_src/data.table/src/dogroups.c:409:66

(gdb) frame 4
#4  0x00007fc6d4aec117 in growVector (x=0x52500661b038, newlen=newlen@entry=3) at dogroups.c:543
543       case CPLXSXP: memcpy(COMPLEX(newx), COMPLEX(x), len*SIZEOF(x)); break;
(gdb) p Rf_xlength(x)
$4 = 0
(gdb) p Rf_xlength(newx)
$5 = 3
(gdb) call Rf_PrintValue(R_GlobalContext->call)
`[.data.table`(DT, , B[B > 3], by = A)

Running test id 893.5

utils.c:233:12: runtime error: store to misaligned address 0x000000000001 for type 'int *', which requires 4 byte alignment
0x000000000001: note: pointer points here
<memory cannot be printed>
    #0 0x7f2efbbfd5f9 in copyAsPlain /work/data.table.Rcheck/00_pkg_src/data.table/src/utils.c:233:5
    #1 0x7f2efbbefbf9 in subsetDT /work/data.table.Rcheck/00_pkg_src/data.table/src/subset.c:317:30

(gdb) frame 4
#4  0x00007fc6d4bfd5fa in copyAsPlain (x=x@entry=0x525003c44b68) at utils.c:233
233         memcpy(INTEGER(ans), INTEGER(x), n*sizeof(int));             // covered by 10:1 after test 178
(gdb) p Rf_xlength(x)
$8 = 0
(gdb) call Rf_PrintValue(R_GlobalContext->call)
`[.data.table`(head(DT, nr), , seq_len(if (nc == 0) ncol(DT) else nc),
    with = FALSE)

Running test id 2150.21

dogroups.c:540:39: runtime error: load of misaligned address 0x000000000001 for type 'int *', which requires 4 byte alignment
0x000000000001: note: pointer points here
<memory cannot be printed>
    #0 0x7f2efbaec116 in growVector /work/data.table.Rcheck/00_pkg_src/data.table/src/dogroups.c
    #1 0x7f2efbb46d4e in allocateDT /work/data.table.Rcheck/00_pkg_src/data.table/src/freadR.c:501:36
    #2 0x7f2efbb2f967 in freadMain /work/data.table.Rcheck/00_pkg_src/data.table/src/fread.c:2666:7
    #3 0x7f2efbb42306 in freadR /work/data.table.Rcheck/00_pkg_src/data.table/src/freadR.c:222:3

(gdb) frame 4
#4  0x00007fc6d4aec117 in growVector (x=x@entry=0x525004e417e8, newlen=newlen@entry=1024)
    at dogroups.c:543
543       case CPLXSXP: memcpy(COMPLEX(newx), COMPLEX(x), len*SIZEOF(x)); break;
(gdb) p Rf_xlength(x)
$11 = 0
(gdb) call Rf_PrintValue(R_GlobalContext->call)
fread("c1\n2018-01-31 03:16:57")

(Yes, that case CPLXSXP: looks a bit strange. clang must have merged the switch branches into one, with different length arguments to memcpy().)

@aitap aitap linked a pull request Feb 17, 2025 that will close this issue