Assigning 0.0s to an array gives memset error #2117

Open
ipcamit opened this issue Oct 13, 2024 · 1 comment

ipcamit commented Oct 13, 2024

Issue:
I get the following error:

error: Enzyme: Cannot deduce type of memset   call void @llvm.memset.p0.i64(ptr noundef nonnull align 4 dereferenceable(108) %3, i8 0, i64 108, i1 false) #7
<analysis>
ptr %0: {[-1]:Pointer}, intvals: {}
ptr %1: {[-1]:Pointer}, intvals: {}
ptr %2: {[-1]:Pointer}, intvals: {}
ptr %3: {[-1]:Pointer}, intvals: {}
</analysis>

in function:

void irreps_to_out(float* x_, float* y_, float* z_, float* out_feature){
    float x = *x_; float y = *y_; float z = *z_;

    float *product = (float*) malloc(27 * sizeof(float));
    for(int i = 0; i < 27; i++) { product[i] = 0.0; }  // <--- Problematic line
    for(int i = 0; i < 27; i++) { out_feature[i] = product[i]; }
}

The issue is specifically that the compiler optimizes this assignment loop into a memset call when the assigned value is 0.0. With any other value, such as

    for(int i = 0; i < 27; i++) { product[i] = 0.0000000000001; }  // <--- compiles fine

it compiles fine.

It also compiles with -O0, which avoids the optimization of the assignment loop into a memset.


Enzyme Explorer:

https://fwd.gymni.ch/ecQeCs


Full example:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

void irreps_to_out(float* x_, float* y_, float* z_, float* out_feature)
{
    float x = *x_;
    float y = *y_;
    float z = *z_;

    // float product[27];
    float *product = (float*) malloc(27 * sizeof(float));
    for(int i = 0; i < 27; i++) { product[i] = 0.0; }  // <--- Problematic line
    for(int i = 0; i < 27; i++) { out_feature[i] = product[i]; }

}

int enzyme_dup;

void __enzyme_autodiff(void (*) (float*, float*, float*, float*), 
                        int, float*, float*,
                        int, float*, float*, 
                        int, float*, float*,
                        int, float*, float*);

int main(){
    float out_feature[27] = { 0., 0., 0.,0., 0., 0.,0., 0., 0.,
                               0., 0., 0.,0., 0., 0.,0., 0., 0.,
                                0., 0., 0.,0., 0., 0.,0., 0., 0. };
    float out_grad[27] = {1., 1., 1., 1., 1., 1., 1., 1., 1.,
                          1., 1., 1., 1., 1., 1., 1., 1., 1.,
                          1., 1., 1., 1., 1., 1., 1., 1., 1.};

    float x = 1.0, y = 2.0, z = 3.0;
    float x_grad = 0.0, y_grad = 0.0, z_grad = 0.0;
    irreps_to_out(&x, &y, &z, out_feature);
   
    __enzyme_autodiff(irreps_to_out, 
                        enzyme_dup, &x, &x_grad, 
                        enzyme_dup, &y, &y_grad, 
                        enzyme_dup, &z, &z_grad, 
                        enzyme_dup, out_feature, out_grad);

    return 0;
}

Compiled as:

clang example.c  -Xclang -load -Xclang /path/to/ClangEnzyme-15.so -lm -O3

clang -v:

clang version 15.0.7 (https://github.com/conda-forge/clangdev-feedstock 7546975a4a926b2b6b05f442d73827ff01b1ae76)
Target: x86_64-conda-linux-gnu
Thread model: posix
InstalledDir: /opt/mambaforge/mambaforge/envs/e3nn/bin
Found candidate GCC installation: /opt/mambaforge/mambaforge/envs/e3nn/bin/../lib/gcc/x86_64-conda-linux-gnu/14.1.0
Selected GCC installation: /opt/mambaforge/mambaforge/envs/e3nn/bin/../lib/gcc/x86_64-conda-linux-gnu/14.1.0
Candidate multilib: .;@m64
Selected multilib: .;@m64

Enzyme commit: 192d9231aaa5415fa135ffb3568cc8eceef512d5

@GregTheMadMonk
Contributor

Sometimes using the loose type analysis flag helps with this issue: https://enzyme.mit.edu/getting_started/UsingEnzyme/#loose-type-analysis

@wsmoses this is an issue that is encountered relatively often with Enzyme, as far as I can tell. My understanding is that the compiler sometimes replaces the initialization of a large object (or of multiple small objects) with a memset that fills a large chunk of memory with a constant, and Enzyme then can't tell what data was supposed to be at the destination memory location. Is this correct? Do you have any leads on how to improve this, or on where to look if I want to improve it myself?
