Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process-specific paramaters are stored in common /src/ directory #644

Closed
zeniheisser opened this issue Apr 26, 2023 · 10 comments · Fixed by #560
Closed

Process-specific paramaters are stored in common /src/ directory #644

zeniheisser opened this issue Apr 26, 2023 · 10 comments · Fixed by #560
Assignees

Comments

@zeniheisser
Copy link
Contributor

This is a generalisation of an issue previously discussed with @valassi and @roiser. The long and short of it is that some parameters which should be stored in the P-directories are stored in the src-directory. Here are two examples.

generate p p > t t~ g g

This process has two subprocesses, g g > t t~ g g and u u~ > t t~ g g (the latter one being the combined file for the four different quark cases). Both subprocesses compile, but g g > t t~ g g seg faults at runtime due to needing to store more internal wavefunctions in memory than are allocated. Shown in the warning raised at compile time:
CPPProcess.cc:1723:36: warning: array subscript 25 is above array bounds of ‘mgOnGpu::fptype* [17]’ {aka ‘double* [17]’} [-Warray-bounds] 1723 | VVV1P0_1<W_ACCESS, CD_ACCESS>( w_fp[0], w_fp[7], COUPs[0], 0., 0., w_fp[25] ); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ CPPProcess.cc:160:13: note: while referencing ‘w_fp’ 160 | fptype* w_fp[nwf]; | ^~~~

Another case is when running multiprocesses, e.g.
generate g g > t t~ add process g g > t t~ g
which will generate code with multiple subprocess directories, i.e. directories P1... and P2.... In this case, g g > t t~ g fails already at compile time,
``
gCPPProcess.cu(639): error: too many initializer values

gCPPProcess.cu(640): error: too many initializer values

gCPPProcess.cu(641): error: too many initializer values

gCPPProcess.cu(642): error: too many initializer values

gCPPProcess.cu(643): error: too many initializer values

gCPPProcess.cu(644): error: too many initializer values

gCPPProcess.cu(645): error: too many initializer values

gCPPProcess.cu(646): error: too many initializer values

gCPPProcess.cu(647): error: too many initializer values

gCPPProcess.cu(648): error: too many initializer values

gCPPProcess.cu(649): error: too many initializer values

gCPPProcess.cu(650): error: too many initializer values

gCPPProcess.cu(651): error: too many initializer values

gCPPProcess.cu(652): error: too many initializer values

gCPPProcess.cu(653): error: too many initializer values

gCPPProcess.cu(654): error: too many initializer values

gCPPProcess.cu(655): error: too many initializer values

CPPProcess.cc: In constructor ‘mg5amcCpu::CPPProcess::CPPProcess(bool, bool)’:
CPPProcess.cc:670:26: error: too many initializers for ‘const short int [4]’
670 | { 1, 1, 1, -1, 1 } };
| ^
17 errors detected in the compilation of "gCPPProcess.cu".
``

Both of these are instances of the same underlying issue: in the file mgOnGpuConfig.h (and similarly I expect for C++?) there is a block with information which is process dependent:
115 // --- Physics process-specific constants that are best declared at compile time 116 117 const int np4 = 4; // dimensions of 4-momenta (E,px,py,pz) 118 119 const int npari = 2; // #particles in the initial state (incoming): e.g. 2 (e+ e-) for e+ e- -> mu+ mu- 120 121 const int nparf = 2; // #particles in the final state (outgoing): e.g. 2 (mu+ mu-) for e+ e- -> mu+ mu- 122 123 const int npar = npari + nparf; // #particles in total (external = initial + final): e.g. 4 for e+ e- -> mu+ mu- 124 125 const int ncomb = 16; // #helicity combinations: e.g. 16 for e+ e- -> mu+ mu- (2**4 = fermion spin up/down ** npar) 126 127 const int nw6 = 6; // dimensions of each wavefunction (HELAS KEK 91-11): e.g. 6 for e+ e- -> mu+ mu- (fermions and vectors) 128 129 const int nwf = 5; // #wavefunctions = #external (npar) + #internal: e.g. 5 for e+ e- -> mu+ mu- (1 internal is gamma or Z)
Aside from np4, all of these can vary between subprocesses -- essentially, the /src/ directory can only contain things that should be shared between all possible processes, and anything that varies with the given process (at least within a single UFO model) should be kept exclusively within the P-subdirectories.

@valassi
Copy link
Member

valassi commented Apr 26, 2023

Thanks @zeniheisser .

This is what I was having a look at in MR #560 for pp to tttt

valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
…rocesses/P1*/CPPProcess.h (fix madgraph5#644)

(NB note the difference between nwavefun and nwavefuncs - trailing s - in python!?)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
… when creating CPPProcess.h (only nwavefuncs was defined - with leading s and with wrong value!)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
Revert "[pp4t] regenerate ggtt.sa: P1 directory gets the wrong nwf=7 instead of nwf=5 ...!? (madgraph5#644 is not yet fixed, will revert)"
This reverts commit eae8037.
@valassi
Copy link
Member

valassi commented Apr 26, 2023

I am fixing this in MR #560.... but still not yet there... different variables nwavefunc and nwavefunc, I get 5, 6, or 7 depending on what I call where...?

valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 26, 2023
@valassi
Copy link
Member

valassi commented Apr 26, 2023

Hi @zeniheisser @roiser @oliviermattelaer I am doing this again, and it is really a bad idea: I thought I would fix this "in 5 minutes", and it will take hours. So, I just do not have the time now _ I will resume this in two weeks.

The issue is that I cannot get the right value of nwf, which for ggtt is 5:

  • if I use 'nwavefuncs' from export_cpp I get 6
  • If I use self.matrix_elements[0].get_number_of_wavefunctions() when I need it, I get 7
  • If I use self.matrix_elements[0].get_number_of_wavefunctions() where I was calculating it before, while writing mgOnGpuConfig.h, I get the right value, 5!

Note, these are tests on ggtt which is simple and has a single P1 subdirectory. I could understand that the value changes where you hav emore than one P1, but here I am puzzled why I get 7 and then this changes to 5.

I will almost certainly not have tome tomorrow. Sorry, another one on the list for later on...

@valassi
Copy link
Member

valassi commented Apr 26, 2023

I have a lot of ugly WIP stuff in #560 to clean up eventually

@roiser
Copy link
Member

roiser commented May 19, 2023

Just to mention, for the current gridpack generation I am hacking around this by getting the max number of the w_fp array from all CPPProcess.cc files and setting that in the src/mgOnGpuConfig.h nwf variable accordingly.

valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
…from src/mgOnGpuConfig.h to SubProcesses/P1*/CPPProcess.h

(NB note the difference between nwavefun and nwavefuncs - trailing s - in python!?)
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
… "nwavefunc" when creating CPPProcess.h (only nwavefuncs was defined - with leading s and with wrong value!)
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
Revert "[pp4t] regenerate ggtt.sa: P1 directory gets the wrong nwf=7 instead of nwf=5 ...!? (madgraph5#644 is not yet fixed, will revert)"
This reverts commit eae8037.
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
…ue of nwf (madgraph5#644)"

Revert "[pp4t] in codegen, next (ugly) attempt to fix the P1-specific value of nwf (madgraph5#644), will revert"
This reverts commit 146cd26
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 22, 2023
…wf (madgraph5#644), clean up and disable debug printouts for nwf

(NB: I checked that pp_tttt now generates and builds correctly in both P1 subdirectories)
@valassi
Copy link
Member

valassi commented May 22, 2023

Ho @zeniheisser @roiser (and @oliviermattelaer for info), I think that MR #560 is now almost ready to merge, including a fix for this #644: I moved nwf from mgOnGpuConfig.h to the calculate_wavefunction deeply hardcode in CppProcess.cc. I was not able to do it in CppProcess.h, because the correct result for nwf is only available after a call to self.helas_call_writer.get_matrix_element_calls. Anyway, it looks ok now to me.

I have tested that pp_tttt now seems to generate and build correctly with the right nwf in both P1 subdirectories. Please test also your own processes after I merge (or even now from my pp4t branch). Thanks

@valassi valassi self-assigned this May 22, 2023
@valassi
Copy link
Member

valassi commented May 22, 2023

The issue is that I cannot get the right value of nwf, which for ggtt is 5:

* if I use 'nwavefuncs' from export_cpp I get 6

* If I use `self.matrix_elements[0].get_number_of_wavefunctions()` when I need it, I get 7

* If I use `self.matrix_elements[0].get_number_of_wavefunctions()` where I was calculating it before, while writing mgOnGpuConfig.h, I get the right value, 5!

Note, these are tests on ggtt which is simple and has a single P1 subdirectory. I could understand that the value changes where you hav emore than one P1, but here I am puzzled why I get 7 and then this changes to 5.

On this specific point, as mentioned in my previous comment, my impression is that the right value of nwf is only available in the python after a call to self.helas_call_writer.get_matrix_element_calls. This is the reason why I hardcodd nwf deep inside CPPProcess.cc, it is because I only call self.helas_call_writer.get_matrix_element_calls while writing CPPProcess.cc.

@valassi
Copy link
Member

valassi commented May 23, 2023

I have just merged MR #560 which closes this issue.

@zeniheisser @roiser please check if it works for you... thanks

valassi added a commit to valassi/madgraph4gpu that referenced this issue May 25, 2023
NB: I also checked that pp_tttt can be generated correctly, and both P1 directories build fine (here nwf was giving issues in MR madgraph5#560 for madgraph5#644)
@valassi
Copy link
Member

valassi commented May 27, 2023

PS This issue (different nwf in different P1) is similar to #667 (different npar in different P1 and P2)

@valassi
Copy link
Member

valassi commented Jun 3, 2023

Thanks to @nscottnichols for suggesting that some improvements are possible - to be followed up in #677

valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this issue Aug 16, 2023
…, move nwf from src/mgOnGpuConfig.h to SubProcesses/P1*/CPPProcess.h

(NB note the difference between nwavefun and nwavefuncs - trailing s - in python!?)
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this issue Aug 16, 2023
…u#644: define "nwavefunc" when creating CPPProcess.h (only nwavefuncs was defined - with leading s and with wrong value!)
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this issue Aug 16, 2023
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this issue Aug 16, 2023
…ue of nwf (madgraph5/madgraph4gpu#644)"

Revert "[pp4t] in codegen, next (ugly) attempt to fix the P1-specific value of nwf (madgraph5/madgraph4gpu#644), will revert"
This reverts commit 146cd269d2aa49522794d2b06c689b0c48e2c43b
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this issue Aug 16, 2023
…wf (madgraph5/madgraph4gpu#644), clean up and disable debug printouts for nwf

(NB: I checked that pp_tttt now generates and builds correctly in both P1 subdirectories)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants