Commit
[cmsdy] in dsample.f of pp_dy3j.mad P0_gux_taptamggux [!NB: forgot to copy this to gg_tt.mad!], skip xbin checks if CUDACPP_RUNTIME_SKIPXBINCHECKS is set (part3 of madgraph5#969)

This is a very large improvement, but it may be more controversial, hence it is disabled by default...

CUDACPP_RUNTIME_DISABLEFPE=1 ./build.cuda_d_inl0_hrd0/madevent_cuda < /tmp/avalassi/input_dy3j_x1_cudacpp
 [COUNTERS] PROGRAM TOTAL : 4.1142s
 [COUNTERS] Fortran Other ( 0 ) : 0.1610s
 [COUNTERS] Fortran Initialise(I/O) ( 1 ) : 0.0670s
 [COUNTERS] Fortran Random2Momenta ( 3 ) : 2.8821s for 1170103 events => throughput is 2.46E-06 events/s
 [COUNTERS] Fortran PDFs ( 4 ) : 0.0962s for 49152 events => throughput is 1.96E-06 events/s
 [COUNTERS] Fortran UpdateScaleCouplings ( 5 ) : 0.1278s for 16384 events => throughput is 7.80E-06 events/s
 [COUNTERS] Fortran Reweight ( 6 ) : 0.0485s for 16384 events => throughput is 2.96E-06 events/s
 [COUNTERS] Fortran Unweight(LHE-I/O) ( 7 ) : 0.0670s for 16384 events => throughput is 4.09E-06 events/s
 [COUNTERS] Fortran SamplePutPoint ( 8 ) : 0.1355s for 1170103 events => throughput is 1.16E-07 events/s
 [COUNTERS] CudaCpp Initialise ( 11 ) : 0.4683s
 [COUNTERS] CudaCpp Finalise ( 12 ) : 0.0262s
 [COUNTERS] CudaCpp MEs ( 19 ) : 0.0348s for 16384 events => throughput is 2.13E-06 events/s
 [COUNTERS] OVERALL NON-MEs ( 21 ) : 4.0794s
 [COUNTERS] OVERALL MEs ( 22 ) : 0.0348s for 16384 events => throughput is 2.13E-06 events/s

CUDACPP_RUNTIME_SKIPXBINCHECKS=1 CUDACPP_RUNTIME_DISABLEFPE=1 ./build.cuda_d_inl0_hrd0/madevent_cuda < /tmp/avalassi/input_dy3j_x1_cudacpp
 [COUNTERS] PROGRAM TOTAL : 3.2969s
 [COUNTERS] Fortran Other ( 0 ) : 0.1726s
 [COUNTERS] Fortran Initialise(I/O) ( 1 ) : 0.0674s
 [COUNTERS] Fortran Random2Momenta ( 3 ) : 2.0464s for 1170103 events => throughput is 1.75E-06 events/s
 [COUNTERS] Fortran PDFs ( 4 ) : 0.0958s for 49152 events => throughput is 1.95E-06 events/s
 [COUNTERS] Fortran UpdateScaleCouplings ( 5 ) : 0.1298s for 16384 events => throughput is 7.92E-06 events/s
 [COUNTERS] Fortran Reweight ( 6 ) : 0.0482s for 16384 events => throughput is 2.94E-06 events/s
 [COUNTERS] Fortran Unweight(LHE-I/O) ( 7 ) : 0.0656s for 16384 events => throughput is 4.00E-06 events/s
 [COUNTERS] Fortran SamplePutPoint ( 8 ) : 0.1412s for 1170103 events => throughput is 1.21E-07 events/s
 [COUNTERS] CudaCpp Initialise ( 11 ) : 0.4685s
 [COUNTERS] CudaCpp Finalise ( 12 ) : 0.0266s
 [COUNTERS] CudaCpp MEs ( 19 ) : 0.0349s for 16384 events => throughput is 2.13E-06 events/s
 [COUNTERS] OVERALL NON-MEs ( 21 ) : 3.2620s
 [COUNTERS] OVERALL MEs ( 22 ) : 0.0349s for 16384 events => throughput is 2.13E-06 events/s
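
To make the size of the improvement explicit, the two runs above can be compared counter by counter. The following is a minimal sketch, not part of the commit: the helper names (`COUNTER_RE`, `parse_counters`, `compare`) are hypothetical, and the regular expression only assumes the `[COUNTERS] <name> ( <id> ) : <seconds>s` layout seen in the logs above. Only the three rows being compared are copied in.

```python
import re

# Rows copied from the run without CUDACPP_RUNTIME_SKIPXBINCHECKS.
BASELINE = """\
[COUNTERS] PROGRAM TOTAL : 4.1142s
[COUNTERS] Fortran Random2Momenta ( 3 ) : 2.8821s for 1170103 events => throughput is 2.46E-06 events/s
[COUNTERS] OVERALL NON-MEs ( 21 ) : 4.0794s
"""

# Same rows from the run with CUDACPP_RUNTIME_SKIPXBINCHECKS=1.
SKIP_XBIN = """\
[COUNTERS] PROGRAM TOTAL : 3.2969s
[COUNTERS] Fortran Random2Momenta ( 3 ) : 2.0464s for 1170103 events => throughput is 1.75E-06 events/s
[COUNTERS] OVERALL NON-MEs ( 21 ) : 3.2620s
"""

# Hypothetical pattern: counter name, optional "( id )", then the time in seconds.
COUNTER_RE = re.compile(
    r"\[COUNTERS\]\s+(.+?)\s*(?:\(\s*\d+\s*\))?\s*:\s*([0-9.]+)s"
)


def parse_counters(text):
    """Map each counter name to its elapsed time in seconds."""
    return {m.group(1): float(m.group(2)) for m in COUNTER_RE.finditer(text)}


def compare(before, after):
    """Return {name: (seconds saved, speedup factor)} for counters in both runs."""
    return {
        name: (before[name] - after[name], before[name] / after[name])
        for name in before
        if name in after
    }


rows = compare(parse_counters(BASELINE), parse_counters(SKIP_XBIN))
for name, (saved, speedup) in rows.items():
    print(f"{name:24s} saved {saved:.4f}s  speedup x{speedup:.2f}")
```

With the numbers above, the total run time drops by about 0.82 s (roughly 20%), and essentially all of that saving comes from the Random2Momenta counter, which drops by about 0.84 s (roughly 29%). The matrix element time is unchanged.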