I repeated the tests from #58 to verify the results of the alpaka kernel ports in
commit 2f5cf96971c8, using the OpenMP accelerator.
Setup on Hypnos Laser Nodes
- 10 nodes, each with 64 cores
- gcc 4.9.2
- openMPI 1.8.6
- matlab 2014a
- export OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
- softlinks to new glibc version by admin
Test Run ALPAKA
- 10 laser nodes, each with 64 cores
- Elapsed time 6491 s
Test Run CUDA native with bug fix
- 2 k20 nodes
- Elapsed time 6101 s
Result
- A small deviation from the CUDA native results is visible (up to ~ 3 * 10^-2)
- This version contains the bug fix
- Values are still in strong agreement with the experimental measurements.
- CUDA with bug fix vs. CUDA before the bug fix has a deviation of ~1 * 10^-3
| Timestep | cuda native 268b6cd | alpaka openMP 2f5cf96 | cuda native bugfix 825c6c7 |
|---|---|---|---|
| 0 | 0.7992 | 0.7992 | 0.7992 |
| 1 | 0.8296 | 0.8297 | 0.8297 |
| 2 | 0.8606 | 0.8607 | 0.8607 |
| 3 | 0.8921 | 0.8921 | 0.8921 |
| 4 | 0.9240 | 0.9241 | 0.9240 |
| 5 | 0.9562 | 0.9564 | 0.9563 |
| 6 | 0.9888 | 0.9890 | 0.9889 |
| 7 | 1.0217 | 1.0218 | 1.0217 |
| 8 | 1.0547 | 1.0549 | 1.0548 |
| 9 | 1.0879 | 1.0880 | 1.0880 |
| 10 | 1.1211 | 1.1212 | 1.1212 |
| 11 | 1.1544 | 1.1544 | 1.1544 |
| 12 | 1.1875 | 1.1874 | 1.1876 |
| 13 | 1.2205 | 1.2203 | 1.2206 |
| 14 | 1.2534 | 1.2530 | 1.2534 |
| 15 | 1.2859 | 1.2853 | 1.2860 |
| 16 | 1.3182 | 1.3173 | 1.3182 |
| 17 | 1.3500 | 1.3488 | 1.3501 |
| 18 | 1.3814 | 1.3798 | 1.3815 |
| 19 | 1.4123 | 1.4103 | 1.4124 |
| 20 | 1.4427 | 1.4402 | 1.4428 |
| 21 | 1.4725 | 1.4695 | 1.4726 |
| 22 | 1.5017 | 1.4981 | 1.5018 |
| 23 | 1.5302 | 1.5259 | 1.5303 |
| 24 | 1.5580 | 1.5530 | 1.5582 |
| 25 | 1.5851 | 1.5794 | 1.5852 |
| 26 | 1.6115 | 1.6049 | 1.6116 |
| 27 | 1.6370 | 1.6297 | 1.6371 |
| 28 | 1.6618 | 1.6535 | 1.6619 |
| 29 | 1.6857 | 1.6766 | 1.6859 |
| 30 | 1.7089 | 1.6988 | 1.7090 |
| 31 | 1.7312 | 1.7202 | 1.7313 |
| 32 | 1.7527 | 1.7407 | 1.7528 |
| 33 | 1.7733 | 1.7604 | 1.7735 |
| 34 | 1.7932 | 1.7793 | 1.7934 |
| 35 | 1.8122 | 1.7974 | 1.8125 |
| 36 | 1.8305 | 1.8146 | 1.8308 |
| 37 | 1.8479 | 1.8311 | 1.8483 |
| 38 | 1.8646 | 1.8468 | 1.8650 |
| 39 | 1.8806 | 1.8617 | 1.8809 |
| 40 | 1.8958 | 1.8759 | 1.8962 |
| 41 | 1.9103 | 1.8895 | 1.9107 |
| 42 | 1.9241 | 1.9023 | 1.9245 |
| 43 | 1.9373 | 1.9145 | 1.9376 |
| 44 | 1.9498 | 1.9260 | 1.9501 |
| 45 | 1.9617 | 1.9369 | 1.9620 |
| 46 | 1.9729 | 1.9472 | 1.9733 |
| 47 | 1.9836 | 1.9570 | 1.9839 |
| 48 | 1.9937 | 1.9662 | 1.9940 |
| 49 | 2.0032 | 1.9749 | 2.0036 |
| 50 | 2.0123 | 1.9831 | 2.0126 |
| 51 | 1.9502 | 1.9211 | 1.9506 |
| 52 | 1.8932 | 1.8644 | 1.8936 |
| 53 | 1.8407 | 1.8122 | 1.8410 |
| 54 | 1.7922 | 1.7641 | 1.7925 |
| 55 | 1.7472 | 1.7196 | 1.7476 |
| 56 | 1.7055 | 1.6783 | 1.7058 |
| 57 | 1.6667 | 1.6400 | 1.6669 |
| 58 | 1.6304 | 1.6042 | 1.6307 |
| 59 | 1.5964 | 1.5708 | 1.5967 |
| 60 | 1.5646 | 1.5396 | 1.5649 |
| 61 | 1.5347 | 1.5103 | 1.5350 |
| 62 | 1.5066 | 1.4828 | 1.5069 |
| 63 | 1.4802 | 1.4569 | 1.4805 |
| 64 | 1.4552 | 1.4324 | 1.4555 |
| 65 | 1.4316 | 1.4094 | 1.4318 |
| 66 | 1.4092 | 1.3876 | 1.4095 |
| 67 | 1.3880 | 1.3670 | 1.3883 |
| 68 | 1.3679 | 1.3474 | 1.3682 |
| 69 | 1.3488 | 1.3289 | 1.3490 |
| 70 | 1.3307 | 1.3112 | 1.3308 |
| 71 | 1.3133 | 1.2945 | 1.3135 |
| 72 | 1.2968 | 1.2785 | 1.2970 |
| 73 | 1.2811 | 1.2633 | 1.2813 |
| 74 | 1.2661 | 1.2488 | 1.2663 |
| 75 | 1.2517 | 1.2349 | 1.2519 |
| 76 | 1.2380 | 1.2216 | 1.2381 |
| 77 | 1.2248 | 1.2089 | 1.2250 |
| 78 | 1.2122 | 1.1968 | 1.2124 |
| 79 | 1.2001 | 1.1852 | 1.2003 |
| 80 | 1.1885 | 1.1740 | 1.1887 |
| 81 | 1.1774 | 1.1633 | 1.1775 |
| 82 | 1.1667 | 1.1530 | 1.1668 |
| 83 | 1.1564 | 1.1431 | 1.1566 |
| 84 | 1.1465 | 1.1336 | 1.1467 |
| 85 | 1.1370 | 1.1245 | 1.1371 |
| 86 | 1.1278 | 1.1157 | 1.1280 |
| 87 | 1.1190 | 1.1072 | 1.1191 |
| 88 | 1.1105 | 1.0990 | 1.1106 |
| 89 | 1.1023 | 1.0912 | 1.1024 |
| 90 | 1.0944 | 1.0836 | 1.0945 |
| 91 | 1.0867 | 1.0762 | 1.0868 |
| 92 | 1.0793 | 1.0691 | 1.0794 |
| 93 | 1.0722 | 1.0623 | 1.0723 |
| 94 | 1.0653 | 1.0557 | 1.0654 |
| 95 | 1.0586 | 1.0493 | 1.0587 |
| 96 | 1.0521 | 1.0431 | 1.0522 |
| 97 | 1.0459 | 1.0371 | 1.0460 |
| 98 | 1.0398 | 1.0313 | 1.0399 |
| 99 | 1.0340 | 1.0257 | 1.0341 |
| 100 | 1.0283 | 1.0203 | 1.0284 |
| 101 | 1.0228 | 1.0150 | 1.0229 |
| 102 | 1.0174 | 1.0099 | 1.0175 |
| 103 | 1.0123 | 1.0049 | 1.0123 |
| 104 | 1.0072 | 1.0001 | 1.0073 |
| 105 | 1.0024 | 0.9955 | 1.0024 |
| 106 | 0.9976 | 0.9909 | 0.9977 |
| 107 | 0.9930 | 0.9865 | 0.9931 |
| 108 | 0.9886 | 0.9823 | 0.9887 |
| 109 | 0.9842 | 0.9781 | 0.9843 |
| 110 | 0.9800 | 0.9741 | 0.9801 |
| 111 | 0.9759 | 0.9702 | 0.9760 |
| 112 | 0.9720 | 0.9664 | 0.9720 |
| 113 | 0.9681 | 0.9627 | 0.9682 |
| 114 | 0.9643 | 0.9591 | 0.9644 |
| 115 | 0.9607 | 0.9555 | 0.9607 |
| 116 | 0.9571 | 0.9521 | 0.9572 |
| 117 | 0.9536 | 0.9488 | 0.9537 |
| 118 | 0.9502 | 0.9456 | 0.9503 |
| 119 | 0.9469 | 0.9424 | 0.9470 |
| 120 | 0.9437 | 0.9394 | 0.9438 |
| 121 | 0.9406 | 0.9364 | 0.9407 |
| 122 | 0.9376 | 0.9334 | 0.9376 |
| 123 | 0.9346 | 0.9306 | 0.9347 |
| 124 | 0.9317 | 0.9278 | 0.9318 |
| 125 | 0.9289 | 0.9251 | 0.9290 |
| 126 | 0.9262 | 0.9225 | 0.9262 |
| 127 | 0.9235 | 0.9199 | 0.9235 |
| 128 | 0.9209 | 0.9174 | 0.9209 |
| 129 | 0.9183 | 0.9150 | 0.9184 |
| 130 | 0.9158 | 0.9126 | 0.9159 |
| 131 | 0.9134 | 0.9103 | 0.9135 |
| 132 | 0.9110 | 0.9080 | 0.9111 |
| 133 | 0.9087 | 0.9058 | 0.9088 |
| 134 | 0.9065 | 0.9036 | 0.9065 |
| 135 | 0.9043 | 0.9015 | 0.9043 |
| 136 | 0.9021 | 0.8995 | 0.9022 |
| 137 | 0.9000 | 0.8974 | 0.9001 |
| 138 | 0.8980 | 0.8955 | 0.8980 |
| 139 | 0.8960 | 0.8936 | 0.8960 |
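
The deviations quoted under Result can be checked directly against the table. Below is a minimal sketch in Python, assuming only a handful of rows copied from the table (timesteps 0, 25, 37, 50, 100 and 139), so the maxima it reports hold for that sample only. The helper name `max_abs_deviation` is made up for illustration and is not part of the test setup.

```python
# Sample rows copied from the results table:
# (timestep, cuda native 268b6cd, alpaka openMP 2f5cf96, cuda native bugfix 825c6c7)
rows = [
    (0, 0.7992, 0.7992, 0.7992),
    (25, 1.5851, 1.5794, 1.5852),
    (37, 1.8479, 1.8311, 1.8483),
    (50, 2.0123, 1.9831, 2.0126),
    (100, 1.0283, 1.0203, 1.0284),
    (139, 0.8960, 0.8936, 0.8960),
]


def max_abs_deviation(rows, col_a, col_b):
    """Return (deviation, timestep) for the largest |a - b| over the given rows."""
    return max((abs(r[col_a] - r[col_b]), r[0]) for r in rows)


# Column 1 is the CUDA native reference, column 2 alpaka, column 3 the CUDA bug fix.
alpaka_dev, alpaka_step = max_abs_deviation(rows, 1, 2)
bugfix_dev, bugfix_step = max_abs_deviation(rows, 1, 3)

print(f"alpaka vs. cuda native: {alpaka_dev:.4f} at timestep {alpaka_step}")
print(f"bugfix vs. cuda native: {bugfix_dev:.4f} at timestep {bugfix_step}")
```

To check all 140 timesteps, the same helper can be run over the complete table instead of the sample rows.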