adds lprec memory estimation and a test #348

Open
wants to merge 1 commit into main
Conversation

@dkazanc (Collaborator) commented May 28, 2024:

It also fixes the issue with the FBP memoryhook tests failing.

@yousefmoazzam (Collaborator) left a comment:
We know that the memory hook tests for methods using FFTs can be tricky to get passing on all GPUs that are likely to be used with httomo.

The GPU in ws448 is a Quadro RTX 5000, a model closer to a V100 than to a P100 in terms of how new it is (and thus how it may handle FFTs).

So this is just a note that the failures on the Quadro RTX 5000 may mean the tests would also fail on a V100, which could in turn mean slightly inaccurate memory estimation when running httomo on V100s.

Comment on lines +91 to +97
3 * in_slice_size
+ filtered_in_data
+ freq_slice
+ fftplan_size
+ 3.5 * astra_out_size
+ astra_out_size
)
return (tot_memory_bytes, filter_size)
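
For context, the excerpted estimate sums the sizes of the main intermediate buffers (input slices, filtered data, the frequency-domain slice, the cuFFT plan work area, and the ASTRA output with a 3.5x safety multiplier). Below is a minimal sketch of how such an estimator could be put together as a complete function; the function name and the per-buffer formulas are illustrative assumptions, not httomolibgpu's actual implementation.

import numpy as np

def _estimate_FBP_memory_bytes(non_slice_dims_shape, dtype, recon_size):
    # Hedged sketch only: the buffer formulas below are assumptions for illustration.
    angles, det_x = non_slice_dims_shape
    in_slice_size = angles * det_x * np.dtype(dtype).itemsize
    filtered_in_data = angles * det_x * np.float32().itemsize
    # rFFT along the detector axis keeps det_x // 2 + 1 complex64 values
    freq_slice = angles * (det_x // 2 + 1) * np.complex64().itemsize
    fftplan_size = freq_slice  # assume the cuFFT work area is comparable to the data
    astra_out_size = recon_size * recon_size * np.float32().itemsize
    filter_size = (det_x // 2 + 1) * np.float32().itemsize
    tot_memory_bytes = int(
        3 * in_slice_size
        + filtered_in_data
        + freq_slice
        + fftplan_size
        + 3.5 * astra_out_size  # speculative safety multiplier discussed below
        + astra_out_size
    )
    return (tot_memory_bytes, filter_size)
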
Collaborator:

On ws448, on the fourier branch, two FBP memory hook tests now fail for me:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (fourier)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 6 items

tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-3] PASSED                 [ 16%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-5] PASSED                 [ 33%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-8] PASSED                 [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] FAILED                 [ 66%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] FAILED                 [ 83%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-8] PASSED                 [100%]

================================================ FAILURES ================================================
___________________________________ test_recon_FBP_memoryhook[2560-3] ____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 449, in test_recon_FBP_memoryhook
    assert estimated_memory_mb >= max_mem_mb
AssertionError: assert 444.47 >= 491.7
___________________________________ test_recon_FBP_memoryhook[2560-5] ____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 449, in test_recon_FBP_memoryhook
    assert estimated_memory_mb >= max_mem_mb
AssertionError: assert 740.78 >= 752.81
======================================== short test summary info =========================================
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] - assert 444.47 >= 4...
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] - assert 740.78 >= 7...
====================================== 2 failed, 4 passed in 5.11s =======================================

Whereas on main, all FBP memory hook tests were passing:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (main)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 6 items

tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-3] PASSED                 [ 16%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-5] PASSED                 [ 33%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-8] PASSED                 [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] PASSED                 [ 66%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] PASSED                 [ 83%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-8] PASSED                 [100%]

=========================================== 6 passed in 4.03s ============================================
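
For readers unfamiliar with these tests: the assertion estimated_memory_mb >= max_mem_mb compares the estimator's prediction against the peak memory recorded by a hook while the method runs. A minimal sketch of such a peak-tracking hook built on CuPy's MemoryHook is below; it illustrates the idea and is not necessarily the exact hook used in httomo's test suite.

import cupy as cp

class MaxMemoryHook(cp.cuda.MemoryHook):
    # Sketch: track the peak number of bytes taken from CuPy's memory pool.
    name = "MaxMemoryHook"

    def __init__(self):
        self.current_bytes = 0
        self.max_bytes = 0

    def malloc_postprocess(self, **kwargs):
        self.current_bytes += kwargs["mem_size"]
        self.max_bytes = max(self.max_bytes, self.current_bytes)

    def free_postprocess(self, **kwargs):
        self.current_bytes -= kwargs["mem_size"]

# Typical usage in a memoryhook test (simplified; run_the_gpu_method is a
# hypothetical stand-in for the actual method call):
#     hook = MaxMemoryHook()
#     with hook:
#         run_the_gpu_method()
#     max_mem_mb = hook.max_bytes / 1024**2
#     assert estimated_memory_mb >= max_mem_mb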

@dkazanc (author) replied:

yes, as we discussed, this multiplier is speculative as I had to increase it for larger data during benchmarks. Let us start with the memory estimator for it in a separate branch.
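
For scale, a hedged back-of-the-envelope on what the 3.5x multiplier buys, assuming astra_out_size is a single float32 reconstruction slice of recon_size x recon_size (an assumption for illustration, not something stated in this PR):

recon_size = 2560
astra_out_size = recon_size * recon_size * 4       # one float32 slice: 26,214,400 bytes
print(astra_out_size / 1024**2)                    # 25.0 MB per slice
print(3.5 * astra_out_size / 1024**2)              # 87.5 MB with the 3.5x multiplier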

Collaborator:

Ok, I'm happy for the FBP memory estimation to be dealt with in a separate PR. In that case, shall we leave FBP alone in this PR and keep only the LPRec changes here?

I.e., shall we drop the FBP memory estimation changes from here and move them to the dedicated FBP PR, since they're "incomplete" anyway?

@dkazanc (author) replied:

yep, sounds good. We should remove the FBP part. Sorry, I thought it would be a quick fix, so I put it here.

Collaborator:

No problem, this time I remembered to run the tests before reviewing so it was easy enough to catch, compared to the past times where I forgot to run the tests... 😅


@pytest.mark.cupy
@pytest.mark.parametrize("slices", [4, 7])
def test_recon_LPRec_memoryhook(slices, ensure_clean_memory):
Collaborator:

On ws448, one of the two LPRec memory hook tests seems to fail:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (fourier)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 2 items

tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[4] FAILED                    [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[7] PASSED                    [100%]

================================================ FAILURES ================================================
_____________________________________ test_recon_LPRec_memoryhook[4] _____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 488, in test_recon_LPRec_memoryhook
    assert percents_relative_maxmem <= 20
AssertionError: assert 23 <= 20
======================================== short test summary info =========================================
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[4] - assert 23 <= 20
====================================== 1 failed, 1 passed in 3.66s =======================================
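
The failing assertion checks how far the estimate overshoots the measured peak. Below is a sketch of that kind of tolerance check, assuming the relative difference is taken against the measured peak (the exact formula in the test may differ):

def check_memory_estimate(estimated_memory_mb, max_mem_mb, tolerance_pct=20):
    # Sketch of the two-sided check the memoryhook tests appear to apply:
    # never underestimate, and do not overshoot the measured peak by more
    # than tolerance_pct percent.
    percents_relative_maxmem = round(
        100 * (estimated_memory_mb - max_mem_mb) / max_mem_mb
    )
    assert estimated_memory_mb >= max_mem_mb
    assert percents_relative_maxmem <= tolerance_pct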

@dkazanc (author) replied:

Overall, I think the memory estimator is far from ideal, so we can hold off on that PR for now. I just wanted to show its potential during the benchmarks. I'll come back to it again after dealing with FBP. thx
