Adds LPRec memory estimation and a test #348

Open · wants to merge 1 commit into base: main
@@ -120,6 +120,15 @@ recon:
    - datasets: [tomo]
    - multipliers: [None]
    - methods: [module]
  LPRec:
    pattern: sinogram
    output_dims_change: True
    implementation: gpu_cupy
    save_result_default: True
    memory_gpu:
      - datasets: [tomo]
      - multipliers: [None]
      - methods: [module]
  SIRT:
    pattern: sinogram
    output_dims_change: True
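For context, the memory_gpu entry with methods: [module] appears to tell httomo to delegate the GPU memory estimate to a supporting function named after the method; this PR adds _calc_memory_bytes_LPRec and _calc_output_dim_LPRec following that convention (see the next file). A minimal, hypothetical sketch of such a name-based lookup, not httomo's actual code:

# Hypothetical sketch only; httomo's real resolution logic may differ.
def lookup_memory_estimator(supporting_module, method_name: str):
    # Resolve the _calc_memory_bytes_<Method> naming convention used in this PR.
    return getattr(supporting_module, f"_calc_memory_bytes_{method_name}")

# e.g. lookup_memory_estimator(algorithm_supporting_funcs, "LPRec") -> _calc_memory_bytes_LPRec
# (algorithm_supporting_funcs is a placeholder for the module that defines these functions)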
@@ -27,9 +27,11 @@

__all__ = [
    "_calc_memory_bytes_FBP",
    "_calc_memory_bytes_LPRec",
    "_calc_memory_bytes_SIRT",
    "_calc_memory_bytes_CGLS",
    "_calc_output_dim_FBP",
    "_calc_output_dim_LPRec",
    "_calc_output_dim_SIRT",
    "_calc_output_dim_CGLS",
]
@@ -53,6 +55,10 @@ def _calc_output_dim_FBP(non_slice_dims_shape, **kwargs):
    return __calc_output_dim_recon(non_slice_dims_shape, **kwargs)


def _calc_output_dim_LPRec(non_slice_dims_shape, **kwargs):
    return __calc_output_dim_recon(non_slice_dims_shape, **kwargs)


def _calc_output_dim_SIRT(non_slice_dims_shape, **kwargs):
    return __calc_output_dim_recon(non_slice_dims_shape, **kwargs)

@@ -82,11 +88,82 @@ def _calc_memory_bytes_FBP(
    astra_out_size = np.prod(output_dims) * np.float32().itemsize

    tot_memory_bytes = int(
-       2 * in_slice_size
+       3 * in_slice_size
        + filtered_in_data
        + freq_slice
        + fftplan_size
-       + 3.5 * astra_out_size
+       + astra_out_size
    )
    return (tot_memory_bytes, filter_size)
Comment on lines +91 to +97
Collaborator:
On ws448, on the fourier branch, two FBP memory hook tests now fail for me:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (fourier)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 6 items

tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-3] PASSED                 [ 16%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-5] PASSED                 [ 33%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-8] PASSED                 [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] FAILED                 [ 66%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] FAILED                 [ 83%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-8] PASSED                 [100%]

================================================ FAILURES ================================================
___________________________________ test_recon_FBP_memoryhook[2560-3] ____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 449, in test_recon_FBP_memoryhook
    assert estimated_memory_mb >= max_mem_mb
AssertionError: assert 444.47 >= 491.7
___________________________________ test_recon_FBP_memoryhook[2560-5] ____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 449, in test_recon_FBP_memoryhook
    assert estimated_memory_mb >= max_mem_mb
AssertionError: assert 740.78 >= 752.81
======================================== short test summary info =========================================
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] - assert 444.47 >= 4...
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] - assert 740.78 >= 7...
====================================== 2 failed, 4 passed in 5.11s =======================================

Whereas on main, all FBP memory hook tests were passing:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (main)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 6 items

tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-3] PASSED                 [ 16%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-5] PASSED                 [ 33%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[1200-8] PASSED                 [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-3] PASSED                 [ 66%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-5] PASSED                 [ 83%]
tests/test_backends/test_httomolibgpu.py::test_recon_FBP_memoryhook[2560-8] PASSED                 [100%]

=========================================== 6 passed in 4.03s ============================================
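For context (an illustrative sketch, not part of the PR): the failing assertion encodes the acceptance criterion these memory-hook tests apply, visible in the test code further down. The estimate must be at least the peak memory measured by MaxMemoryHook, but must not overshoot it by more than the allowed percentage. Plugging in the numbers from the [2560-3] failure above:

# Illustration only, using the figures reported in the failing run above.
estimated_memory_mb = 444.47  # slices * per-slice estimate from _calc_memory_bytes_FBP, in MB
max_mem_mb = 491.70           # peak measured by MaxMemoryHook, minus subtract_bytes, in MB

assert estimated_memory_mb >= max_mem_mb  # fails here: the estimate undershoots the measured peak

difference_mb = abs(estimated_memory_mb - max_mem_mb)
percents_relative_maxmem = round((difference_mb / max_mem_mb) * 100)  # ~10 for these numbers
assert percents_relative_maxmem <= 20  # caps how far the estimate may overshoot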

Collaborator (Author):
Yes, as we discussed, this multiplier is speculative; I had to increase it for larger data during benchmarks. Let's handle the memory estimator for it in a separate branch.

Collaborator:
OK, I'm happy for the FBP memory estimation to be dealt with in a separate PR. In that case, shall we leave the FBP memory estimation alone in this PR and keep only the LPRec changes here?

That is, shall we drop the FBP memory estimation changes from this PR and move them into the PR for FBP memory estimation, since the FBP changes here are "incomplete" anyway?

Collaborator (Author):
Yep, sounds good. We should remove the FBP part. Sorry, I thought it would be a quick fix, so I put it here.

Collaborator:
No problem. This time I remembered to run the tests before reviewing, so it was easy enough to catch, unlike the past times when I forgot to run them... 😅



def _calc_memory_bytes_LPRec(
    non_slice_dims_shape: Tuple[int, int],
    dtype: np.dtype,
    **kwargs,
) -> Tuple[int, int]:
    angles_tot = non_slice_dims_shape[0]
    DetectorsLengthH = non_slice_dims_shape[1]
    # calculate the output shape
    output_dims = _calc_output_dim_LPRec(non_slice_dims_shape, **kwargs)

    in_slice_size = np.prod(non_slice_dims_shape) * dtype.itemsize
    out_slice_size = np.prod(DetectorsLengthH * DetectorsLengthH) * dtype.itemsize

    # interpolation kernels
    grid_size = np.prod(DetectorsLengthH * DetectorsLengthH) * np.float32().nbytes
    phi = grid_size

    eps = 1e-3  # accuracy of usfft
    mu = -np.log(eps) / (2 * DetectorsLengthH * DetectorsLengthH)
    m = int(
        np.ceil(
            2
            * DetectorsLengthH
            * 1
            / np.pi
            * np.sqrt(
                -mu * np.log(eps)
                + (mu * DetectorsLengthH) * (mu * DetectorsLengthH) / 4
            )
        )
    )
    oversampling_level = 2
    tmp_oversample_size = (
        np.prod(angles_tot * oversampling_level * DetectorsLengthH)
        * np.float32().nbytes
    )

    data_c_size = np.prod(0.5 * angles_tot * DetectorsLengthH) * np.complex64().itemsize

    fde_size = (
        (2 * m + 2 * DetectorsLengthH) * (2 * m + 2 * DetectorsLengthH)
    ) * np.complex64().itemsize

    fde2_size = (
        (2 * DetectorsLengthH) * (2 * DetectorsLengthH)
    ) * np.complex64().itemsize

    c2dfftshift_slice_size = (
        np.prod(4 * DetectorsLengthH * DetectorsLengthH) * np.int8().nbytes
    )

    filter_size = (DetectorsLengthH // 2 + 1) * np.float32().itemsize
    freq_slice = angles_tot * (DetectorsLengthH + 1) * np.complex64().itemsize
    fftplan_size = freq_slice * 2

    tot_memory_bytes = int(
        in_slice_size
        + out_slice_size
        + 2 * grid_size
        + phi
        + tmp_oversample_size
        + data_c_size
        + fde_size
        + fde2_size
        + c2dfftshift_slice_size
        + freq_slice
        + fftplan_size
    )
    return (tot_memory_bytes, filter_size)
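For a sense of scale (an editorial illustration, not part of the diff): with the geometry used in the new test below, angles_tot = 1801 and DetectorsLengthH = 2560, the usfft kernel size m and the two padded complex Fourier grids dominate the per-slice estimate:

import numpy as np

eps = 1e-3
N = 2560          # DetectorsLengthH for the test data below
angles_tot = 1801

mu = -np.log(eps) / (2 * N * N)
m = int(np.ceil(2 * N / np.pi * np.sqrt(-mu * np.log(eps) + (mu * N) ** 2 / 4)))
# m evaluates to 4 for this N

fde_size = (2 * m + 2 * N) ** 2 * np.complex64().itemsize   # ~210 MB
fde2_size = (2 * N) ** 2 * np.complex64().itemsize          # ~210 MB
in_slice_size = angles_tot * N * np.float32().itemsize      # ~18 MB
# the two fde grids alone account for over half of the roughly 0.7 GB per-slice total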

42 changes: 40 additions & 2 deletions tests/test_backends/test_httomolibgpu.py
@@ -27,7 +27,7 @@
)
from httomolibgpu.prep.stripe import remove_stripe_based_sorting, remove_stripe_ti
from httomolibgpu.misc.corr import remove_outlier
-from httomolibgpu.recon.algorithm import FBP, SIRT, CGLS
+from httomolibgpu.recon.algorithm import FBP, LPRec, SIRT, CGLS
from httomolibgpu.misc.rescale import rescale_to_int

from httomo.methods_database.packages.external.httomolibgpu.supporting_funcs.misc.morph import *
@@ -447,7 +447,45 @@ def test_recon_FBP_memoryhook(slices, recon_size_it, ensure_clean_memory):
    # the estimated_memory_mb should be LARGER or EQUAL to max_mem_mb
    # the resulting percent value should not deviate from max_mem on more than 20%
    assert estimated_memory_mb >= max_mem_mb
-   assert percents_relative_maxmem <= 35
+   assert percents_relative_maxmem <= 20


@pytest.mark.cupy
@pytest.mark.parametrize("slices", [4, 7])
def test_recon_LPRec_memoryhook(slices, ensure_clean_memory):
Collaborator:
On ws448, one of the two LPRec memory hook tests seems to fail:

(/dls/science/users/twi18192/conda-envs/httomo) [twi18192@ws448 httomo (fourier)]$ python -m pytest tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook
========================================== test session starts ===========================================
platform linux -- Python 3.10.10, pytest-7.1.2, pluggy-1.3.0 -- /dls/science/users/twi18192/conda-envs/httomo/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /dls/science/users/twi18192/httomo, configfile: pyproject.toml
plugins: mpi-0.6, cov-4.1.0, typeguard-3.0.2, benchmark-4.0.0, mock-3.12.0
collected 2 items

tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[4] FAILED                    [ 50%]
tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[7] PASSED                    [100%]

================================================ FAILURES ================================================
_____________________________________ test_recon_LPRec_memoryhook[4] _____________________________________
Traceback (most recent call last):
  File "/dls/science/users/twi18192/httomo/tests/test_backends/test_httomolibgpu.py", line 488, in test_recon_LPRec_memoryhook
    assert percents_relative_maxmem <= 20
AssertionError: assert 23 <= 20
======================================== short test summary info =========================================
FAILED tests/test_backends/test_httomolibgpu.py::test_recon_LPRec_memoryhook[4] - assert 23 <= 20
====================================== 1 failed, 1 passed in 3.66s =======================================

Collaborator (Author):
Overall, I think the memory estimator is far from ideal, so we can delay that PR for now. I just wanted to show the potential during the benchmarks. I will come back to it again after dealing with FBP. Thanks.

    data = cp.random.random_sample((1801, slices, 2560), dtype=np.float32)
    kwargs = {}
    kwargs["angles"] = np.linspace(
        0.0 * np.pi / 180.0, 180.0 * np.pi / 180.0, data.shape[0]
    )
    kwargs["center"] = 500
    kwargs["recon_size"] = 2560
    kwargs["recon_mask_radius"] = 0.8

    hook = MaxMemoryHook()
    with hook:
        recon_data = LPRec(cp.copy(data), **kwargs)

    # make sure estimator function is within range (80% min, 100% max)
    max_mem = (
        hook.max_mem
    )  # the amount of memory in bytes needed for the method according to memoryhook

    # now we estimate how much of the total memory required for this data
    (estimated_memory_bytes, subtract_bytes) = _calc_memory_bytes_LPRec(
        (1801, 2560), dtype=np.float32(), **kwargs
    )
    estimated_memory_mb = round(slices * estimated_memory_bytes / (1024**2), 2)
    max_mem -= subtract_bytes
    max_mem_mb = round(max_mem / (1024**2), 2)

    # now we compare both memory estimations
    difference_mb = abs(estimated_memory_mb - max_mem_mb)
    percents_relative_maxmem = round((difference_mb / max_mem_mb) * 100)
    # the estimated_memory_mb should be LARGER or EQUAL to max_mem_mb
    # the resulting percent value should not deviate from max_mem on more than 20%
    assert estimated_memory_mb >= max_mem_mb
    assert percents_relative_maxmem <= 20
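As a usage note (an editorial sketch, not part of the PR): the per-slice figure returned by _calc_memory_bytes_LPRec, together with the fixed subtract_bytes term (the filter), is the kind of number that can feed a decision about how many slices fit in a given GPU memory budget. A hypothetical helper, not httomo's actual API, just to show the arithmetic:

# Hypothetical illustration only; httomo's real slab-sizing logic lives elsewhere.
def max_slices_for_budget(per_slice_bytes: int, fixed_bytes: int, gpu_budget_bytes: int) -> int:
    """Number of sinogram slices that fit when each slice costs per_slice_bytes
    and fixed_bytes (e.g. the reconstruction filter) is counted once."""
    if per_slice_bytes <= 0:
        return 0
    return max(0, (gpu_budget_bytes - fixed_bytes) // per_slice_bytes)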


@pytest.mark.cupy