Improve accuracy of 360 stitching memory estimation #363

yousefmoazzam · 2024-06-11T13:28:50Z

Fixes #354

The main changes to the memory estimation are:

the data shape change is accounted for (in particular, the sino width changes)
several copies of a subset of the input array are accounted for

Acceptance criteria checklist:

Newly introduced memory hook tests for sino_360_to_180 pass
The variable names in the memory estimation code + code comments when connecting to the method code are understandable to other devs

Note that the memory hook tests added for `sino_360_to_180` are failing, so further tweaks are needed.

dkazanc · 2024-06-13T14:05:28Z

Thanks, all seem clear me. Pretty memory-hungry one actually.

yousefmoazzam · 2024-06-13T14:16:02Z

One pretty simple way to reduce the amount of memory used is to make the weights array be float32 rather than float64; doing that would halve the amount of memory used by those 4 copies mentioned in the code comments.

I'll make an issue in the httomolibgpu repo so then we can address that at some point 🙂

dkazanc · 2024-06-13T14:41:01Z

One pretty simple way to reduce the amount of memory used is to make the weights array be float32 rather than float64; doing that would halve the amount of memory used by those 4 copies mentioned in the code comments.

I'll make an issue in the httomolibgpu repo so then we can address that at some point 🙂

Thanks, there is no reason to keep it 64bit.

dkazanc · 2024-07-09T14:40:24Z

FYI, I'm fixing the data type thing on the httomolibgpu side of things + adding a test. Wasn't there apparently. Then we can merge

yousefmoazzam · 2024-07-09T15:24:17Z

Cool, thanks! When that's done I'll change stuff here to reflect the lower memory usage, and will check the memory hook test is still passing.

dkazanc · 2024-07-09T15:39:56Z

@yousefmoazzam memory hook test now pass (couldn't run 2160-80 as OOM on this machine), feel free to add/edit anything and merge with this one?

yousefmoazzam · 2024-07-09T15:53:02Z

Ok that's fine, the 2160-80 passes on a wilson node, which is good.

I'll also run the 360 pipeline involving a recon method (gridrec I think?) which originally brought us to this problem, to verify it all works with that small change in the context of a proper run, and then will merge both PR's if all looks good.

yousefmoazzam · 2024-07-10T10:49:05Z

Annoyingly, I'm getting a segfault in the reslice for some reason when running the pipeline bench_recon_gridrec_cpu360.yaml with 4 MPI processes, so I can't verify if it works in the context of a 4 process run.

With 2 MPI processes things are all fine (no segfault in reslice, and the sino_360_to_180 max slices estimaiton must be working ok since there's no CUDA OOM error).

I feel like that's sufficient evidence that this is working fine (though further investigation into why segfault in reslice is happening once more for 360 data should be done...), what do you think @dkazanc?

dkazanc · 2024-07-10T13:16:23Z

yes, certainly. We need to know why it is happening. Does it also happen for the GPU pipeline with 360 to 180 conversion or just the CPU one?

dkazanc · 2024-08-08T17:05:42Z

Adding this to be aware of:
When using gridrec the segfault or a "stall" can happen because gridrec relies on FFTs and it is RAM-hungry method. The GPU function sino_360_to_180 will define the number of slices to process and pass them to gridrec. This can result in a memory overflow on P100 nodes. It does work on the nodes where more memory available.
It is preferable to use FBP GPU method where possible.

yousefmoazzam added 3 commits June 11, 2024 14:22

Add initial memory estimator for sino_360_to_180

3ebca27

Note that the memory hook tests added for `sino_360_to_180` are failing, so further tweaks are needed.

Account for array copies in 360 sino stitching memory estimator

371b34b

Add det x parameterisation to sino stitching memory hook test

6e066b8

float32 for weights in memory estimator

e747ac7

dkazanc changed the base branch from main to dev July 11, 2024 10:50

Base automatically changed from dev to main August 8, 2024 14:13

Merge branch 'main' into 360-stitching-memory-estimation

c9020e0

dkazanc merged commit 25945bb into main Aug 8, 2024
2 of 3 checks passed

dkazanc deleted the 360-stitching-memory-estimation branch August 8, 2024 17:07

yousefmoazzam mentioned this pull request Aug 15, 2024

sino_360_to_180 breaks in httomo if its memory estimator is called DiamondLightSource/httomolibgpu#107

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve accuracy of 360 stitching memory estimation #363

Improve accuracy of 360 stitching memory estimation #363

yousefmoazzam commented Jun 11, 2024 •

edited by dkazanc

Loading

dkazanc commented Jun 13, 2024

yousefmoazzam commented Jun 13, 2024

dkazanc commented Jun 13, 2024

dkazanc commented Jul 9, 2024

yousefmoazzam commented Jul 9, 2024

dkazanc commented Jul 9, 2024

yousefmoazzam commented Jul 9, 2024

yousefmoazzam commented Jul 10, 2024

dkazanc commented Jul 10, 2024

dkazanc commented Aug 8, 2024

Improve accuracy of 360 stitching memory estimation #363

Improve accuracy of 360 stitching memory estimation #363

Conversation

yousefmoazzam commented Jun 11, 2024 • edited by dkazanc Loading

dkazanc commented Jun 13, 2024

yousefmoazzam commented Jun 13, 2024

dkazanc commented Jun 13, 2024

dkazanc commented Jul 9, 2024

yousefmoazzam commented Jul 9, 2024

dkazanc commented Jul 9, 2024

yousefmoazzam commented Jul 9, 2024

yousefmoazzam commented Jul 10, 2024

dkazanc commented Jul 10, 2024

dkazanc commented Aug 8, 2024

yousefmoazzam commented Jun 11, 2024 •

edited by dkazanc

Loading