Runtime error trying to run TS1 model #56

Open
fedorov opened this issue Apr 26, 2024 · 6 comments

@fedorov

fedorov commented Apr 26, 2024

I tried to run the TS1 model using the extension from Slicer 5.7.0-2024-04-24 r32828 / 4a60ea6 on the CT series from the NLST collection and ran into the error below. Is this expected? What are the RAM requirements for this model?

Downloading model: 306.4MB / 309.0MB (99.2%)
Download finished. Extracting to /home/exouser/.MONAIAuto3DSeg/models/whole-body-v1.0.0...
Cleaning up temporary model download folder...
Processing started
Writing input file to /tmp/Slicer-exouser/__SlicerTemp__2024-04-26_21+59+07.664/input-volume0.nrrd
Creating segmentations with MONAIAuto3DSeg AI...
Auto3DSeg command: ['/home/exouser/Desktop/Slicer-5.7.0-2024-04-24-linux-amd64/bin/../bin/PythonSlicer', '/home/exouser/Desktop/Slicer-5.7.0-2024-04-24-linux-amd64/slicer.org/Extensions-32828/MONAIAuto3DSeg/lib/Slicer-5.7/qt-scripted-modules/Scripts/auto3dseg_segresnet_inference.py', '--model-file', '/home/exouser/.MONAIAuto3DSeg/models/whole-body-v1.0.0/model.pt', '--image-file', '/tmp/Slicer-exouser/__SlicerTemp__2024-04-26_21+59+07.664/input-volume0.nrrd', '--result-file', '/tmp/Slicer-exouser/__SlicerTemp__2024-04-26_21+59+07.664/output-segmentation.nrrd']
`apex.normalization.InstanceNorm3dNVFuser` is not installed properly, use nn.InstanceNorm3d instead.
Model epoch 492 metric 0.7895059585571289
Using crop_foreground
Using resample with  resample_resolution [1.5, 1.5, 1.5]
Running Inference ...

  0%|          | 0/12 [00:00<?, ?it/s]Applied workaround for CuDNN issue, install nvrtc.so (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:84.)
Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)

  0%|          | 0/12 [00:00<?, ?it/s]
2024-04-26 21:59:14,440 - INFO - CUDA out of memory. Tried to allocate 5.02 GiB. GPU
2024-04-26 21:59:14,440 - WARNING - GPU stitching failed, buffer 1 dim -1, image dim torch.Size([1, 1, 254, 243, 208]).

0it [00:00, ?it/s]
0it [00:00, ?it/s]
2024-04-26 21:59:15,049 - INFO - CUDA out of memory. Tried to allocate 2.83 GiB. GPU
2024-04-26 21:59:15,049 - WARNING - GPU buffered stitching failed, attempting on CPU, image dim torch.Size([1, 1, 254, 243, 208]).

Processing failed with return code 1
Cleaning up temporary folder.
Processing failed after 11.65 seconds.

Processing finished.

You can access the same CT series that I used by installing the SlicerIDCBrowser extension and entering the following value in the SeriesInstanceUID field of the downloader section of the UI: 1.2.840.113654.2.55.252662084823127974216855931259749568803

@diazandr3s
Collaborator

diazandr3s commented Apr 27, 2024

Hi @fedorov,

Thanks for your feedback.

I've downloaded the sample and performed inference. Here is the result: https://github.com/lassoan/SlicerMONAIAuto3DSeg/assets/11991079/30065149-5775-4b6f-9b20-26ef9c7c5156

I see that the volume you're referring to is 512x512x156 with a spacing of 0.7x0.7x2 mm. After resampling to 1.5 mm, it becomes 254x243x208.

It took around 25 seconds and 17 GB of GPU memory to run. What are the specs of your machine? Can you please try running on the CPU to check whether it works on your end?
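For a rough sense of where the memory goes: the 5.02 GiB allocation that fails in the log above is consistent with a single dense float32 buffer holding one channel per class over the resampled volume. The sketch below is only back-of-the-envelope arithmetic. The channel count of 105 (104 structures plus background) is an assumption and is not taken from the log, and `buffer_gib` is a hypothetical helper, not part of the extension.

```python
# Rough size of a dense per-class output buffer for the resampled volume.
# ASSUMPTION: 105 output channels (104 structures + background), float32 values.
from math import prod


def buffer_gib(spatial_shape, channels, bytes_per_value=4):
    """Return the size in GiB of a dense (channels, *spatial_shape) array."""
    return prod(spatial_shape) * channels * bytes_per_value / 2**30


resampled_shape = (254, 243, 208)  # from the log: torch.Size([1, 1, 254, 243, 208])
num_channels = 105                 # assumed, see the note above

size = buffer_gib(resampled_shape, num_channels)
print(f"{size:.2f} GiB")  # 5.02 GiB
```

This covers only one buffer. Model weights, intermediate activations, and any additional stitching buffers come on top of it, so the total footprint is expected to be several times larger.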

BTW, this is the very first version of the whole-body CT segmentation model on TSV1. I'm working on a second version, which is potentially more accurate :)

Thanks!

@fedorov
Author

fedorov commented Apr 29, 2024

Here are the specs:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  GRID A100X-8C                  On  | 00000000:04:00.0 Off |                    0 |
| N/A   N/A    P0              N/A /  N/A |      1MiB /  8192MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
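The table above reports 8192 MiB in total on this virtual GPU slice, which is well below the roughly 17 GB reported earlier in the thread for a successful run. A minimal sketch of that comparison follows. It parses the memory field from the `nvidia-smi` row shown above; `parse_memory_mib` is a hypothetical helper, and the 17 GB figure is taken as an approximate requirement rather than a measured minimum.

```python
import re


def parse_memory_mib(smi_row):
    """Extract (used_mib, total_mib) from an nvidia-smi table row."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_row)
    if match is None:
        raise ValueError("no 'used / total' memory field found")
    return int(match.group(1)), int(match.group(2))


row = "| N/A   N/A    P0              N/A /  N/A |      1MiB /  8192MiB |      0%      Default |"
used_mib, total_mib = parse_memory_mib(row)

free_gib = (total_mib - used_mib) / 1024
needed_gib = 17e9 / 2**30  # approximate figure quoted in this thread, not a hard limit

print(f"free: {free_gib:.2f} GiB, needed: about {needed_gib:.1f} GiB")
print("likely fits on GPU:", free_gib >= needed_gib)
```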

Some other models I tried on that same volume worked without errors. Below is "Whole body segmentation - TS1 quick".

[Screenshot: 2024-04-28_22-36-19]

I tried on CPU, and it errored as well:

Processing started
Writing input file to /tmp/Slicer-exouser/__SlicerTemp__2024-04-29_02+33+26.814/input-volume0.nrrd
Creating segmentations with MONAIAuto3DSeg AI...
Auto3DSeg command: ['/home/exouser/Desktop/Slicer-5.7.0-2024-04-24-linux-amd64/bin/../bin/PythonSlicer', '/home/exouser/Desktop/Slicer-5.7.0-2024-04-24-linux-amd64/slicer.org/Extensions-32828/MONAIAuto3DSeg/lib/Slicer-5.7/qt-scripted-modules/Scripts/auto3dseg_segresnet_inference.py', '--model-file', '/home/exouser/.MONAIAuto3DSeg/models/whole-body-v1.0.0/model.pt', '--image-file', '/tmp/Slicer-exouser/__SlicerTemp__2024-04-29_02+33+26.814/input-volume0.nrrd', '--result-file', '/tmp/Slicer-exouser/__SlicerTemp__2024-04-29_02+33+26.814/output-segmentation.nrrd']
Additional environment variables: {'CUDA_VISIBLE_DEVICES': '-1'}
`apex.normalization.InstanceNorm3dNVFuser` is not installed properly, use nn.InstanceNorm3d instead.
User provided device_type of 'cuda', but CUDA is not available. Disabling
Model epoch 492 metric 0.7895059585571289
Using crop_foreground
Using resample with  resample_resolution [1.5, 1.5, 1.5]
Running Inference ...

  0%|          | 0/12 [00:00<?, ?it/s]
Processing failed with return code 1
Cleaning up temporary folder.
Processing failed after 42.12 seconds.

Processing finished.
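For context on the CPU run: the log shows the inference script being launched with `CUDA_VISIBLE_DEVICES` set to `-1`, which hides all GPUs from the child process so that PyTorch reports CUDA as unavailable. The sketch below shows that general technique with a stand-in child process. It is not the extension's actual launcher code, and `cpu_only_env` is a hypothetical helper.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


# Stand-in child process: it only echoes the variable it received.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=cpu_only_env(),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # -1
```

Copying the environment instead of modifying `os.environ` keeps the setting local to the child process, so the parent application can still use the GPU for other tasks.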

It is not in my critical path - I am just reporting in case this helps with your development. No urgency!

@diazandr3s
Collaborator

Thanks again for the feedback, @fedorov.
I'm glad to see the quick version worked on your end.
Strangely, the high-resolution model didn't work on the CPU. How much RAM does this instance have?

@fedorov
Author

fedorov commented Apr 29, 2024

The machine has 15 GB of CPU RAM.

@lassoan
Owner

lassoan commented Apr 29, 2024

For a full-resolution model, 15 GB of total CPU RAM can be really tight. What operating system are you using? How much virtual memory is available?

@fedorov
Author

fedorov commented Apr 29, 2024

If you need anything beyond the output below, let me know what I should run!

exouser@morally-feasible-reindeer-gpu:~$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 10356748 132452 2903196    0    0    42    15   54   99  1  0 99  0  0

exouser@morally-feasible-reindeer-gpu:~$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
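To read the `vmstat` output above in more familiar units: by default `vmstat` reports memory in KiB, and its `swpd` column is the amount of swap currently in use, not the amount configured. The sketch below converts the values from the output above. `parse_vmstat` is a hypothetical helper that assumes the default two-header-line layout shown here.

```python
def parse_vmstat(text):
    """Map vmstat column names to integer values from its default output."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    names = lines[-2].split()                      # column-name header row
    values = [int(v) for v in lines[-1].split()]   # single data row
    return dict(zip(names, values))


vmstat_output = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 10356748 132452 2903196    0    0    42    15   54   99  1  0 99  0  0
"""

stats = parse_vmstat(vmstat_output)
kib_per_gib = 1024**2

print(f"free:  {stats['free'] / kib_per_gib:.2f} GiB")   # free:  9.88 GiB
print(f"cache: {stats['cache'] / kib_per_gib:.2f} GiB")  # cache: 2.77 GiB
print(f"swap in use: {stats['swpd']} KiB")               # swap in use: 0 KiB
```

Since `swpd` being 0 only shows that no swap is in use, a command such as `free -h` or `swapon --show` would be needed to see how much swap is actually configured.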
