Memory on GPU not cleared after transcription #992
Can you try that in a loop? I.e., repeat the experiment to see whether each model instance that is created and deleted leaves a residue in memory, rather than it being a once-per-runtime issue.
For every model (tiny, medium, large-v2, ...), the residue is the same: 312 MiB. I tried using,
That's likely the memory held by the CUDA runtime, which, IIRC, can't really be freed unless the entire process is killed. Do you notice whether loading/unloading the model a few times in a row always leaves the same amount of memory remaining? If so, it's very likely the runtime.
Yes, always the same amount of memory for all models. |
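The loop experiment suggested above could be sketched like this. This is a minimal sketch, not the exact script used in this issue: the use of `pynvml` for memory readings, GPU index 0, and the `tiny` model are all assumptions.

```python
import gc


def constant_residue(baseline_mib, samples_mib, tol_mib=1):
    """Return True if every post-unload measurement leaves the same
    residue above the pre-load baseline (within tol_mib MiB)."""
    residues = [s - baseline_mib for s in samples_mib]
    return max(residues) - min(residues) <= tol_mib


if __name__ == "__main__":
    # Assumptions: faster-whisper and pynvml are installed, GPU 0 is used.
    from faster_whisper import WhisperModel
    from pynvml import (nvmlInit, nvmlDeviceGetHandleByIndex,
                        nvmlDeviceGetMemoryInfo)

    nvmlInit()
    handle = nvmlDeviceGetHandleByIndex(0)
    used_mib = lambda: nvmlDeviceGetMemoryInfo(handle).used // (1024 ** 2)

    baseline = used_mib()
    samples = []
    for _ in range(5):
        model = WhisperModel("tiny", device="cuda")
        del model
        gc.collect()
        samples.append(used_mib())

    # If the residue is the same after every cycle (e.g. always ~312 MiB),
    # it points at the CUDA runtime context, not a per-instance leak.
    print("constant residue:", constant_residue(baseline, samples))
```

A constant residue across iterations is consistent with the explanation above: the CUDA context is allocated once per process and only returned when the process exits.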
Hi, I have a use case where a script that transcribes an audio file still needs to keep running after it has finished the transcription process.
The question I have is why doesn't the GPU memory get fully cleared after the transcription process is over?
When I try to delete the faster-whisper model object, about 312 MiB of GPU memory remain occupied, but I don't know by what. Below is a sample screenshot from the `nvtop` command and the code to replicate this behaviour. The "running" after the transcription process is imitated by the `time.sleep(20)` call, and during those 20 seconds you can see that the GPU memory is still occupied by those 312 MiB; it is only released when the script has fully finished.
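The behaviour described above can be reproduced with a short script along these lines. This is a minimal sketch under stated assumptions: the audio path `audio.wav`, the `large-v2` model, and watching memory in `nvtop` from a second terminal are not from the original report.

```python
import gc
import time


def residue_mib(before_load_mib, after_cleanup_mib):
    """GPU memory (MiB) still occupied after the model has been deleted."""
    return max(0, after_cleanup_mib - before_load_mib)


if __name__ == "__main__":
    # Assumptions: faster-whisper is installed and audio.wav exists.
    from faster_whisper import WhisperModel

    model = WhisperModel("large-v2", device="cuda", compute_type="float16")
    segments, info = model.transcribe("audio.wav")
    for segment in segments:  # transcription is lazy; iterate to run it
        pass

    del model
    gc.collect()

    # Watch nvtop here: ~312 MiB stay allocated until the process exits,
    # which points at the CUDA runtime context rather than the model.
    time.sleep(20)
```

Note that `model.transcribe()` returns a generator, so the loop over `segments` is what actually performs the transcription; deleting the model before iterating would skip the work entirely.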