Exception raised during cleanup at process exit #1358

Open
patricksuo opened this issue Feb 7, 2025 · 4 comments

Comments


patricksuo commented Feb 7, 2025

Environment:

  • Mac (Apple silicon)
  • Python 3.12.8
  • FlagEmbedding==1.3.3

Exception ignored in: <function AbsEmbedder.__del__ at 0x11bb863e0>
Traceback (most recent call last):
  File "/xxx/.venv/lib/python3.12/site-packages/FlagEmbedding/abc/inference/AbsEmbedder.py", line 270, in __del__
  File "/xxx/.venv/lib/python3.12/site-packages/FlagEmbedding/abc/inference/AbsEmbedder.py", line 89, in stop_self_pool
TypeError: 'NoneType' object is not callable

patricksuo commented Feb 7, 2025

__del__() can be executed during interpreter shutdown. As a consequence, the global variables it needs to access (including other modules) may already have been deleted or set to None. Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.

    def stop_self_pool(self):
        if self.pool is not None:
            self.stop_multi_process_pool(self.pool)
            self.pool = None
        try:
            self.model.to('cpu')
            torch.cuda.empty_cache()
        except:
            pass
        gc.collect()

At process exit this cleanup code runs, but by then gc.collect has already been set to None, so the exception is raised.
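One possible way to make such a cleanup method tolerate interpreter shutdown is sketched below. It is only an illustration under stated assumptions, not the FlagEmbedding implementation: `Embedder`, `cleanup_calls`, and the `_collect` parameter are hypothetical names. The idea is to capture `gc.collect` as a default argument when the method is defined, so the call no longer depends on a lookup that may yield None during shutdown, and to skip the call if the captured value is not callable.

```python
import gc


class Embedder:
    """Hypothetical stand-in for a class that owns a worker pool."""

    def __init__(self):
        self.pool = object()    # placeholder for a real multi-process pool
        self.cleanup_calls = 0  # bookkeeping for this example only

    def stop_self_pool(self, _collect=gc.collect):
        # The default argument holds a direct reference to the function
        # object, taken when the method was defined.
        if self.pool is not None:
            self.pool = None
        self.cleanup_calls += 1
        # Skip garbage collection if the callable is unavailable.
        if callable(_collect):
            _collect()

    def __del__(self):
        # Errors in __del__ are only reported as "Exception ignored in",
        # so swallow them here instead of printing a traceback at exit.
        try:
            self.stop_self_pool()
        except Exception:
            pass


embedder = Embedder()
embedder.stop_self_pool()               # normal cleanup
embedder.stop_self_pool(_collect=None)  # simulates gc.collect being None
```

The second call passes None explicitly to mimic the shutdown state described above; the method completes without raising.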

hanhainebula (Collaborator) commented

Hi @patricksuo, thanks for raising this! We haven't run into this kind of problem in our tests. Could you share a code snippet that helps us reproduce it? Thanks!


patricksuo commented Feb 14, 2025

@hanhainebula
This is how I reproduce it locally:
1) Load the model
2) Store the model in a global variable
3) Call encode_queries
4) Exit the process

from langchain_text_splitters import CharacterTextSplitter
from FlagEmbedding import BGEM3FlagModel


# Module-level global: it is only destroyed during interpreter shutdown.
model = BGEM3FlagModel('BAAI/bge-m3', return_dense=True, return_sparse=True)


def search(query: str):
    query_vector = model.encode_queries([query], batch_size=8, return_dense=True, return_sparse=True, convert_to_numpy=True)['dense_vecs']
    return query_vector


search("xxx")

At process exit, gc.collect has been set to None.

patricksuo (Author) commented

I tried again today and, oddly, this may not be a reliably reproducible problem.
