Assertion error in faiss #36

Open
dinarior opened this issue Feb 29, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@dinarior

Very cool project. While trying to get it to work, I ran into an issue.

Using local embeddings (INSTRUCTOR_Transformer) and a llama.cpp model, any search or chat ends in the following assertion error:

load INSTRUCTOR_Transformer
max_seq_length  512
🔎 Enter a search pattern: preprocessing
⠹ 🤖 Processing...Traceback (most recent call last):
  File "/Users/dinari/.local/bin/codeqai", line 10, in <module>
    sys.exit(main())
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/codeqai/__main__.py", line 5, in main
    app.run()
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/codeqai/app.py", line 177, in run
    similarity_result = vector_store.similarity_search(search_pattern)
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/codeqai/vector_store.py", line 131, in similarity_search
    return self.db.similarity_search(query, k=4)
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 544, in similarity_search
    docs_and_scores = self.similarity_search_with_score(
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 417, in similarity_search_with_score
    docs = self.similarity_search_with_score_by_vector(
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 302, in similarity_search_with_score_by_vector
    scores, indices = self.index.search(vector, k if filter is None else fetch_k)
  File "/Users/dinari/Library/Application Support/pipx/venvs/codeqai/lib/python3.10/site-packages/faiss/class_wrappers.py", line 329, in replacement_search
    assert d == self.d
AssertionError

I'm running on Apple Silicon (arm64).
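
For context, the failing `assert d == self.d` in the faiss wrapper compares the dimensionality of the query vector with the dimensionality the index was built with. The following is a minimal sketch of a pre-flight check that reports both numbers instead of a bare `AssertionError`. The helper name `check_query_dim` and the probe string are hypothetical; the sketch assumes the index exposes its dimension as `d` (as the traceback suggests) and that the embeddings object offers an `embed_query`-style callable returning a flat list of floats. A mismatched index is only one possible cause, not a confirmed diagnosis for this issue.

```python
from typing import Callable, Sequence


def check_query_dim(
    index_dim: int,
    embed_query: Callable[[str], Sequence[float]],
    probe: str = "test",
) -> int:
    """Embed a probe string and compare its length with the index dimension.

    Hypothetical helper: raises ValueError with both dimensions in the
    message, rather than failing later inside faiss with a bare assertion.
    """
    query_dim = len(embed_query(probe))
    if query_dim != index_dim:
        raise ValueError(
            f"Embedding dimension {query_dim} does not match index "
            f"dimension {index_dim}; the index may have been built with a "
            "different embedding model."
        )
    return query_dim


# Assumed usage with a LangChain FAISS store `db` and an embeddings object
# `embeddings` (both names are placeholders, not taken from codeqai):
#   check_query_dim(db.index.d, embeddings.embed_query)
```

If the check fails, rebuilding the persisted index with the currently configured embedding model is one thing worth trying.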

@ghost

ghost commented Mar 3, 2024

Same on an Ubuntu EC2 instance (g54xl):

💬 Ask anything about the codebase: what is this codebase used for?
⠹ 🤖 Processing...Traceback (most recent call last):
  File "/code/codeqai/bin/codeqai", line 8, in <module>
    sys.exit(main())
  File "/code/codeqai/lib/python3.10/site-packages/codeqai/__main__.py", line 5, in main
    app.run()
  File "/code/codeqai/lib/python3.10/site-packages/codeqai/app.py", line 208, in run
    result = qa(question)
  File "/code/codeqai/lib/python3.10/site-packages/langchain_core/_api/deprecation.py", line 145, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
  File "/code/codeqai/lib/python3.10/site-packages/langchain/chains/base.py", line 363, in __call__
    return self.invoke(
  File "/code/codeqai/lib/python3.10/site-packages/langchain/chains/base.py", line 162, in invoke
    raise e
  File "/code/codeqai/lib/python3.10/site-packages/langchain/chains/base.py", line 156, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/code/codeqai/lib/python3.10/site-packages/langchain/chains/conversational_retrieval/base.py", line 155, in _call
    docs = self._get_docs(new_question, inputs, run_manager=_run_manager)
  File "/code/codeqai/lib/python3.10/site-packages/langchain/chains/conversational_retrieval/base.py", line 317, in _get_docs
    docs = self.retriever.get_relevant_documents(
  File "/code/codeqai/lib/python3.10/site-packages/langchain_core/retrievers.py", line 224, in get_relevant_documents
    raise e
  File "/code/codeqai/lib/python3.10/site-packages/langchain_core/retrievers.py", line 217, in get_relevant_documents
    result = self._get_relevant_documents(
  File "/code/codeqai/lib/python3.10/site-packages/langchain_core/vectorstores.py", line 663, in _get_relevant_documents
    docs = self.vectorstore.max_marginal_relevance_search(
  File "/code/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 789, in max_marginal_relevance_search
    docs = self.max_marginal_relevance_search_by_vector(
  File "/code/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 724, in max_marginal_relevance_search_by_vector
    docs_and_scores = self.max_marginal_relevance_search_with_score_by_vector(
  File "/code/codeqai/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 602, in max_marginal_relevance_search_with_score_by_vector
    scores, indices = self.index.search(
  File "/code/codeqai/lib/python3.10/site-packages/faiss/__init__.py", line 308, in replacement_search
    assert d == self.d
AssertionError

@fynnfluegge
Owner

Thanks for reporting. There is a related issue with sentence-transformers: PromtEngineer/localGPT#722

Does this issue happen only when using INSTRUCTOR_Transformer together with llama.cpp?

@fynnfluegge added the bug label on Mar 3, 2024
@ghost

ghost commented Mar 3, 2024

That was the config I was using, yes. (I believe I had other fatal errors when trying either of the other sentence transformer options; I will go back and try to reproduce them.)

I looked at the thread you posted and can try running with sentence-transformers pinned at 2.2.2.
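
Before retesting, it can help to confirm which version is actually installed in the environment that runs the tool. This is a small sketch using only the standard library; the helper names `installed_version` and `is_pinned` are hypothetical, and the script has to be run with the interpreter of the relevant environment (for example the pipx venv) to be meaningful.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_pinned(package: str, expected: str) -> bool:
    """True only if `package` is installed at exactly `expected`."""
    return installed_version(package) == expected


# Prints the installed version, or None when the package is not installed.
print(installed_version("sentence-transformers"))
```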

@dinarior
Author

dinarior commented Mar 4, 2024

I was already using 2.2.2 (injected into the pipx installation). This is the pip list:

aiohttp                   3.9.3
aiosignal                 1.3.1
altair                    5.2.0
annotated-types           0.6.0
anyio                     4.3.0
async-timeout             4.0.3
attrs                     23.2.0
blessed                   1.20.0
blinker                   1.7.0
cachetools                5.3.3
certifi                   2024.2.2
charset-normalizer        3.3.2
click                     8.1.7
codeqai                   0.0.14
dataclasses-json          0.6.4
diskcache                 5.6.3
distro                    1.9.0
editor                    1.6.6
exceptiongroup            1.2.0
faiss-cpu                 1.7.4
filelock                  3.13.1
frozenlist                1.4.1
fsspec                    2024.2.0
gitdb                     4.0.11
GitPython                 3.1.42
h11                       0.14.0
httpcore                  1.0.4
httpx                     0.27.0
huggingface-hub           0.21.3
idna                      3.6
importlib-metadata        7.0.1
inquirer                  3.2.4
InstructorEmbedding       1.0.1
Jinja2                    3.1.3
joblib                    1.3.2
jsonpatch                 1.33
jsonpointer               2.4
jsonschema                4.21.1
jsonschema-specifications 2023.12.1
langchain                 0.1.5
langchain-community       0.0.17
langchain-core            0.1.23
langchain-openai          0.0.5
langsmith                 0.0.87
llama_cpp_python          0.2.53
markdown-it-py            3.0.0
MarkupSafe                2.1.5
marshmallow               3.21.0
mdurl                     0.1.2
mpmath                    1.3.0
multidict                 6.0.5
mypy-extensions           1.0.0
networkx                  3.2.1
nltk                      3.8.1
numpy                     1.26.4
openai                    1.13.3
packaging                 23.2
pandas                    2.2.1
pillow                    10.2.0
pip                       24.0
protobuf                  4.25.3
pyarrow                   15.0.0
pydantic                  2.6.3
pydantic_core             2.16.3
pydeck                    0.8.1b0
Pygments                  2.17.2
python-dateutil           2.8.2
python-dotenv             1.0.1
pytz                      2024.1
PyYAML                    6.0.1
readchar                  4.0.5
referencing               0.33.0
regex                     2023.12.25
requests                  2.31.0
rich                      13.7.1
rpds-py                   0.18.0
runs                      1.2.2
safetensors               0.4.2
scikit-learn              1.4.1.post1
scipy                     1.12.0
sentence-transformers     2.2.2
sentencepiece             0.2.0
setuptools                65.5.0
six                       1.16.0
smmap                     5.0.1
sniffio                   1.3.1
SQLAlchemy                2.0.27
streamlit                 1.31.1
sympy                     1.12
tenacity                  8.2.3
termcolor                 2.4.0
threadpoolctl             3.3.0
tiktoken                  0.5.2
tokenizers                0.15.2
toml                      0.10.2
toolz                     0.12.1
torch                     2.2.1
torchvision               0.17.1
tornado                   6.4
tqdm                      4.66.2
transformers              4.38.1
tree-sitter               0.20.4
tree-sitter-languages     1.10.2
typing_extensions         4.10.0
typing-inspect            0.9.0
tzdata                    2024.1
tzlocal                   5.2
urllib3                   2.2.1
validators                0.22.0
wcwidth                   0.2.13
xmod                      1.8.1
yarl                      1.9.4
yaspin                    3.0.1
zipp                      3.17.0

After switching to another tokenizer, this error did not occur.

@umbrellateng

I also encountered this problem when using the LangChain framework to create an AutoGPT agent, with glm-4 as the model:

> Entering new LLMChain chain...
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/mpnet/modeling_mpnet.py:1054: UserWarning: cumsum_out_mps supported by MPS on MacOS 13+, please upgrade (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UnaryOps.mm:425.)
  incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
Traceback (most recent call last):
  File "/Users/apple/dev/src/github.com/umbrellateng/AGILearn/early/auto_gpt.py", line 63, in <module>
    auto_gpt_learn()
  File "/Users/apple/dev/src/github.com/umbrellateng/AGILearn/early/auto_gpt.py", line 59, in auto_gpt_learn
    agent.run(["Write a weather report for Beijing today"])
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_experimental/autonomous_agents/autogpt/agent.py", line 93, in run
    assistant_reply = self.chain.run(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/_api/deprecation.py", line 145, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 550, in run
    return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/_api/deprecation.py", line 145, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 378, in __call__
    return self.invoke(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 163, in invoke
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 153, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 103, in _call
    response = self.generate([inputs], run_manager=run_manager)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 112, in generate
    prompts, stop = self.prep_prompts(input_list, run_manager=run_manager)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 174, in prep_prompts
    prompt = self.prompt.format_prompt(**selected_inputs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/prompts/chat.py", line 535, in format_prompt
    messages = self.format_messages(**kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_experimental/autonomous_agents/autogpt/prompt.py", line 76, in format_messages
    relevant_docs = memory.get_relevant_documents(str(previous_messages[-10:]))
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/retrievers.py", line 245, in get_relevant_documents
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/retrievers.py", line 238, in get_relevant_documents
    result = self._get_relevant_documents(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/vectorstores.py", line 674, in _get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 530, in similarity_search
    docs_and_scores = self.similarity_search_with_score(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 403, in similarity_search_with_score
    docs = self.similarity_search_with_score_by_vector(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_community/vectorstores/faiss.py", line 304, in similarity_search_with_score_by_vector
    scores, indices = self.index.search(vector, k if filter is None else fetch_k)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/faiss/class_wrappers.py", line 329, in replacement_search
    assert d == self.d
