Bug: segfault on clustering #529

Open
2 of 3 tasks
vibl opened this issue Nov 8, 2024 · 2 comments
Labels
bug Something isn't working

Comments

vibl commented Nov 8, 2024

Describe the bug

I get a segmentation fault when calling index.cluster(), regardless of the min_count and max_count values I pass (and also when I pass no parameters at all).

Small dataset of 80,000 vectors. index.search works great.

Steps to reproduce

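    from usearch.index import Index

    index = Index(
        ndim=768,
        metric='cos',
        dtype='f32'
    )
    # The report omits the step that adds the 80,000 vectors to the index.

    # Round-trip through disk; usearch_index_path is the reporter's own path variable.
    index.save(usearch_index_path)
    index = Index.restore(usearch_index_path)

    # The segmentation fault happens on this call.
    clustering = index.cluster(min_count=10, max_count=15, log=True)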

Expected behavior

It should return a Clustering instance.

USearch version

2.16.2

Operating System

Ubuntu 24.04

Hardware architecture

x86

Which interface are you using?

Python bindings

Contact Details

No response

Are you open to being tagged as a contributor?

  • I am open to being mentioned in the project .git history as a contributor

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
vibl added the bug label Nov 8, 2024
ashvardanian (Contributor) commented

Hi @vibl! Sorry for the delayed response! Can you try the global clustering functionality instead of the one built into the Index Python class? It should work much better.

m12sl commented Nov 29, 2024

Hi!
I have the same problem: I load the index from a file with .load, and clustering then segfaults.

In my case the segfault occurs on this line:

results = self._compiled.cluster_keys(
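For anyone else trying to pin down the crashing line, here is a minimal sketch of one way to do it. It uses only the Python standard library and is not part of USearch. The idea is to run the reproduction in a child interpreter with faulthandler enabled, so the parent survives the crash and can read the Python traceback from stderr. The helper names run_isolated and crashed_with_segfault are made up for illustration, and the snippet below only simulates a crash; swap in the real index.cluster() reproduction. The exit-code check assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    If the child dies from a fatal signal, faulthandler writes the Python
    traceback of the crashing thread to the child's stderr before it exits.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def crashed_with_segfault(result: subprocess.CompletedProcess) -> bool:
    # On POSIX, a child killed by a signal reports the negated signal number.
    return result.returncode == -signal.SIGSEGV


if __name__ == "__main__":
    # Hypothetical stand-in for the failing call: the child sends SIGSEGV to itself.
    # Replace this string with the actual reproduction script.
    snippet = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    result = run_isolated(snippet)
    print("segfault:", crashed_with_segfault(result))
    print(result.stderr)  # traceback written by faulthandler
```

The traceback in stderr should end at the last Python frame entered before the native crash, which is how a line such as the one above can be identified without attaching a debugger.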

> Can you try the global clustering functionality instead of the one built into the Index Python class?

Could you please suggest how to do this?
some_index.cluster() gives quite good results, but I don't understand how to use the Clustering class, especially how to form batch_matches.
