
Allow installation without transformers #2252

Open
James-Leslie opened this issue Dec 23, 2024 · 5 comments
@James-Leslie

Feature request

Would it be possible to include an installation option where sentence-transformers is not required?
Something like bertopic[no-backend]

Motivation

For my use case I use the OpenAI embedding models, which rely on the openai package and (as far as I can tell) don't require transformers or sentence-transformers.

Your contribution

If approved, I can raise a PR

@MaartenGr
Owner

Thank you for the request! I explored this a while ago but couldn't find a way to make it work. As I remember, it isn't possible for pip to do something like pip install bertopic[no-backend], since optional dependencies only add on to the base set. So if pip install bertopic includes sentence-transformers, then pip install bertopic[no-backend] by definition will too.
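For context, pip extras are purely additive: an extra can add packages on top of the base requirements, but it cannot subtract from them. A hypothetical pyproject.toml fragment (not BERTopic's actual metadata) illustrates why a no-backend extra can't work:

```toml
# Hypothetical sketch, not BERTopic's real packaging metadata.
[project]
name = "bertopic"
dependencies = [
    "sentence-transformers",  # part of the base install
]

[project.optional-dependencies]
# An extra can only ADD to `dependencies`; there is no syntax here
# to remove sentence-transformers, so `pip install bertopic[no-backend]`
# would still pull it in.
no-backend = []
```

The only way to make the base install lighter is to move sentence-transformers out of `dependencies` and into an extra, which is a breaking change for existing users.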

If it would be possible now with new pip versions, then that would be perfect!

In practice, I would actually prefer to replace sentence-transformers with model2vec as a lightweight installation, and potentially even drop HDBSCAN in favour of scikit-learn's implementation.

In other words, could you explain a bit what your suggested approach is?

@JamesLeslieAT

Hi Maarten,

Thanks so much for the reply. Ah yes, that makes sense. I guess the only way to achieve the behaviour I am after would be to remove sentence-transformers from the main bertopic dependencies and create an optional bertopic[sentence-transformers] extra.

The issue then is that the bertopic install on its own wouldn't contain everything needed to run the quickstart, unless users specify their backend of choice.

Coming back to my own use case: I use pre-computed OpenAI embeddings and an openai backend for optional zero-shot modelling, so I am looking for the most lightweight way of installing the bertopic package. sentence-transformers and torch take up quite a bit of space in my virtual environments.

@JamesLeslieAT

I hadn't heard of it before, but model2vec looks really cool!
I would be all in for that being the main backend, with sentence-transformers being an optional version.

@MaartenGr
Owner

Because that would require breaking changes to BERTopic and could impact many users, I'm still a bit hesitant to implement it in the next release. Perhaps that will change in future releases, but given the impact it all becomes a bit more complicated.

Also note that you can already install a lightweight version of BERTopic, as shown in the docs here: https://maartengr.github.io/BERTopic/getting_started/tips_and_tricks/tips_and_tricks.html#lightweight-installation

@JamesLeslieAT

Amazing! Thanks so much and sorry I didn't come across that section before.
