How to Load Any HuggingFace Model in spaCy #10768
polm started this conversation in Help: Best practices
spaCy's wrapper for the HuggingFace Transformers library allows you to specify any model on the HuggingFace Hub. The model will automatically be downloaded the first time it's used and wired up in the pipeline as a feature source. This post will show you how to do that in the config file or in code. Before we get started, though, it's important to keep in mind the limitations of HuggingFace models in spaCy - in particular, the sample configs default to roberta-base (even if you're not working in English), so you'll usually want to change the model.

Using the Config
In the config, specifying an arbitrary model is easy. Your config should have a section like [components.transformer.model]. The name parameter can be the name of any model on the HuggingFace Hub, or a local path, so you can use a different model just by changing name. Note that if you have any other components that rely on your Transformer, you will need to re-train your pipeline after doing this - you can't just change the name and run inference again.
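For reference, with a recent spacy-transformers release the default [components.transformer.model] block looks roughly like this (the exact keys and defaults vary by version, so check the config your template generates):

```ini
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"
mixed_precision = false

[components.transformer.model.tokenizer_config]
use_fast = true
```

To use, say, a German BERT instead, you would only change the name value - for example, name = "bert-base-german-cased".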
Using Code

Using code to load an arbitrary Transformer isn't complicated, but it does require a little more care than modifying the config. The basic steps are the same as for other components in spaCy that use the initialize step to load external data: add the transformer component configured with your model name, then call nlp.initialize() to download and load it.
After the model is loaded during the initialize step, the transformer name and transformer/tokenizer settings provided by the config are not used again. The full transformer weights and configs are saved in an internal Thinc format when you call nlp.to_disk. Strictly following the spaCy config design, these settings would belong in the [initialize] block instead of the [components] block. This is a design we would like to change but are keeping for backwards compatibility reasons; for more details see #10613 and #10579.