Scaling spaCy models to N user discourses #8932

metalaureate · 2021-08-11T14:02:31Z

Hi! Most spaCy examples seem to imply a single NLP model that is built up over time for a single domain. In my scenario, each user writes a journal, has their own discourse, and needs their own spaCy model retrained daily. I work, as a volunteer, on a personal diary / journal app where writers can create 'Small World' entity linkages between things that apply only to their small world and wouldn't normally be tagged as entities.

E.g., an author will sometimes refer to 'school' or 'work' in their vernacular, at other times by proper name ('Peabody' or 'Acme'), and at other times as 'my high school', 'Peabody High School', or 'Acme Inc'.

Are there any best practices for managing the loading and unloading of spaCy models for thousands of online editors, each with their own evolving language model?

metalaureate · 2021-08-24T13:15:09Z

I figured out how to do this by realizing (by reading the docs!) that you can serialize pipeline components individually, which greatly reduces the overhead of maintaining and mounting one NER and NEL pipeline per author.
