Multiple NER models with shared transformer #7412
-
I'm trying to solve a similar problem. The training and dev data in this project example is just a JSON file of example sentences with the index/offset of each entity and its entity class. I would think a single base transformer would be specified in the config.
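Something like this is what I have in mind for the data, assuming each record looks like `{"text": ..., "entities": [[start, end, label], ...]}` (the field names are just my guess):

```python
import json
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
doc_bin = DocBin()
with open("train.json") as f:
    records = json.load(f)
for record in records:
    doc = nlp.make_doc(record["text"])
    spans = [
        doc.char_span(start, end, label=label)
        for start, end, label in record["entities"]
    ]
    # char_span returns None when offsets don't align with token boundaries
    doc.ents = [span for span in spans if span is not None]
    doc_bin.add(doc)
doc_bin.to_disk("train.spacy")
```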
-
If you don't have merged/joint data with all your entity types to train from, or you have overlapping entity types, then training one NER with the transformer and then freezing the transformer sounds like a sensible approach, since the other options will degrade the performance of the first NER component.

In terms of prediction, the built-in NER component is designed to preserve any existing entity spans, so it will only modify tokens with unset or missing entity annotations. If you want each NER component to run on a clean slate, you'd want to write a custom component, inserted between the NER components, that copies the predicted entities to a custom attribute and then clears `doc.ents`.
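A minimal sketch of such a component (the factory name `stash_ents` and the extension `prev_ents` are made up for the example, not spaCy built-ins):

```python
from spacy.language import Language
from spacy.tokens import Doc

# Hypothetical custom attribute for stashing entities between NER components
if not Doc.has_extension("prev_ents"):
    Doc.set_extension("prev_ents", default=[])

@Language.factory("stash_ents")
def create_stash_ents(nlp, name):
    def stash_ents(doc: Doc) -> Doc:
        # Keep what the previous NER component predicted...
        doc._.prev_ents = list(doc._.prev_ents) + list(doc.ents)
        # ...then mark every token's entity annotation as missing, so the
        # next NER component starts from a clean slate instead of
        # preserving the existing spans.
        doc.set_ents([], default="missing")
        return doc
    return stash_ents
```

It would then be inserted after each NER component except the last, e.g. `nlp.add_pipe("stash_ents", after="ner_a")`.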
-
Hi,
Hope you can give me some pointers on how to properly handle my use case.
I'm currently using 4+ different NER models, each trained as an independent pipeline with its own transformer. But they are all used on the same domain and texts, so I'm thinking they should really be sharing the transformer layer.
Can you give me some general pointers on how I should approach the training? What I'm thinking is (rough config sketch below):
1. Train the first NER component together with the transformer on its dataset.
2. For each remaining NER, start a new training run that sources the trained transformer, freezes it, and trains only the NER on top of it.

Is this the right approach?
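Concretely, for step 2 I imagine a config along these lines (paths and component names are placeholders; this is just my guess at the mechanism):

```ini
# Second training run: reuse and freeze the transformer from the first run
[components.transformer]
source = "training/first_ner/model-best"

[components.ner_b]
factory = "ner"

[training]
# Don't update the transformer's weights...
frozen_components = ["transformer"]
# ...but still run it during training so the NER listener gets its output
annotating_components = ["transformer"]
```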
A second question is: OK, I have trained my pipeline, which now has multiple NER components, but how do I use it? How will the entity labels work when there are multiple NERs in the pipeline?
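For example, is something like this the intended way to combine them, with all labels ending up in `doc.ents`? (The model paths are placeholders.)

```python
import spacy

# Placeholder pipeline names; both were trained around the same transformer
nlp = spacy.load("model_a")                      # transformer + first NER
nlp_b = spacy.load("model_b")
nlp.add_pipe("ner", name="ner_b", source=nlp_b)  # copy the second NER over

doc = nlp("Some text from our shared domain.")
# Each NER adds its own labels; later components preserve existing spans
print([(ent.text, ent.label_) for ent in doc.ents])
```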
Thanks in advance for any kind of feedback.