These custom values should be passed to the tokens matched by the call matches = matcher(doc), so that tokens can be distinguished by the pattern that matched them, like so: doc[n]._.exclude == True
This would cover multiple cases that were previously hard or impossible to solve with the spaCy Matcher:
Matching by preceding tokens
Matching by following tokens
Matching a complex pattern of tokens that appear in a constellation, in order to tag them separately
Cascading matches, where you tag items and then match again, relying on previously tagged entities without overwriting them
Other potential cases that I did not think of, but that others could come up with, that would benefit from being able to pass data this way
Thank you for the awesome library – this addition would make it awesome-awesome :)
P.S. Extra credit :)
If we could do matches[n].tokens, it would be triple awesome
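As a point of comparison for matches[n].tokens: the spaCy v3 Matcher accepts as_spans=True and then returns Span objects instead of (match_id, start, end) tuples, and a Span can be iterated to get the matched tokens. A small sketch follows; the FAX pattern and the hand-built Doc are illustrative assumptions, not taken from the request.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")  # blank pipeline, no trained model required

# Build the Doc from pre-split words so tokenization is not a factor here.
doc = Doc(nlp.vocab, words=["Fax", ":", "5559876"])

matcher = Matcher(nlp.vocab)
# Illustrative pattern: "fax", a colon, then one all-digit token.
matcher.add("FAX", [[{"LOWER": "fax"}, {"ORTH": ":"}, {"IS_DIGIT": True}]])

# With as_spans=True each match is a Span whose label_ is the pattern key.
spans = matcher(doc, as_spans=True)
first = spans[0]
tokens = [token.text for token in first]
print(first.label_, tokens)
```

This covers reading the tokens of a match, but not attaching custom values to them from the pattern definition, which is the main point of the request.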
apodgorny changed the title from "Feature Request: Add PATTERN_ID option in Matcher pattern definitions" to "Feature Request: Pass custom values from Matcher to matched tokens" on Jun 9, 2024
apodgorny changed the title from "Feature Request: Pass custom values from Matcher to matched tokens" to "Feature Request: Pass custom values from Matcher pattern definitions to matched tokens" on Jun 9, 2024
Discussed in #13519
Originally posted by apodgorny June 5, 2024
Consider a case where I need to tag FAX and TEL separately.
I currently have two options for NER with Matcher:
[{'LOWER': 'tel'}, {'ORTH': ':'}, {PATTERN_TO_MATCH_PHONE}]
[{PATTERN_TO_MATCH_PHONE}]
Neither case accomplishes the goal
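For reference, here is a minimal sketch of how close the existing API gets: giving each pattern its own key lets the match ID tell TEL and FAX apart, and a custom token extension can then carry that label. It assumes spaCy v3. The extension name matched_by and the single all-digit token used as the phone pattern are hypothetical stand-ins (the latter for the elided {PATTERN_TO_MATCH_PHONE}), and the Doc is built from pre-split words so that tokenization does not affect the result.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc, Token

# Hypothetical custom attribute recording which pattern key matched a token.
Token.set_extension("matched_by", default=None, force=True)

nlp = spacy.blank("en")  # blank pipeline, no trained model required

# Assumption: a phone number is one all-digit token (stand-in pattern).
PHONE = {"IS_DIGIT": True}

matcher = Matcher(nlp.vocab)
matcher.add("TEL", [[{"LOWER": "tel"}, {"ORTH": ":"}, PHONE]])
matcher.add("FAX", [[{"LOWER": "fax"}, {"ORTH": ":"}, PHONE]])

doc = Doc(nlp.vocab, words=["Tel", ":", "5551234", "Fax", ":", "5559876"])

# Each match is (match_id, start, end); the ID maps back to the pattern key.
for match_id, start, end in matcher(doc):
    key = nlp.vocab.strings[match_id]
    # Tag only the number token, which is the last token of each match.
    doc[end - 1]._.matched_by = key

tagged = {token.text: token._.matched_by for token in doc if token._.matched_by}
print(tagged)
```

This still needs one key per pattern and a manual loop over the matches, which is the step the request asks the Matcher to handle from the pattern definition itself.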