Adding new boxes to the FST-Text would allow the user to split a token. This is especially useful when analyzing social media text such as tweets, where people often omit the space character between words.
The new box must respect the bounds described in the manual (see 14.5 Text Automaton, p. 315). Also, if the text automaton is valid, the bounds of the other boxes should be updated if needed.
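To make the request more concrete, here is a minimal sketch of what splitting one box into two could look like. It is not Unitex code: the `Box` class, its fields, and `split_box` are hypothetical names, and the bounds are simplified to a token index plus inclusive character offsets inside that token. The manual's section 14.5 remains the authoritative description of the real bounds format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    # Hypothetical, simplified model of a text automaton box.
    content: str  # text covered by the box
    token: int    # index of the token the box belongs to
    start: int    # offset of the first covered character inside the token
    end: int      # offset of the last covered character (inclusive)


def split_box(box: Box, at: int) -> List[Box]:
    """Split `box` in two; `at` is the number of characters kept on the left."""
    if not 0 < at < len(box.content):
        raise ValueError("split point must fall strictly inside the box")
    # The left box keeps the original start; the right box keeps the original end.
    left = Box(box.content[:at], box.token, box.start, box.start + at - 1)
    right = Box(box.content[at:], box.token, box.start + at, box.end)
    return [left, right]


# A tweet-style token where the space between two words was omitted.
boxes = split_box(Box("goodmorning", token=3, start=0, end=10), at=4)
for b in boxes:
    print(b)
```

Both resulting boxes stay inside the original token, and together they cover exactly the same character range as the box they replace, so neighbouring boxes would only need new bounds if they overlapped the split box.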
Maybe the following observation is useful for this issue.
A token can be composed of several boxes representing agglutinated segments (morphemes) within the same token; in this case, the paths between the segments are drawn as dashed lines.
Take varowaAti_haA as an example in Arabic:
"fiy varowaAti_haA"
{fiy,.PREP} _____________ {varowaAti,varowap.N:fpaG} -----------{haA,hu.PRO+Ppers:3fs}
fiy ___________________ varowaAti ----------------------------hA