Bug in construction of POS list #110

Open
mthouv opened this issue Jun 25, 2018 · 0 comments

mthouv commented Jun 25, 2018

The Table panel of the 'FST-text' window contains an 'Export all text as POS list' button that converts the text automaton into a 'POS list' format. This format is a regular expression equivalent to the automaton of the sentence. However, there seem to be some inconsistencies in the way the POS list file is created when it comes to compound words.

What steps will reproduce the problem?

Open the attached French text and click the "Export all text as POS list" button.
80jours.zip

Go to line 217 of the resulting file and look for the POS sequences corresponding to à cause de.

What is the expected output?

We should see "à/à/PREP/z1" followed by "(cause/cause/N/z1:fs|cause/causer/V/z1:P1s:P3s:S1s:S3s:Y2s)"

What do you see instead?

On line 217, the sequences are displayed as follows:

"à/à/PREP/z1 à cause/à cause/ADV/PCDN/z1 (à cause de/à cause de/PREP/EPCPQ/z1|à cause de/à cause de/PREP/PCDN/z1|à cause de/à cause de/PREP/PCDN1/z1|à cause de/à cause/PREP/Prépconjs/1) (cause/cause/N/z1:fs|cause/causer/V/z1:P1s:P3s:S1s:S3s:Y2s)"

This regular expression corresponds to the following text: à à cause à cause de cause.
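
To make the mismatch concrete, here is a minimal Python sketch (not part of Unitex/GramLab) that models the exported line 217 as a concatenation of alternation groups and lists the surface strings it accepts. It assumes each entry reads form/lemma/tags, which is only an interpretation of the examples above; `surface` and `surface_strings` are hypothetical helper names.

```python
from itertools import product

# The observed line 217: the outer list is concatenation,
# each inner list is one "(a|b|...)" alternation group.
observed = [
    ["à/à/PREP/z1"],
    ["à cause/à cause/ADV/PCDN/z1"],
    [
        "à cause de/à cause de/PREP/EPCPQ/z1",
        "à cause de/à cause de/PREP/PCDN/z1",
        "à cause de/à cause de/PREP/PCDN1/z1",
        "à cause de/à cause/PREP/Prépconjs/1",
    ],
    [
        "cause/cause/N/z1:fs",
        "cause/causer/V/z1:P1s:P3s:S1s:S3s:Y2s",
    ],
]


def surface(entry):
    """Return the inflected form, assumed to be the field before the first '/'."""
    return entry.split("/", 1)[0]


def surface_strings(groups):
    """Enumerate every surface string accepted by a concatenation of groups."""
    return {" ".join(surface(e) for e in path) for path in product(*groups)}


print(surface_strings(observed))
```

Because the simple and compound readings are concatenated instead of being offered as alternative paths, the only string this expression accepts is the one quoted above, not the original à cause de.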

mthouv added a commit to mthouv/gramlab-ide that referenced this issue Sep 13, 2018
Implementation of Kleene's algorithm in order to transform an automaton into a regular expression
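
For readers unfamiliar with the technique named in that commit, the following is a generic textbook sketch of Kleene's algorithm in Python. It is not the code from the commit, which is not shown here. The function name `kleene_regex`, the single-letter labels and the toy automaton are all assumptions made for illustration: `a`, `c`, `d` stand for the simple words à, cause, de, while `X` and `Y` stand for the compounds à cause and à cause de.

```python
import re

EMPTY = None  # the empty language (no path)
EPS = ""      # the language containing only the empty string


def union(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return a if a == b else f"({a}|{b})"


def concat(a, b):
    return EMPTY if a is EMPTY or b is EMPTY else a + b


def star(a):
    return EPS if a is EMPTY or a == EPS else f"({a})*"


def kleene_regex(n, edges, start, finals):
    """Build a regex for an automaton with states 0..n-1.

    edges is a list of (source, label, target) triples.
    """
    # R[i][j] describes paths from i to j with no intermediate state yet.
    R = [[EPS if i == j else EMPTY for j in range(n)] for i in range(n)]
    for src, label, dst in edges:
        R[src][dst] = union(R[src][dst], label)
    # Allow state k as an intermediate state, one k at a time.
    for k in range(n):
        R = [
            [
                union(R[i][j], concat(concat(R[i][k], star(R[k][k])), R[k][j]))
                for j in range(n)
            ]
            for i in range(n)
        ]
    result = EMPTY
    for f in finals:
        result = union(result, R[start][f])
    return result


# Toy text automaton: simple words and compounds are parallel paths.
edges = [(0, "a", 1), (1, "c", 2), (2, "d", 3), (0, "X", 2), (0, "Y", 3)]
regex = kleene_regex(4, edges, start=0, finals=[3])
print(regex)
print(bool(re.fullmatch(regex, "acd")), bool(re.fullmatch(regex, "aXYc")))
```

In this sketch the compound readings end up as alternatives to the simple-word path, so the flat concatenation seen on line 217 (here `aXYc`) is rejected while `acd`, `Xd` and `Y` are accepted.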