Commit 0772649

Accept SUGGEST instead of &SUGGEST
also SUGGESTWF instead of &SUGGESTWF cf. #73
1 parent e434139 commit 0772649

34 files changed: +577 -577 lines changed

README.md

+30-30
@@ -708,11 +708,11 @@ interface.
708708

709709
We can add a suggestion as well with a `COPY` rule:
710710

711-
COPY:msyn-hallan (Inf &SUGGEST) EXCEPT (Imprt Pl1 Dial/-KJ) TARGET (Imprt Pl1 Dial/-KJ &real-hallan) ;
711+
COPY:msyn-hallan (Inf SUGGEST) EXCEPT (Imprt Pl1 Dial/-KJ) TARGET (Imprt Pl1 Dial/-KJ &real-hallan) ;
712712

713713
This creates a new reading where the tags `Imprt Pl1 Dial/-KJ` have
714-
been changed into `Inf &SUGGEST` (and other tags are unchanged). The
715-
`&SUGGEST` tag is necessary to get `divvun-suggest` (the `<suggest>`
714+
been changed into `Inf SUGGEST` (and other tags are unchanged). The
715+
`SUGGEST` tag is necessary to get `divvun-suggest` (the `<suggest>`
716716
module) to try to generate a form from that reading. It is smart
717717
enough to skip things like weights, tracing and syntax tags when
718718
trying to suggest, but all morphological tags need to be correct and
@@ -824,7 +824,7 @@ instead of deleting the word "dego" to the left, we should change the
824824
case of the word "lávvomuorran" from essive to nominative case:
825825

826826
ADD (&syn-dego-nom) TARGET Ess IF (-1 ("dego"));
827-
COPY (Sg Nom &SUGGEST) EXCEPT (Ess) TARGET (&syn-dego-nom) ;
827+
COPY (Sg Nom SUGGEST) EXCEPT (Ess) TARGET (&syn-dego-nom) ;
828828

829829
Here we want to keep the suggestions for `&syn-dego-nom` separate from
830830
the suggestions for `&syn-not-dego` – in particular, we don't want to
@@ -836,12 +836,12 @@ same time. But if we use the above rules, CG gives us this output:
836836
:
837837
"<lávvomuorran>"
838838
"lávvomuorra" N Ess @COMP-CS< &syn-not-dego ID:12 R:DELETE1:11
839-
"lávvomuorra" N Sg Nom @COMP-CS< &syn-dego-nom ID:12 R:DELETE1:11 &SUGGEST
839+
"lávvomuorra" N Sg Nom @COMP-CS< &syn-dego-nom ID:12 R:DELETE1:11 SUGGEST
840840

841841
Notice how the DELETE relation is on both readings, and also how
842842
the relation target id (`11`) refers to a cohort, not a reading of a
843843
cohort. There is no way from this output to know that "dego" should
844-
not also be deleted from the `&SUGGEST` reading.
844+
not also be deleted from the `SUGGEST` reading.
845845

846846
So when there are such multiple alternative interpretations for errors
847847
spanning multiple words, the less central parts ("dego" above) need a
@@ -873,7 +873,7 @@ changed to "boahtit" (infinitive). Alternatively, only the first part
873873
is changed and the second part remains unchanged. In this case we can
874874
change the "soaitá" (3.Sg.) to the adverb "kánske".
875875

876-
As usual, this requires `&SUGGEST` readings for the parts that are to
876+
As usual, this requires `SUGGEST` readings for the parts that are to
877877
be changed, and one unique error tag for each interpretation, i.e.
878878
`&msyn-kánske` for the "Kánske boađán" correction and
879879
`&msyn-fin_fin-fin_inf` for the "Soaittán boahtit" correction.
@@ -966,7 +966,7 @@ Then you can first of all turn that blanktag tag into an error tag with
966966

967967
Now, we could just suggest a wordform on the comma and call it a day:
968968

969-
COPY ("<, >" &SUGGESTWF) TARGET ("," &no-space-after-punct-mark) ;
969+
COPY ("<, >" SUGGESTWF) TARGET ("," &no-space-after-punct-mark) ;
970970

971971
but that will
972972

@@ -996,7 +996,7 @@ word is a "link" word. In the above rules,
996996

997997
Then we can add a suggestion that puts a space between the forms:
998998

999-
COPY:no-space-after-punct ("<$1 $2>"v &SUGGESTWF)
999+
COPY:no-space-after-punct ("<$1 $2>"v SUGGESTWF)
10001000
TARGET ("<(.*)>"r &no-space-after-punct-mark)
10011001
IF (1 ("<(.*)>"r))
10021002
(NOT 0 (co&no-space-after-punct-mark))
@@ -1014,7 +1014,7 @@ We don't put a suggestion-tag on the `co&` cohort (here the word
10141014
`<ja>`), which would lead to some strange suggestions since it is
10151015
already part of the suggestion-tag on the comma `<,>` cohort. See
10161016
[How underlines and replacements are built](#orgb25740d) for more
1017-
on the relationship between `&SUGGESTWF` and replacements.
1017+
on the relationship between `SUGGESTWF` and replacements.
10181018

10191019
Now the output is
10201020

@@ -1024,7 +1024,7 @@ Now the output is
10241024
"3" Num Arab Sg Ill Attr @HNOUN
10251025
"<,>"
10261026
"," CLB <NoSpaceAfterPunctMark> &no-space-after-punct-mark ID:3 R:RIGHT:4
1027-
"," CLB <NoSpaceAfterPunctMark> "<, ja>" &no-space-after-punct-mark &SUGGESTWF ID:3 R:RIGHT:4
1027+
"," CLB <NoSpaceAfterPunctMark> "<, ja>" &no-space-after-punct-mark SUGGESTWF ID:3 R:RIGHT:4
10281028
"<ja>"
10291029
"ja" CC @CNP co&no-space-after-punct-mark ID:4
10301030

@@ -1126,17 +1126,17 @@ Note that the readings added by the speller don't include any error
11261126
tags (tags with `&` in front). To turn these readings into error
11271127
underlines and actually show the suggestions, add a rule like
11281128

1129-
ADD (&typo &SUGGESTWF) (<spelled>) ;
1129+
ADD (&typo SUGGESTWF) (<spelled>) ;
11301130

1131-
to the grammar checker CG. The reason we add `&SUGGESTWF` and not
1132-
`&SUGGEST` is that we're using the wordform-tag directly as the
1131+
to the grammar checker CG. The reason we add `SUGGESTWF` and not
1132+
`SUGGEST` is that we're using the wordform-tag directly as the
11331133
suggestion, and not sending each analysis through the generator (as
1134-
`&SUGGEST` would do). See also the next section on how replacements
1134+
`SUGGEST` would do). See also the next section on how replacements
11351135
are built. So if, after disambiguation and grammarchecker CG's, we had
11361136

11371137
"<coffes>"
1138-
"coffee" N Pl <W:37.3018> <WA:17.3018> <spelled> "<coffees>" &typo &SUGGESTWF
1139-
"coffer" N Pl <W:39.1010> <WA:17.3018> <spelled> "<coffers>" &typo &SUGGESTWF
1138+
"coffee" N Pl <W:37.3018> <WA:17.3018> <spelled> "<coffees>" &typo SUGGESTWF
1139+
"coffer" N Pl <W:39.1010> <WA:17.3018> <spelled> "<coffers>" &typo SUGGESTWF
11401140

11411141
then the final `divvun-suggest` step would simply use the contents of
11421142
the tags
@@ -1181,38 +1181,38 @@ different parts of the error](#orge26043f) for more info on this.
11811181
By default, *a cohort's word form is used to construct the
11821182
replacement*. So if we have the sentence "we was" where "was" is
11831183
**central** and tagged `&typo`, and there's a `LEFT` relation to "we",
1184-
then the default replacement if there were no `&SUGGEST` tags would
1184+
then the default replacement if there were no `SUGGEST` tags would
11851185
simply be the input "we was" (which would be filtered out since it's
11861186
equal, giving no suggestions).
11871187

1188-
If we now add a `&SUGGEST` reading on "we" that generates "he" then we
1189-
get a "he was" suggestion. `&SUGGEST` readings with matching
1188+
If we now add a `SUGGEST` reading on "we" that generates "he" then we
1189+
get a "he was" suggestion. `SUGGEST` readings with matching
11901190
(co-)error tags are prioritised over input word form.
11911191

1192-
If we also have a `&SUGGEST` for was→are for the possible replacement
1192+
If we also have a `SUGGEST` for was→are for the possible replacement
11931193
"we are" (tagged `&agr`) – now we don't want both of these to apply at
11941194
the same time giving *"we is". In this case, we need to ensure we have
1195-
disambiguating `co&errtype` tags on the `&SUGGEST` readings. The
1195+
disambiguating `co&errtype` tags on the `SUGGEST` readings. The
11961196
following CG parse:
11971197

11981198
"<we>"
11991199
"we" Prn &agr ID:1 R:RIGHT:2
1200-
"he" Prn &SUGGEST co&agr-typo ID:1 R:RIGHT:2
1200+
"he" Prn SUGGEST co&agr-typo ID:1 R:RIGHT:2
12011201
:
12021202
"<was>"
12031203
"be" V 3Sg &agr-typo ID:2 R:LEFT:1
1204-
"be" V 3Pl co&agr &SUGGEST ID:2 R:LEFT:1
1204+
"be" V 3Pl co&agr SUGGEST ID:2 R:LEFT:1
12051205

12061206
will give us all and only the suggestions we want ("he was" and "we
12071207
were", but not *"he were").
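The selection rule described above can be sketched in C++. This is only an illustration under stated assumptions, not the code in `src/suggest.cpp`: the types `SketchReading` and `SketchCohort` and the function `build_replacement` are hypothetical, error and co-error types are stored with their `&` / `co&` prefixes stripped, and the `generated` field stands in for whatever the generator FST would return for a `SUGGEST` reading. For each cohort in the underline, a `SUGGEST` reading whose (co-)error type matches wins over the input word form.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified reading: not the Reading struct from suggest.hpp.
struct SketchReading {
    std::vector<std::string> errtypes; // error or co-error types, prefix stripped
    bool suggest;                      // true if the reading carries SUGGEST
    std::string generated;             // stand-in for the generator FST output
};

// Hypothetical, simplified cohort: the input word form plus its readings.
struct SketchCohort {
    std::string wordform;
    std::vector<SketchReading> readings;
};

// Build one replacement string for error type `err` over the cohorts of an
// underline. A SUGGEST reading with a matching (co-)error type is preferred;
// otherwise the input word form is kept. Only the first match per cohort is
// used here, which is a simplification.
std::string build_replacement(const std::vector<SketchCohort>& underline,
                              const std::string& err) {
    assert(!err.empty()); // an empty error type could never match a reading
    std::string out;
    for (const SketchCohort& c : underline) {
        std::string form = c.wordform;
        for (const SketchReading& r : c.readings) {
            const bool matches =
                std::find(r.errtypes.begin(), r.errtypes.end(), err) != r.errtypes.end();
            if (r.suggest && matches) {
                form = r.generated;
                break;
            }
        }
        if (!out.empty()) {
            out += ' ';
        }
        out += form;
    }
    return out;
}
```

Fed the "we was" parse above, with "he" and "were" as the assumed generator outputs, this sketch yields "we were" for `agr` and "he was" for `agr-typo`, and never mixes the two.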
12081208

12091209
There is one exception to the above principles; for
1210-
backwards-compatibility, `&SUGGESTWF` is still used to mean that the
1211-
whole underline should be replaced by what's in `&SUGGESTWF`. This
1212-
means that if you combine `&SUGGESTWF` with `RIGHT/LEFT`, you will not
1210+
backwards-compatibility, `SUGGESTWF` is still used to mean that the
1211+
whole underline should be replaced by what's in `SUGGESTWF`. This
1212+
means that if you combine `SUGGESTWF` with `RIGHT/LEFT`, you will not
12131213
automatically get the word form for the relation target(s) in your
12141214
replacement, you have to construct the whole replacement yourself.
1215-
This also means you cannot combine `&SUGGESTWF` with `&SUGGEST` on
1215+
This also means you cannot combine `SUGGESTWF` with `SUGGEST` on
12161216
other words. (If we ever change how this works, we will have to first
12171217
update many existing CG3 rules.)
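To make the `SUGGESTWF` behaviour concrete, here is a minimal C++ sketch of pulling the replacement text out of a wordform-tag. The helper name `wordform_from_tag` is hypothetical and is not taken from `divvun-suggest`; the sketch assumes the tag has already been isolated from the reading line (which is not trivial, since a tag like `"<, ja>"` contains a space).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: if `tag` has the shape "<...>" (including the double
// quotes), store the text between the delimiters in `out` and return true.
// Any other tag leaves `out` untouched and returns false.
bool wordform_from_tag(const std::string& tag, std::string& out) {
    const std::string open = "\"<";
    const std::string close = ">\"";
    if (tag.size() < open.size() + close.size()) {
        return false;
    }
    if (tag.compare(0, open.size(), open) != 0) {
        return false;
    }
    if (tag.compare(tag.size() - close.size(), close.size(), close) != 0) {
        return false;
    }
    out = tag.substr(open.size(), tag.size() - open.size() - close.size());
    // Invariant: exactly the two-character delimiters were removed on each side.
    assert(out.size() + open.size() + close.size() == tag.size());
    return true;
}
```

With this helper, the tag `"<, ja>"` gives the text `, ja`, which under the rule above would replace the whole underline rather than being combined with the word forms of relation targets.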
12181218

@@ -1233,10 +1233,10 @@ don't conflict with the below special tags.
12331233

12341234
### Tags
12351235

1236-
- `&SUGGEST` on a reading means that `divvun-suggest` should try to
1236+
- `SUGGEST` on a reading means that `divvun-suggest` should try to
12371237
generate this reading into a form for suggestions, using the
12381238
generator FST. See [Simple grammarchecker.cg3 rules](#org0955ce1).
1239-
- `&SUGGESTWF` on a reading means that `divvun-suggest` should use the
1239+
- `SUGGESTWF` on a reading means that `divvun-suggest` should use the
12401240
reading's wordform-tag (e.g. a tag like
12411241

12421242
"<Cupertino>"

src/suggest.cpp

+9-9
@@ -234,17 +234,17 @@ const Reading proc_subreading(const string& line, bool generate_all_readings) {
234234
if (tag == "COERROR") {
235235
r.coerror = true;
236236
}
237+
else if (tag == "&SUGGEST" || tag == "SUGGEST") { // &SUGGEST kept for backward-compatibility
238+
r.suggest = true;
239+
}
240+
else if (tag == "&SUGGESTWF" || tag == "SUGGESTWF") { // &SUGGESTWF kept for backward-compatibility
241+
r.suggestwf = true;
242+
}
237243
else if (result.empty()) {
238244
gentags.push_back(tag);
239245
}
240246
else if (result[2].length() != 0) {
241-
if (tag == "&SUGGEST") {
242-
r.suggest = true;
243-
}
244-
else if (tag == "&SUGGESTWF") {
245-
r.suggestwf = true;
246-
}
247-
else if (tag == "&ADDED" || tag == "&ADDED-AFTER-BLANK") {
247+
if (tag == "&ADDED" || tag == "&ADDED-AFTER-BLANK") {
248248
r.added = AddedAfterBlank;
249249
}
250250
else if (tag == "&ADDED-BEFORE-BLANK") {
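The effect of the hunk above can be shown in isolation. The following is a minimal sketch, not the actual `proc_subreading` code: `SuggestFlags` and `apply_suggest_tag` are hypothetical names standing in for the `r.suggest` / `r.suggestwf` fields and the tag comparison. It shows both spellings of each tag being accepted, with the `&`-prefixed forms kept for backward compatibility.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two flags that the hunk sets on a Reading.
struct SuggestFlags {
    bool suggest;
    bool suggestwf;
};

// Set the matching flag if `tag` is a suggestion tag in either spelling.
// Returns true when the tag was consumed, false for any other tag.
bool apply_suggest_tag(const std::string& tag, SuggestFlags& flags) {
    assert(!tag.empty()); // tags are assumed to come from a split that never yields ""
    if (tag == "&SUGGEST" || tag == "SUGGEST") {
        flags.suggest = true;
        return true;
    }
    if (tag == "&SUGGESTWF" || tag == "SUGGESTWF") {
        flags.suggestwf = true;
        return true;
    }
    return false;
}
```

In the real diff the comparison is moved ahead of the `result.empty()` branch. Presumably this is so that a bare `SUGGEST` is not pushed onto `gentags` as an ordinary tag, though that reading is an inference from the hunk rather than something the commit message states.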
@@ -612,7 +612,7 @@ if(verbose) std::cerr << "\t\033[0;35mr.suggest=" << tr.suggest << "\033[0m" <
612612
reps_suggestwf.push_back(fromUtf8(withCasing(tr.fixedcase, casing, sf)));
613613
}
614614
else {
615-
std::cerr << "divvun-suggest: WARNING: Saw &SUGGESTWF on non-central (co-)cohort, ignoring" << std::endl;
615+
std::cerr << "divvun-suggest: WARNING: Saw SUGGESTWF on non-central (co-)cohort, ignoring" << std::endl;
616616
}
617617
}
618618
if(verbose) std::cerr << "\t\t\033[1;36msform=\t'" << sf << "'\033[0m" << std::endl;
@@ -719,7 +719,7 @@ variant<Nothing, Err> Suggest::cohort_errs(const ErrId& err_id, size_t i_c,
719719
UStringVector rep;
720720
for (const Reading& r : c.readings) {
721721
if(r.errtypes.find(err_id) == r.errtypes.end()) {
722-
continue; // We consider sforms of &SUGGEST readings in build_squiggle_replacement
722+
continue; // We consider sforms of SUGGEST readings in build_squiggle_replacement
723723
}
724724
// If there are LEFT/RIGHT added relations, add suggestions with those concatenated to our form
725725
// TODO: What about our current suggestions of the same error tag? Currently just using wordform

src/suggest.hpp

+1-1
@@ -174,7 +174,7 @@ struct Reading {
174174
StringVector sforms;
175175
relations rels; // rels[relname] = target.id
176176
rel_id id = 0; // id is 0 if unset, otherwise the relation id of this word
177-
string wf; // tag of type "wordform"S for use with &SUGGESTWF
177+
string wf; // tag of type "wordform"S for use with SUGGESTWF
178178
bool suggestwf = false;
179179
bool coerror = false; // cohorts that are not the "core" of the underline never become Err's; message template offsets refer to the cohort of the Err
180180
Added added = NotAdded;
