Ignore prepositional phrases (PP), at least while removing "and".
1. A library issues books and loans to students.
(ROOT
(S
(NP (DT A) (NN library))
(VP (VBZ issues)
(NP (NNS books)
(CC and)
(NNS loans))
(PP (TO to)
(NP (NNS students))))
(. .)))
== Here, the "and" relation is between "books" and "loans". Since both are immediately beside the "and", do the following:
Go to the CC node. Copy the whole tree onto a new parse tree, leaving out the CC itself and its parent's other children to the left of the CC.
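The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the tree is read from its bracketed text form rather than from the Stanford parser's Tree objects, and the helper names (parse, drop_left_of_cc, to_sentence) are hypothetical.

```python
import re

def parse(text):
    """Parse a bracketed parse tree into nested [label, children] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                              # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                              # skip ")"
        return [label, children]

    return node()

def drop_left_of_cc(tree):
    """Copy the tree, omitting each CC and the siblings to its left."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    labels = [None if isinstance(c, str) else c[0] for c in children]
    if "CC" in labels:
        children = children[labels.index("CC") + 1:]
    return [label, [drop_left_of_cc(c) for c in children]]

def to_sentence(tree):
    """Join the leaves of the tree back into a sentence."""
    def leaves(t):
        return [t] if isinstance(t, str) else [w for c in t[1] for w in leaves(c)]
    return " ".join(leaves(tree)).replace(" .", ".")

# Parse tree of example 1, copied from above.
EXAMPLE_1 = """(ROOT (S (NP (DT A) (NN library))
  (VP (VBZ issues) (NP (NNS books) (CC and) (NNS loans))
      (PP (TO to) (NP (NNS students))))
  (. .)))"""

result = to_sentence(drop_left_of_cc(parse(EXAMPLE_1)))
print(result)  # A library issues loans to students.
```

Dropping "books" and "and" leaves the second conjunct in place, so the new tree reads as a complete sentence about "loans".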
2. A library issues books to students and loans to students.
(ROOT
(S
(NP (DT A) (NN library))
(VP (VBZ issues)
(NP (NNS books))
(PP (TO to)
(NP (NNS students)
(CC and)
(NNS loans)))
(PP (TO to)
(NP (NNS students))))
(. .)))
== The parser gives the wrong relation in the above case. The relation should be between "books" and "loans", but it is shown between "students" and "loans".
Similarly, the rule does not work for "library issues loans to students and library issues books to students."
3. Issuing of books and loans to students is done by the library.
(ROOT
(S
(NP
(NP (NN Issuing))
(PP (IN of)
(NP
(NP (NNS books)
(CC and)
(NNS loans))
(PP (TO to)
(NP (NNS students))))))
(VP (VBZ is)
(VP (VBN done)
(PP (IN by)
(NP (DT the) (NN library)))))
(. .)))
== Since this sentence is in the passive voice, apply the exact opposite of the rule for the active voice.
That is, go to the CC node. Copy the whole tree onto a new parse tree, leaving out the CC itself and its parent's other children to the right of the CC.
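A minimal Python sketch of this mirrored rule, applied to the parse tree of example 3. It assumes the tree is available in its bracketed text form; the helper names (parse, drop_right_of_cc, to_sentence) are hypothetical and not part of the Stanford parser.

```python
import re

def parse(text):
    """Parse a bracketed parse tree into nested [label, children] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                              # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                              # skip ")"
        return [label, children]

    return node()

def drop_right_of_cc(tree):
    """Copy the tree, omitting each CC and the siblings to its right."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    labels = [None if isinstance(c, str) else c[0] for c in children]
    if "CC" in labels:
        children = children[:labels.index("CC")]
    return [label, [drop_right_of_cc(c) for c in children]]

def to_sentence(tree):
    """Join the leaves of the tree back into a sentence."""
    def leaves(t):
        return [t] if isinstance(t, str) else [w for c in t[1] for w in leaves(c)]
    return " ".join(leaves(tree)).replace(" .", ".")

# Parse tree of example 3, copied from above.
EXAMPLE_3 = """(ROOT (S
  (NP (NP (NN Issuing))
      (PP (IN of)
          (NP (NP (NNS books) (CC and) (NNS loans))
              (PP (TO to) (NP (NNS students))))))
  (VP (VBZ is) (VP (VBN done) (PP (IN by) (NP (DT the) (NN library)))))
  (. .)))"""

result = to_sentence(drop_right_of_cc(parse(EXAMPLE_3)))
print(result)  # Issuing of books to students is done by the library.
```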
4. Library issues books to students and issues loans to teachers.
(ROOT
(S
(NP (NNP Library))
(VP
(VP (VBZ issues)
(NP (NNS books))
(PP (TO to)
(NP (NNS students))))
(CC and)
(VP (VBZ issues)
(NP (NNS loans))
(PP (TO to)
(NP (NNS teachers)))))
(. .)))
== Since the "and" is followed by another verb phrase (starting with a VBZ), we only need to copy the initial noun phrase (i.e. the phrases before the phrase containing the CC) onto the new parse tree, followed by the verb phrase after the "and". The entire part before the "and" is kept as the other sentence.
NOTE: For the sentence "A library issues books to students and loans to teachers." (compare example 2), the Stanford parser gives the wrong "and" relation, between "students" and "loans", when it should be between "books" and "loans". However, it gives the correct relation for example 4 above.
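One possible way to express this rule in Python is shown below, using the parse tree of example 4. It is a sketch under assumptions: the tree is read from its bracketed text form, the function names are hypothetical, and the split is only attempted when the node right after the CC is a VP.

```python
import re

def parse(text):
    """Parse a bracketed parse tree into nested [label, children] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                              # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                              # skip ")"
        return [label, children]

    return node()

def leaves(tree):
    return [tree] if isinstance(tree, str) else [w for c in tree[1] for w in leaves(c)]

def split_vp_coordination(tree):
    """Return one tree per VP conjunct, or [tree] if no VP coordination is found."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    labels = [None if isinstance(c, str) else c[0] for c in children]
    if "CC" in labels and labels[labels.index("CC") + 1:][:1] == ["VP"]:
        # Keep the parent node, but with a single VP conjunct as its only child.
        return [[label, [c]] for c in children if c[0] == "VP"]
    for i, child in enumerate(children):
        variants = split_vp_coordination(child)
        if len(variants) > 1:
            # Everything outside the coordination (e.g. the subject NP) is shared.
            return [[label, children[:i] + [v] + children[i + 1:]] for v in variants]
    return [tree]

# Parse tree of example 4, copied from above.
EXAMPLE_4 = """(ROOT (S (NP (NNP Library))
  (VP (VP (VBZ issues) (NP (NNS books)) (PP (TO to) (NP (NNS students))))
      (CC and)
      (VP (VBZ issues) (NP (NNS loans)) (PP (TO to) (NP (NNS teachers)))))
  (. .)))"""

sentences = [" ".join(leaves(t)).replace(" .", ".")
             for t in split_vp_coordination(parse(EXAMPLE_4))]
print(sentences)
```

Each resulting tree shares the subject noun phrase and the final period, and contains exactly one of the two verb phrases.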
5. Library issues books to students and the school has three libraries.
(ROOT
(S
(NP
(NP (JJ Library) (NNS issues) (NNS books))
(PP (TO to)
(NP
(NP (NNS students))
(CC and)
(NP (DT the) (NN school)))))
(VP (VBZ has)
(NP (CD three) (NNS libraries)))
(. .)))
== Since the "and" is followed by a noun phrase (here, the subject of a second clause), split the above sentence directly at the "and".
** NOTE: The parser output above is also wrong: there should be a conj relation between "issues" and "has", but the parser shows a relation between "students" and "the school". However, code has already been written to handle this case.
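Because the parse tree is unreliable for this sentence, a direct split can work on the tagged words instead. The Python sketch below is one possible heuristic and is not the code mentioned in the note: it splits at the CC only when a finite verb tag follows it. The tag set FINITE_VERB_TAGS and the function name split_clauses are assumptions made for illustration; the word/tag pairs are read off the tree of example 5.

```python
# Penn Treebank tags treated as finite verbs (an assumption of this sketch).
FINITE_VERB_TAGS = {"VBZ", "VBP", "VBD"}

def split_clauses(tagged):
    """Split at the first CC if a finite verb follows it; otherwise return None."""
    tags = [tag for _, tag in tagged]
    if "CC" not in tags:
        return None
    i = tags.index("CC")
    right = [(w, t) for w, t in tagged[i + 1:] if t != "."]
    if not any(t in FINITE_VERB_TAGS for _, t in right):
        return None                      # no second clause after the "and"
    first = " ".join(w for w, _ in tagged[:i]) + "."
    second = " ".join(w for w, _ in right) + "."
    return first, second[0].upper() + second[1:]

# Word/tag pairs taken from the parse tree of example 5.
EXAMPLE_5 = [("Library", "JJ"), ("issues", "NNS"), ("books", "NNS"), ("to", "TO"),
             ("students", "NNS"), ("and", "CC"), ("the", "DT"), ("school", "NN"),
             ("has", "VBZ"), ("three", "CD"), ("libraries", "NNS"), (".", ".")]

# Word/tag pairs taken from the parse tree of example 1 (no second clause).
EXAMPLE_1 = [("A", "DT"), ("library", "NN"), ("issues", "VBZ"), ("books", "NNS"),
             ("and", "CC"), ("loans", "NNS"), ("to", "TO"), ("students", "NNS"),
             (".", ".")]

print(split_clauses(EXAMPLE_5))
print(split_clauses(EXAMPLE_1))  # None
```

The check for a following verb keeps sentences like example 1, where the "and" joins two nouns, from being cut in the middle.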
6. Coaches and players are people.
Steps:
1. Check if there is a 'cc' or a 'conj' relation. Do this using the list of TypedDependencies, which can be produced from a GrammaticalStructure.
Use the method reln() to get the GrammaticalRelation and check if its getShortName() equals "cc" (or "conj").
If yes, check if the DEPENDENT in the relation is "and" by using the dep() method on the dependency.
2. If an "and" is found, check whether the sentence is in the active or the passive voice. Look for the relations 'auxpass' or 'nsubjpass' in the dependencies using the reln() method. If either of these is found, the sentence is in the passive voice. Otherwise it is in the active voice.
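The logic of steps 1 and 2 can be sketched in Python without the parser. Each dependency is modelled here as a plain (relation, governor, dependent) tuple, which is an assumption of this sketch and stands in for a TypedDependency. The dependency lists are written by hand for illustration and are not actual parser output. Only the 'cc' relation is checked, because that is the one whose dependent is the conjunction word itself.

```python
def has_and(deps):
    """Step 1: is there a 'cc' relation whose dependent is the word "and"?"""
    return any(reln == "cc" and dep.lower() == "and" for reln, _gov, dep in deps)

def voice(deps):
    """Step 2: passive if an 'auxpass' or 'nsubjpass' relation is present."""
    passive = any(reln in ("auxpass", "nsubjpass") for reln, _gov, _dep in deps)
    return "passive" if passive else "active"

# Hand-written dependencies for "A library issues books and loans to students."
ACTIVE_DEPS = [
    ("det", "library", "A"), ("nsubj", "issues", "library"),
    ("dobj", "issues", "books"), ("cc", "books", "and"),
    ("conj", "books", "loans"), ("prep", "issues", "to"),
    ("pobj", "to", "students"),
]

# Hand-written dependencies for a passive sentence with the same conjunction.
PASSIVE_DEPS = [
    ("nsubjpass", "done", "Issuing"), ("auxpass", "done", "is"),
    ("prep", "Issuing", "of"), ("pobj", "of", "books"),
    ("cc", "books", "and"), ("conj", "books", "loans"),
]

print(has_and(ACTIVE_DEPS), voice(ACTIVE_DEPS))    # True active
print(has_and(PASSIVE_DEPS), voice(PASSIVE_DEPS))  # True passive
```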
3. For ACTIVE voice,
we will use the rule mentioned above, i.e. go to the CC. Copy everything except the CC and its parent's other children to the left of the CC onto a new parse tree. Then copy everything except the CC and its parent's other children to the right of the CC onto a second parse tree. Finally, convert the two parse trees into sentences. This splits the original sentence into two sentences without an "and".
[more rules here...]
For now, do this using the tagged output. Find a way to use the tree output.
4. For PASSIVE voice,
we will use the rule mentioned above, i.e. go to the CC. Copy everything except the CC and its parent's other children to the right of the CC onto a new parse tree. Then copy everything except the CC and its parent's other children to the left of the CC onto a second parse tree. Convert both trees into sentences.
[more rules here...]
For now, do this using the tagged output. Find a way to use the tree output.
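A rough Python sketch of steps 3 and 4 on the tagged output follows. It rests on a strong simplifying assumption: each conjunct is a single word directly beside the CC. That holds for "books and loans" but not for longer conjuncts. The function name split_on_tagged and the (word, tag) pair format are hypothetical.

```python
def split_on_tagged(tagged, passive=False):
    """Split a tagged sentence at its first CC into two sentences.

    Assumes one-word conjuncts on both sides of the CC.
    """
    tags = [tag for _, tag in tagged]
    i = tags.index("CC")
    # Left conjunct and CC removed: the right conjunct stays.
    keep_right = tagged[:max(i - 1, 0)] + tagged[i + 1:]
    # CC and right conjunct removed: the left conjunct stays.
    keep_left = tagged[:i] + tagged[i + 2:]
    # Active voice drops the left side first, passive voice the right side first.
    ordered = [keep_left, keep_right] if passive else [keep_right, keep_left]
    return [" ".join(w for w, _ in s).replace(" .", ".") for s in ordered]

# Tagged words read off the parse tree of example 1 (active voice).
ACTIVE = [("A", "DT"), ("library", "NN"), ("issues", "VBZ"), ("books", "NNS"),
          ("and", "CC"), ("loans", "NNS"), ("to", "TO"), ("students", "NNS"),
          (".", ".")]

# Tagged words read off the parse tree of example 3 (passive voice).
PASSIVE = [("Issuing", "NN"), ("of", "IN"), ("books", "NNS"), ("and", "CC"),
           ("loans", "NNS"), ("to", "TO"), ("students", "NNS"), ("is", "VBZ"),
           ("done", "VBN"), ("by", "IN"), ("the", "DT"), ("library", "NN"),
           (".", ".")]

print(split_on_tagged(ACTIVE))
print(split_on_tagged(PASSIVE, passive=True))
```

The two voices produce the same pair of sentences in opposite order, mirroring the order in which steps 3 and 4 build their parse trees.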