Referring to phrases, clauses, and other things bigger than words #10

Open
jonathanrobie opened this issue Mar 28, 2022 · 1 comment

Comments

@jonathanrobie
Contributor

Sometimes we need annotations to refer to phrases or clauses. How can we create stable identifiers for structures larger than individual words?

@jonathanrobie
Contributor Author

Randall's example:

A good example is:

 <Node Cat="S" Start="2" End="3" Rule="Np2S" Head="0" nodeId="010020070060091" Length="9">
  <Node Cat="np" Start="2" End="3" Rule="Np-Appos" Head="0" nodeId="010020070060090" Length="9">
    <Node Cat="np" Start="2" End="2" Rule="N2NP" Head="0" nodeId="010020070060041" Length="4">
      <Node n="010020070021" Cat="noun" Start="2" End="2" Length="4" morphId="010020070021" Unicode="יְהוָ֨ה" nodeId="010020070060040">
        <m n="010020070021" lang="H" after=" " lemma="3068" morph="Np" id="01pPp" pos="noun" type="proper">יְהוָ֨ה</m>
      </Node>
    </Node>
    <Node Cat="np" Start="3" End="3" Rule="N2NP" Head="0" nodeId="010020070100051" Length="5">
      <Node n="010020070031" Cat="noun" Start="3" End="3" Length="5" morphId="010020070031" Unicode="אֱלֹהִ֜ים" nodeId="010020070100050">
        <m n="010020070031" lang="H" after=" " lemma="430" morph="Ncmpa" id="01ieN" pos="noun" type="common" gender="masculine" number="plural" state="absolute">אֱלֹהִ֜ים</m>
      </Node>
    </Node>
  </Node>
</Node>

"Yahweh God" is the implied subject of the verb "he breathed" (n="010020070082"), but the SubjRef just uses 010020070021 for Yahweh because Yahweh is the head of the noun phrase. Head, Start, & End are zero-based. So, Start="2" End="3" tells us that words 3 & 4 in the verse are in the phrase. Head="0" tells us that the first word inside the phrase, "Yahweh," is the head of the noun phrase. We can safely take the whole noun phrase as the referent. You could theoretically use the nodeId. However, nodeId="010020070060090" is currently consonant based. If it were word based, it would be 010020070030020. For the Ref & SubjRef, theoretically prepositional phrases in the node with Cat="pp" Rule="PrepNp" also still have the head noun as the head. So, we could end up with systematically bringing in the prepositional phrase when only the noun phrase that is the object of the preposition is the referent. We will want to make the head of the prepositional phrase consistently the preposition to avoid this problem.


"Yahweh God" is the implied subject of the verb "he breathed"  (n="010020070082"), but the SubjRef just uses 010020070021 for Yahweh because Yahweh is the head of the noun phrase. Head, Start, & End are zero-based. So, Start="2" End="3" tells us that words 3 & 4 in the verse are in the phrase. Head="0" tells us that the first word  inside the phrase, "Yahweh,"  is the head of the noun phrase. We can safely take the whole noun phrase as the referent. You could theoretically use the nodeId. However, nodeId="010020070060090" is currently consonant based. If it were word based, it would be 010020070030020. For the Ref & SubjRef, theoretically prepositional phrases in the node with Cat="pp" Rule="PrepNp" also still have the head noun as the head. So, we could end up with systematically bringing in the prepositional phrase when only the noun phrase that is the object of the preposition is the referent. We will want to make the head of the prepositional phrase consistently the preposition to avoid this problem.
