How to confirm sense distinctions #243
Comments
Can you give an example for (1)? For corpus tagging, I believe we need to pay special attention to improving the glosses. Making hard senses appear in the glosses can create an opportunity for future annotation of these glosses.
I believe we will also have to discuss whether we want to keep all of PWN's original motivations and linguistic decisions or whether we are willing to adopt different strategies. For instance, some cases of systematic polysemy are expected and accepted as a consequence of the PWN structure in the original 5 papers. On the other hand, we now have experience from other wordnets, such as the German and Polish ones. Maybe other relations and models are possible. The German wordnet does not follow the cluster model for adjectives, for example.
For (1), a simple example would be that one sense of "bank" may collocate with "river" and "stream", while another sense may collocate with "merchant", "statement", and "account". You can then detect two distinct clusters using metrics such as PMI. I don't think we should fully diverge from PWN unless we have strong evidence that PWN's approach is poor (e.g., "satellite adjectives" are not a category that fits well with the literature) or that PWN doesn't have a fixed principle to follow (which I think is the case for systematic polysemy).
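To make the collocation idea concrete, here is a minimal sketch, not a procedure the project has adopted. The toy corpus, the function names (`pmi_table`, `collocate_clusters`), and the threshold (PMI > 0) are all assumptions made up for illustration. It computes sentence-level PMI between word pairs, then groups the collocates of "bank" into connected components of the "positive PMI" graph.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration: one set of content words per sentence.
SENTENCES = [
    {"bank", "river", "stream"},
    {"bank", "river", "water"},
    {"bank", "account", "statement"},
    {"bank", "merchant", "account"},
    {"river", "stream", "water"},
    {"merchant", "account", "statement"},
]

def pmi_table(sentences):
    """Sentence-level PMI for every word pair that co-occurs at least once."""
    n = len(sentences)
    word_counts = Counter(w for s in sentences for w in s)
    pair_counts = Counter(
        pair for s in sentences for pair in combinations(sorted(s), 2)
    )
    # PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) ) = log2( c_ab * n / (c_a * c_b) )
    return {
        (a, b): math.log2(c * n / (word_counts[a] * word_counts[b]))
        for (a, b), c in pair_counts.items()
    }

def collocate_clusters(sentences, target, threshold=0.0):
    """Group the collocates of `target` by linking pairs whose PMI exceeds `threshold`."""
    table = pmi_table(sentences)
    collocates = {w for s in sentences if target in s for w in s} - {target}
    # Adjacency over collocates only; pairs that never co-occur have no entry.
    neighbours = {w: set() for w in collocates}
    for (a, b), score in table.items():
        if a in collocates and b in collocates and score > threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Connected components via depth-first search.
    clusters, seen = [], set()
    for start in sorted(collocates):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word not in component:
                component.add(word)
                stack.extend(neighbours[word] - component)
        seen |= component
        clusters.append(component)
    return clusters

clusters = collocate_clusters(SENTENCES, "bank")
for cluster in clusters:
    print(sorted(cluster))
```

On this toy data the collocates of "bank" split into a financial group and a river group, because words from different groups never co-occur. A real corpus would need smoothing, a frequency cutoff, and a more careful clustering method.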
Could we look at other WN projects for instances of polysemy that may have migrated to English WN? Perhaps we could also find relevant information using translation software, or dictionaries geared towards describing English as a foreign language.
[Off topic] Learning from other wordnets does not necessarily mean we have to diverge from PWN. EuroWordNet's top ontology is an enhancement to the PWN semantic fields and is fully compatible with it.
Hi @rwingerter55, the problem is that dictionaries can differ. Which dictionary will have priority? If we adopt the majority approach, do we need a fixed list of dictionaries? Will we need to define what makes a dictionary a valid source? I am just thinking about how hard it can be to adopt this criterion on a large scale.
I am writing a paper on this issue... so there may be some more concrete procedures for the project here.
Note: the paper I refer to was published here: https://www.frontiersin.org/articles/10.3389/frai.2022.745626/full. I'm not sure it solves the issues above in the end, though (see next message).
I have a proposal for making sense distinctions here: https://github.com/globalwordnet/english-wordnet/blob/issue-243/SYNSET_MERGING.md

Merging and creating new synsets
This document describes procedures in Open English WordNet for merging synsets and for

Synsets that share a lemma
In the case that we are considering merging two synsets that share a lemma or for the
Two synsets with different positions in the graph should not be merged. For example,
An example of a merge based on these properties is given by Issue #911.
If it is decided that no merge is necessary, we should normally update definitions

Synsets that don't share a lemma
In the case that the synsets don't share a lemma, we are also claiming that there
For example, Issue #750: an example of 'self-serving' was found in the corpus
We substitute with the candidate merge lemma:
This does not seem to substantially change the meaning, so we merged these synsets.
There are many subtle sense distinctions in WordNet that may not be routinely made by English speakers, especially in cases of systematic polysemy or metonymy, where an object is referred to by a related term.
This issue is to capture ideas about how we can make these distinctions in a principled way.
I have two suggestions:
Any other suggestions?