Usage annotations #1098

Open. Wants to merge 10 commits into main.
1313ou (Contributor) commented Sep 30, 2024

Following Francis Bond's remark that a number of examples are actually usage annotations rather than examples, I propose a new 'usage' section that collects these (and, later, fragments of definitions that similarly deal with usage).

I know that some of these usage notes can sometimes also be expressed by sense relations (notably domain_region/has_domain_region or exemplifies/is_exemplified_by), in which case the 'usage' section may turn out to be a transitional scratchpad.

But expressing usage with a relation requires identifying proper targets (Britain, UK, British English ...?) within a predefined set, and that set hasn't been defined yet: the question is still open.

Beyond geographical locations, I found the following:
'archaic', 'dialectal', 'euphemistic', 'slang', 'literary', 'formal', 'legal', 'vulgar', 'frequent'
Should we link with the adjective, the noun ... ?

Other usage notes resist being expressed as relations:

  • ‘Scotch’ is in disfavor with Scottish people and is used primarily outside Scotland
  • ‘continual’ is often used interchangeably with ‘continuous’
  • in careful usage the noun ‘enormity’ is not used to express the idea of great size
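
To make the distinction concrete, here is a minimal sketch of how a tool might separate label-like usage notes (candidates for a relation) from free-text notes that would stay in 'usage'. The `LABEL_TARGETS` table and the `split_usage` function are hypothetical: the target set is still undecided, so the mapping below is a placeholder, not a proposal for the final targets.

```python
# Hypothetical label-to-target table. The real target set is still open,
# so these pairs are placeholders for illustration only.
LABEL_TARGETS = {
    "archaic": "archaism",
    "slang": "slang",
    "euphemistic": "euphemism",
    "formal": "formality",
}


def split_usage(notes):
    """Split usage notes into (relation targets, free-text notes)."""
    relations, free_text = [], []
    for note in notes:
        target = LABEL_TARGETS.get(note.strip().lower())
        if target is not None:
            # A bare label can be expressed as a relation to a target.
            relations.append(target)
        else:
            # Anything else stays as a free-text usage note.
            free_text.append(note)
    return relations, free_text


notes = [
    "archaic",
    "slang",
    "'continual' is often used interchangeably with 'continuous'",
]
relations, free_text = split_usage(notes)
```

Only exact label matches are treated as relation candidates here; sentence-like notes fall through to the free-text list unchanged.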

These usage annotations may be ignored in XML output until the schema defines them or tags them with dc:something. However, some YAML tools will have to be updated to ensure that these usage notes persist across changes.
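
As a rough illustration of what "persistent across changes" asks of a tool, the sketch below edits one field of an entry while carrying every other key along, including a 'usage' key the tool knows nothing about. The entry layout, the `update_definition` helper, and the placeholder definition text are assumptions made for this example; they are not the actual data format or tooling.

```python
import copy


def update_definition(synset, new_definition):
    """Return a copy of `synset` with a new definition, keeping all other keys."""
    # Copy the whole entry first, so keys this tool does not know about
    # (such as 'usage') are carried over instead of being dropped.
    updated = copy.deepcopy(synset)
    updated["definition"] = [new_definition]
    return updated


# Hypothetical entry: the key names and the definition text are placeholders.
synset = {
    "definition": ["placeholder definition"],
    "usage": [
        "'Scotch' is in disfavor with Scottish people "
        "and is used primarily outside Scotland"
    ],
}

updated = update_definition(synset, "revised placeholder definition")
```

A tool that instead rebuilds entries from a fixed list of known keys would silently lose the 'usage' notes, which is the failure this approach avoids.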

I'm not updating the Python scripts, as these are bound to disappear.

jmccrae (Member) commented Oct 1, 2024

Thanks for this. Can you create an issue to discuss this, as it would require updating more than just the data files?

For reference, here are the most common senses used for 'usage by exemplifies':

| Sense | Frequency |
| --- | --- |
| British spelling | 1825 |
| American spelling | 1825 |
| Australian spelling | 1801 |
| Canadian spelling | 461 |
| trade name | 298 |
| British English | 143 |
| American English | 95 |
| trademark | 82 |
| Canadian English | 28 |
| colloquialism | 7 |
| slang | 7 |
| archaism | 7 |
| plural | 4 |
| plural form | 3 |
| derogation | 2 |
| obscenity | 1 |
| Scots English | 1 |
| French | 1 |
| ethnic slur | 1 |
| slur | 1 |

And these are the most common synsets:

| Synset | Frequency |
| --- | --- |
| depreciation, derogation, disparagement | 624 |
| colloquialism | 288 |
| plural, plural form | 246 |
| cant, slang, jargon, patois, argot, lingo, vernacular | 84 |
| trademark | 83 |
| archaicism, archaism | 69 |
| obscenity, vulgarism, smut, dirty word, filth | 46 |
| Yiddish | 45 |
| combining form | 39 |
| intensifier, intensive | 22 |
| figure, image, trope, figure of speech | 16 |
| ethnic slur | 15 |
| comparative, comparative degree | 13 |
| Black English, Black English Vernacular, Black Vernacular, Black Vernacular English, AAVE, African American English, African American Vernacular English, Ebonics | 9 |
| euphemism | 7 |
| superlative | 7 |
| formality | 6 |
| brand, brand name, trade name, marque | 6 |
| portmanteau, blend, portmanteau word | 6 |
| French, French language | 5 |
| initialism, acronym | 5 |
| idiom, phrase, idiomatic expression, set phrase, phrasal idiom | 3 |
| irony | 3 |
| idiom, dialect, accent | 3 |
| regionalism | 2 |
| synecdoche | 2 |
| humor, humour, wittiness, wit, witticism | 2 |
| street name | 1 |
| racial discrimination, racialism, racism | 1 |
| fishnet stockings | 1 |
| crimper, crimping pliers | 1 |
| salopettes, ski pants | 1 |
| Lysoform | 1 |
| Cebion | 1 |
| entryphone | 1 |
| abbreviation | 1 |

jmccrae added this to the 2025 Release milestone on Oct 11, 2024