fix macroareas #88
|
@xrotwang @LinguList Here is the PR. However, I am not 100% happy with it. It correctly fixes missing or wrong macroareas, but in other cases it removes the macroarea, because the associated glottocode is at the family level rather than the language level and has no macroarea associated with it in glottolog-cldf. How do we handle this? |
|
The fix would be easy, but of course risks causing problems later on:
    "Macroarea": macroarea[0].name if macroarea else language.macroarea,
How many languages suffer from empty macroareas now? This is also important. We exclude so far:
Should we extend this? |
|
Here's a quick hack:
    macroarea = languoids[language.glottocode].macroareas
    if not macroarea:
        # No macroarea on the languoid itself (e.g. a family): take the most frequent one among its descendants.
        macs = [l.macroareas[0].name for l in languoids[language.glottocode].iter_descendants() if l.macroareas]
        macroarea = sorted(set(macs), key=lambda x: macs.count(x), reverse=True)[0]
    else:
        macroarea = macroarea[0].name |
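For illustration, the same idea can be written as a self-contained sketch that also covers the case where no descendant has a macroarea (the quick hack above would raise an IndexError there). The `Languoid` dataclass and `resolve_macroarea` are hypothetical stand-ins introduced only for this example; they are not the pyglottolog classes, and macroareas are plain strings here rather than objects with a `.name`.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Languoid:
    """Hypothetical stand-in for a Glottolog languoid (NOT the pyglottolog class)."""
    glottocode: str
    macroareas: List[str] = field(default_factory=list)
    children: List["Languoid"] = field(default_factory=list)

    def iter_descendants(self) -> Iterator["Languoid"]:
        # Depth-first walk over all languoids below this one.
        for child in self.children:
            yield child
            yield from child.iter_descendants()


def resolve_macroarea(languoid: Languoid) -> Optional[str]:
    """Return the languoid's own macroarea, else the most frequent one among its descendants."""
    if languoid.macroareas:
        return languoid.macroareas[0]
    counts = Counter(
        d.macroareas[0] for d in languoid.iter_descendants() if d.macroareas
    )
    if not counts:
        # Nothing to infer from: leave the decision to the caller.
        return None
    return counts.most_common(1)[0][0]
```

Returning `None` instead of raising keeps the unresolved cases visible, so they can be counted and fixed upstream.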
|
Alternative: determine valid macroareas after having defined all languoids:
    languoids = self.glottolog.cached_languoids
    valid_macs = set(l.macroareas[0].name for l in languoids.values() if l.macroareas)
In the function:
    "Macroarea": language.macroarea if language.macroarea in valid_macs else languoids[language.glottocode].macroareas[0].name,
But this may still yield an error if a macroarea is invalid and the glottocode represents a language that has no macroarea. |
|
Given that we only have 6 valid macro-areas, we might as well hardcode them to avoid the error, and only replace the given option if the current value deviates from the set of valid areas. |
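A minimal sketch of the hardcoded variant, assuming the six macroarea names used by Glottolog. The function name `pick_macroarea` and its two string arguments are hypothetical and only illustrate the rule "keep the given value unless it deviates from the valid set":

```python
from typing import Optional

# The six Glottolog macroareas, hardcoded so that no lookup can fail.
VALID_MACROAREAS = frozenset({
    "Africa",
    "Australia",
    "Eurasia",
    "North America",
    "Papunesia",
    "South America",
})


def pick_macroarea(given: Optional[str], from_glottolog: Optional[str]) -> Optional[str]:
    """Keep `given` if it is valid; otherwise fall back to Glottolog's value, if that is valid."""
    if given in VALID_MACROAREAS:
        return given
    if from_glottolog in VALID_MACROAREAS:
        return from_glottolog
    # Neither value is usable: return None so the case can be reported and fixed upstream.
    return None
```

With this rule, a valid value in the dataset is never overwritten, and an invalid one is replaced only when Glottolog actually offers a valid alternative.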
Well, at least we identify the problem then and can solve it upstream, right? So I think this is a fine solution. If we find an error, we can fix it. |
|
Yes. In theory we could even identify the macroarea by geolocation, which would be more exact, but that would lead too far. |
|
Ah, nice to see that the discussion contributes to Glottolog :-) |
|
I am now rerunning the code with the proposed solution (the one that iterates over the descendants). We only have 42 cases anyway, so I hope that we can finally make release 2.1 afterwards. |
|
@FredericBlum, I'll make another branch now, where I provide the fix. |