Update/establish best practices for OCD-ID types across countries #170

Open
jdmgoogle opened this issue Sep 4, 2019 · 2 comments

@jdmgoogle
Contributor

This issue spun out of the discussion on PR #168.

Background links:
Creating new OCD-IDs
Division identifiers

The current OCD-ID documentation establishes one canonical identifier type -- country -- but otherwise gives latitude to within-country maintainers to define and enforce types appropriate for their jurisdiction. E.g., the first-level administrative type in the U.S. is a state, in Portugal it is a district, and in Germany it is a land.

The current situation gives a lot of flexibility and discretion to these within-country maintainers, but can place a significant burden on consumers of the identifiers, who have to figure out which types are equivalent across countries; e.g., is a district a first-level administrative division, or a sub-city legislative division?

Previous ad-hoc attempts to address this problem (e.g., PR #148) created identifiers which used the in-country types as aliases of identifiers that were more American in origin. E.g.,

Canonical: ocd-division/country:de/state:bb
Alias: ocd-division/country:de/land:bb

This was an attempt to balance the needs of publishers (using in-country types and terminology) and consumers (using types they had already seen). The discussion in PR #168 came to the conclusion that this was not desirable, or at least should go through a more thorough review before being implemented at scale. The options discussed were:

  1. Codify the ad-hoc approach.
  2. Reverse the current ad-hoc approach, in that the alias should be from the local term to the across-country term.
  3. Follow the practices of projects like GeoNames, which use country-agnostic terms like adm1, adm2, etc., and do not use the US-centric terms state and cd for a global specification.
  4. Add a new file to the spec, e.g. country-de-types.csv, with the columns local-type and standard-type, with rows like land,adm1 and wahlkreis,constituency.
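
To make option 4 concrete, here is a minimal sketch of how a consumer might use such a file. It assumes the file has exactly the two columns named above and the two example rows; the function names (`load_type_map`, `parse_ocd_id`, `standard_types`) are hypothetical and not part of any existing OCD tooling.

```python
import csv
import io

# Hypothetical contents of country-de-types.csv, using the columns and
# example rows proposed in option 4.
COUNTRY_DE_TYPES = """local-type,standard-type
land,adm1
wahlkreis,constituency
"""


def load_type_map(csv_text):
    """Return a dict mapping each local type to its standard type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["local-type"]: row["standard-type"] for row in reader}


def parse_ocd_id(ocd_id):
    """Split an OCD division ID into a list of (type, value) pairs."""
    prefix, *segments = ocd_id.split("/")
    if prefix != "ocd-division":
        raise ValueError(f"not an OCD division ID: {ocd_id!r}")
    pairs = []
    for segment in segments:
        type_, sep, value = segment.partition(":")
        if not sep or not type_ or not value:
            raise ValueError(f"malformed segment: {segment!r}")
        pairs.append((type_, value))
    return pairs


def standard_types(ocd_id, type_map):
    """Replace local types with standard types; unmapped types pass through."""
    return [(type_map.get(t, t), v) for t, v in parse_ocd_id(ocd_id)]


type_map = load_type_map(COUNTRY_DE_TYPES)
print(standard_types("ocd-division/country:de/land:bb", type_map))
```

With this approach the canonical ID keeps the local term, and the cross-country meaning lives in a separate lookup that consumers can opt into.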

Discuss. :)

@jpmckinney
Member

jpmckinney commented Nov 3, 2019

Using the local terms will always make sense for local users.

A challenge only arises when a user (1) wants to use OCDIDs from multiple jurisdictions and (2) needs to know whether a division type in one jurisdiction corresponds to a division type in another jurisdiction. (@jdmgoogle Can you provide a specific use case?)

Now, the problem of deciding whether two divisions in two jurisdictions are of the same type is not a solved problem – by anyone, anywhere.

  • ISO 3166 has ISO 3166-2 for country subdivisions, but these are not organized according to a common, cross-jurisdiction hierarchy, and these cover many types of subdivisions without establishing where each is placed in the hierarchy.
  • NUTS has many cases that do not correspond to the truth on the ground; for example, Scotland, Wales and Northern Ireland are considered to be at the same level of the hierarchy as the regions of England.

GeoNames and others likely have similar issues: in each case, a maintainer had to decide what hierarchy to follow, and that hierarchy might not match reality.

In other words, trying to establish a worldwide crosswalk of division types is likely a fool's errand.

That said, I think it's fine for users to create aliases to the canonical IDs, where the aliases would attempt to organize the world into a single hierarchy of division types.

This repo, however, for the canonical IDs, should use division types that have a local meaning to better match reality.
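
As a rough illustration of user-maintained aliases that point to canonical, locally named IDs, the following sketch resolves an alias to its canonical ID. The alias table and the `adm1` alias in it are hypothetical examples, not entries in this repo.

```python
# Hypothetical user-maintained alias table: alias ID -> canonical ID.
# The canonical ID uses the local type ("land"); the alias uses a
# cross-country type ("adm1") chosen by the user.
ALIASES = {
    "ocd-division/country:de/adm1:bb": "ocd-division/country:de/land:bb",
}


def resolve(ocd_id, aliases):
    """Follow alias links until a canonical (non-alias) ID is reached."""
    seen = set()
    while ocd_id in aliases:
        if ocd_id in seen:
            raise ValueError(f"alias cycle detected at {ocd_id!r}")
        seen.add(ocd_id)
        ocd_id = aliases[ocd_id]
    return ocd_id


print(resolve("ocd-division/country:de/adm1:bb", ALIASES))
```

An ID that is not in the table is returned unchanged, so canonical IDs can be passed through the same function.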

@jpmckinney
Member

jpmckinney commented Nov 17, 2019

Noting comments in #184 (comment) and below about a desire for a metadata file format for defining OCD types.
