ocd for '2019 Polish parliamentary election' and existing parliament #168

Merged 8 commits on Sep 9, 2019

Conversation

sguenther85
Contributor

ocds for '2019 Polish parliamentary election' and existing parliament

@sguenther85 sguenther85 changed the title from "Feature/country pl" to "ocd for '2019 Polish parliamentary election' and existing parliament" on Aug 30, 2019
Contributor

@jdmgoogle jdmgoogle left a comment

One comment, otherwise looks good. Thanks.

identifiers/country-pl/provinces.csv (outdated, resolved)
sguenther85 and others added 3 commits August 30, 2019 22:25
# /state instead of /province
again new line problems with compiler
Contributor

@jdmgoogle jdmgoogle left a comment

There's an odd line break issue with the merged file. Otherwise LGTM.

identifiers/country-pl.csv (outdated, resolved)
@jdmgoogle
Contributor

Paging @jpmckinney or @djbridges

@jpmckinney
Member

Since when is state the canonical first-level administrative division type? That seems very US-centric. 'Province' is more common according to https://en.wikipedia.org/wiki/List_of_administrative_divisions_by_country

Do we need a canonical first-level administrative division type? I think it's better if each country uses the terms that make the most sense in the local context.

If we need a way to determine the first-level administrative division type, then instead of forcing every country to use state, we can simply have a two column CSV with country code and division type…

Member

@jpmckinney jpmckinney left a comment

Adding to my above comment, I assume cd stands for 'congressional district', like in the US.

In Canada, the more generic ed is used for 'electoral district', which works since the lower house has constituencies, but the upper house doesn't.

I think Poland has constituencies at both levels, in which case ld (lower-house division/district) and ud (upper-house division/district) are one generic option that can be used.

@jdmgoogle
Contributor

I think one of the big challenges here is trying to balance the ease-of-creation for publishers (who may be in-country) and standardizing for consumers (who may be ingesting across countries). The top-level administrative division may be different in each country, but a name may be used for different levels in different countries. E.g., a "district" may be a type of sub-city jurisdiction in one country, but a sub-national jurisdiction in another.

Here's the guidance from the OCD-ID documentation:
http://docs.opencivicdata.org/en/latest/proposals/0002.html

> The set of types within each country should not grow unnecessarily. Each country maintainer should publish a list of types for easy reference. The addition of a new type must be justified.

My proposal that I've been trying to use in order to balance the needs of both publishers and consumers is to use country-specific types as an alias for the canonical, standardized types. This puts the onus on first-time publishers to map their country-specific types to a set of standardized types, and the onus on consumers to resolve the aliases to the canonical identifiers.

If there are other proposals out there around balancing the needs of both publishers and consumers I'm definitely open to figuring out how to improve the situation.
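To make the consumer side of this alias proposal concrete, here is a minimal sketch of resolving an alias to its canonical identifier. The column names (`id`, `name`, `sameAs`) and both sample rows are assumptions for illustration, with the local type `land` treated as the alias and `state` as the canonical type, as the proposal describes. It is not taken from the files in this PR.

```python
import csv
import io

# Illustrative rows only. The columns `id`, `name`, `sameAs` are assumed,
# and the alias direction (local type -> standardized type) follows the
# proposal discussed above, not any confirmed repository content.
IDENTIFIERS_CSV = """id,name,sameAs
ocd-division/country:de/state:be,Berlin,
ocd-division/country:de/land:be,Berlin,ocd-division/country:de/state:be
"""


def load_aliases(text):
    """Return a dict mapping each alias identifier to the identifier it points at."""
    aliases = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["sameAs"]:
            aliases[row["id"]] = row["sameAs"]
    return aliases


def resolve(ocd_id, aliases):
    """Follow alias links until a non-alias identifier is reached."""
    seen = set()
    while ocd_id in aliases:
        if ocd_id in seen:
            raise ValueError(f"alias cycle at {ocd_id}")
        seen.add(ocd_id)
        ocd_id = aliases[ocd_id]
    return ocd_id


aliases = load_aliases(IDENTIFIERS_CSV)
canonical = resolve("ocd-division/country:de/land:be", aliases)
```

An identifier that is not an alias is returned unchanged, so consumers can call `resolve` on every identifier they ingest without checking first.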

@jpmckinney
Member

jpmckinney commented Aug 31, 2019

A few points, with a few proposals:

  1. We should first be clear that there are no "canonical, standardized types," besides country. state and cd are neither canonical nor standardized as part of the spec.
  2. The quoted guidance is about within each country. It's not about across countries, which is the issue here. There's no guidance for across countries, which was somewhat deliberate.
  3. If we want to have canonical, standardized types, I propose that we follow the good practices of other projects like GeoNames, etc., which use country-agnostic terms like adm1, adm2, etc. We should definitely not use the US-centric terms state and cd for a global specification, as this will hamper adoption both by publishers and users.
  4. If we want to use aliases as a means of having both local and across-country terms, then I propose that the alias should be from the local term to the across-country term, and not vice versa as it is now in some countries. This better preserves the sovereignty of each country community, which was a major thread running through the governance OCDEP. It also better mitigates against situations where earlier commits got the administrative division level wrong; changing aliases to correct the error is less bad than changing the primary identifiers.
  5. An alternative proposal, mentioned briefly earlier, is that instead of minting lots of aliases, we can add a new file to the spec, e.g. country-de-types.csv, with the columns local-type and standard-type, with rows like land,adm1 and wahlkreis,constituency (we can discuss what the standard types should be separately). Anyone who cares about getting all the first-level administrative divisions across countries can use those files to determine which types they should be looking for.
  6. Finally, we should not be making governance-level changes as part of PRs whose goal is simply to add identifiers. If we want to implement any of these proposals (or the practice you've been following) then the process is to do that through an OCDEP, so that people who care about such issues are aware. When it's done in a PR to add identifiers, many people won't notice that the governance and practices are changing under their feet.
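The type-mapping file from point 5 could be consumed roughly as follows. This is only a sketch: the file name `country-de-types.csv`, its columns, and the two rows come from the comment above, while the sample identifiers (including `wahlkreis:75`) and the function names are hypothetical.

```python
import csv
import io

# Contents of a hypothetical country-de-types.csv, using the columns and
# rows suggested in the comment above.
COUNTRY_DE_TYPES_CSV = """local-type,standard-type
land,adm1
wahlkreis,constituency
"""


def load_type_map(text):
    """Return a dict mapping each local type to its standard type."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["local-type"]: row["standard-type"] for row in reader}


def standard_type(ocd_id, type_map):
    """Standard type of the last segment of ocd_id, or None if unmapped."""
    last_segment = ocd_id.rsplit("/", 1)[-1]
    local_type = last_segment.split(":", 1)[0]
    return type_map.get(local_type)


def select(ocd_ids, type_map, wanted):
    """Keep only the identifiers whose last segment maps to the wanted standard type."""
    return [i for i in ocd_ids if standard_type(i, type_map) == wanted]


type_map = load_type_map(COUNTRY_DE_TYPES_CSV)
sample_ids = [
    "ocd-division/country:de",
    "ocd-division/country:de/land:be",
    "ocd-division/country:de/wahlkreis:75",  # hypothetical identifier
]
first_level = select(sample_ids, type_map, "adm1")
```

With this approach the identifiers themselves keep their local types, and only the small mapping file needs correcting if a division level was classified wrongly.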

@sguenther85
Contributor Author

Regardless of the discussion here, is there anything else that should be adjusted? I think that, at least for now, we have a good starting version with /state and, as an alias, /province. Or not?

@jdmgoogle
Contributor

@jpmckinney Thanks for the feedback. CIL.

> We should first be clear that there are no "canonical, standardized types," besides country. state and cd are neither canonical nor standardized as part of the spec.

I was basing my comment on the OCD-ID documentation:
http://docs.opencivicdata.org/en/latest/proposals/0002.html

> The type of boundary. (e.g. country, state, town, city, cd, sldl, sldu)

If the intention is that these are (essentially) US-only, then the documentation should be clarified to state that.

> The quoted guidance is about within each country. It's not about across countries, which is the issue here. There's no guidance for across countries, which was somewhat deliberate.

Sorry, I should have been a bit clearer. The part I was trying to highlight was "The addition of a new type must be justified", to convey the sense that adding a new type should be something that involves more discussion. I (perhaps incorrectly) assumed that the bar to adding aliased types was lower. Combined with the previous point, having the types explicitly listed in the documentation as canonical types and only adding new types as aliases seemed to be in line with the current documentation/best practices.

> If we want to have canonical, standardized types, I propose that we follow the good practices of other projects like GeoNames, etc., which use country-agnostic terms like adm1, adm2, etc. We should definitely not use the US-centric terms state and cd for a global specification, as this will hamper adoption both by publishers and users.

That sounds like a reasonable long-term solution. There are still short-term needs (i.e., on the order of days/weeks), and I'd like to get those sorted out, even if the answer is "branch the repo, use the branch, and re-merge when the long-term solution is in place."

> If we want to use aliases as a means of having both local and across-country terms, then I propose that the alias should be from the local term to the across-country term, and not vice versa as it is now in some countries. This better preserves the sovereignty of each country community, which was a major thread running through the governance OCDEP. It also better mitigates against situations where earlier commits got the administrative division level wrong; changing aliases to correct the error is less bad than changing the primary identifiers.

To clarify, does this mean that the solution would have:

Canonical: ocd-division/country:de/land:be
Alias: ocd-division/country:de/state:be

or vice versa?

> An alternative proposal, mentioned briefly earlier, is that instead of minting lots of aliases, we can add a new file to the spec, e.g. country-de-types.csv, with the columns local-type and standard-type, with rows like land,adm1 and wahlkreis,constituency (we can discuss what the standard types should be separately). Anyone who cares about getting all the first-level administrative divisions across countries can use those files to determine which types they should be looking for.

In general the type mapping seems to be more direct and compact. As recent commits may have indicated, at this point our main concern is finding ways to get consistent structure for countries, first-level administrative divisions, and national-level legislative body districts.

> Finally, we should not be making governance-level changes as part of PRs whose goal is simply to add identifiers. If we want to implement any of these proposals (or the practice you've been following) then the process is to do that through an OCDEP, so that people who care about such issues are aware. When it's done in a PR to add identifiers, many people won't notice that the governance and practices are changing under their feet.

These recent PRs were (as I saw them) an attempt to adhere to the existing documentation and structure in places where guidance was not explicit, not to write new rules. It appears that this wasn't the case.

I understand the desire/need to give country-level maintainers control over the creation and structure of types, but I'm worried that the current solution you've outlined places a significant burden on consumers of the identifiers to figure out what common types (e.g., first-level administrative divisions or legislative districts) are across countries. If changing that structure is going to require a governance change then I understand and support that process. We'll just need to do something on our end to support our short-term immediate needs (e.g., branching) while the long-term solution is put in place.

In the extremely short term, what do you want to do about this PR? :) Commit, or should we branch and then commit to our private repo?

@jpmckinney
Member

Since there are very short term needs, let's merge as is but open a new issue to continue the discussion and propose answers to the open questions.

@jdmgoogle
Contributor

Filed issue #170 to discuss long-term best practices.

@jpmckinney To clarify, does this mean that we have approval for the Poland OCD-IDs as-is, and future PRs to this repo will need to wait for the resolution of that issue?

@jpmckinney
Member

Yes, approved. Further PRs can be approved on a case-by-case basis (e.g., balancing urgency against other considerations) while waiting for the new issue to be closed.

@sguenther85
Contributor Author

Is there anything still to do here, or can a second person now (after we created a separate issue for the above discussion) approve this PR, so that the OCD-IDs can be used for the upcoming elections in 5 weeks? ;)

@jdmgoogle
Contributor

Thanks. Merging.

@jdmgoogle jdmgoogle merged commit 992f68d into opencivicdata:master Sep 9, 2019
@sguenther85 sguenther85 deleted the feature/country-pl branch October 1, 2019 14:38