ca: Create new files for current, abolished and renamed OCD-IDs #324

Closed
wants to merge 1 commit into from

jpmckinney
Member

@jpmckinney jpmckinney commented Nov 4, 2022

This PR implements a naming policy co-created with @evannjw in #323.

In brief:

  • All current divisions are given an ID, based on their name.
  • All divisions that are abolished according to the Library of Parliament are given an ID, based on their name.
  • If multiple current/abolished divisions have the same ID, later divisions are suffixed with the validFrom year.
  • All divisions that were renamed according to the Library of Parliament are aliases (sameAs) to another ID.
  • A province/territory:id pair is prefixed to the ed:id pair, since two provinces can contain the same district name.
  • The most recent ID is the "canonical" ID. Previous renames are aliased to it. This ensures that renamings to respect language, culture, etc. (e.g. Three Rivers -> Trois Rivières) are reflected in the dataset, without giving primacy to the old name that might be colonial, etc.
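The ID rules above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `slugify` rule, the `-` separator before the year, the `parent`/`name`/`valid_from` field names and the sample rows are all assumptions made up for the example.

```python
import re


def slugify(name):
    # Hypothetical slug rule (not taken from the PR): lowercase the name and
    # collapse every run of non-word characters into a single underscore.
    return re.sub(r"\W+", "_", name.lower()).strip("_")


def assign_ids(divisions):
    """Return (ocd_id, division) pairs, oldest division first."""
    used = set()
    assigned = []
    # Process divisions in validFrom order so that the earliest division keeps
    # the plain ID and only later ones receive a year suffix.
    for division in sorted(divisions, key=lambda d: d["valid_from"]):
        # "parent" is the province/territory:id pair, e.g. "province:on".
        base = f"ocd-division/country:ca/{division['parent']}/ed:{slugify(division['name'])}"
        # The "-" separator before the year is an assumption.
        ocd_id = base if base not in used else f"{base}-{division['valid_from']}"
        used.add(ocd_id)
        assigned.append((ocd_id, division))
    return assigned


# Invented sample rows, for illustration only.
sample = [
    {"parent": "province:on", "name": "Example District", "valid_from": 1903},
    {"parent": "province:on", "name": "Example District", "valid_from": 1867},
    {"parent": "province:qc", "name": "Example District", "valid_from": 1867},
]
for ocd_id, _ in assign_ids(sample):
    print(ocd_id)
```

In this sketch the two Ontario rows collide, so the 1903 one gets the year suffix, while the Quebec row keeps the plain ID because its province prefix differs.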

To complete the PR:

  • Add missing sameAs aliases (about 60 left to resolve either via code or manually)
  • Do the TODOs in the code (also remove validFrom,validThrough from aliases CSV)
  • Add sameAs aliases between the existing OCD-IDs and these new ones
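For the sameAs aliases, making the most recent ID canonical amounts to following each chain of renames to its end. A minimal sketch, assuming the renames are available as a mapping from an old ID to the ID that replaced it (the function name and the placeholder IDs are hypothetical, not from the script in this PR):

```python
def canonicalize(renames):
    """Map every renamed ID to the last ID in its rename chain.

    `renames` maps an old ID to the ID it was renamed to.
    """
    canonical = {}
    for old in renames:
        target = renames[old]
        seen = {old}
        # Keep following renames until we reach an ID that was never renamed.
        while target in renames:
            if target in seen:
                raise ValueError(f"rename cycle involving {target}")
            seen.add(target)
            target = renames[target]
        canonical[old] = target
    return canonical


# Placeholder IDs, for illustration only.
renames = {
    "ed:old_name": "ed:interim_name",
    "ed:interim_name": "ed:current_name",
}
aliases = canonicalize(renames)
```

Here both `ed:old_name` and `ed:interim_name` end up aliased to `ed:current_name`, so no alias points at another alias.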

@evannjw
Contributor

evannjw commented Nov 7, 2022

Looks great. We should also rename /identifiers/country-ca/ca_federal_electoral_districts.csv to clarify what that file contains, and/or update the README with the new naming policy. Maybe rename it to something like ca_federal_electoral_districts_2003.csv and update the references to it in /scripts/country-ca/ca_federal_electoral_districts.rb and /scripts/country-ca/run-all.sh.

Are the TODOs something you'd like help with @jpmckinney?

@jpmckinney
Member Author

Sounds good. Yes, I wanted to get agreement on what I had so far before completing the TODOs.

We can rename the old files (those IDs will be aliases to the new ones), and document the naming policy in the readme.

The Ruby scripts for those IDs would be removed, as the Python script is now responsible for federal IDs.

@jpmckinney
Member Author

I can work on the TODOs in the Python script, if you can help with aliasing the current federal IDs (ca_federal_electoral_districts-2013.csv should be easy, ca_federal_electoral_districts.csv probably requires a lookup that accounts for name, validFrom, validThrough).
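A rough sketch of the kind of lookup I mean, matching an old row to a new ID only when name, validFrom and validThrough all agree. The `id` and `name` column names and the sample rows are assumptions for illustration; the real files may be laid out differently.

```python
def build_index(new_rows):
    """Index new IDs by (name, validFrom, validThrough)."""
    index = {}
    for row in new_rows:
        key = (row["name"], row["validFrom"], row["validThrough"])
        index.setdefault(key, []).append(row["id"])
    return index


def alias_old_ids(old_rows, new_rows):
    """Return (aliases, unresolved): old ID -> new ID, plus old IDs needing review."""
    index = build_index(new_rows)
    aliases = {}
    unresolved = []
    for row in old_rows:
        key = (row["name"], row["validFrom"], row["validThrough"])
        matches = index.get(key, [])
        if len(matches) == 1:
            aliases[row["id"]] = matches[0]
        else:
            # Zero or several candidates: leave for manual resolution.
            unresolved.append(row["id"])
    return aliases, unresolved


# Invented rows shaped like csv.DictReader output, for illustration only.
old_rows = [
    {"id": "ed:00001", "name": "Example District", "validFrom": "1867", "validThrough": "1903"},
    {"id": "ed:00002", "name": "Another District", "validFrom": "1903", "validThrough": ""},
]
new_rows = [
    {"id": "ed:example_district", "name": "Example District", "validFrom": "1867", "validThrough": "1903"},
]
aliases, unresolved = alias_old_ids(old_rows, new_rows)
```

Anything without exactly one match is reported rather than guessed, so ambiguous cases can be handled by hand.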

@evannjw
Contributor

evannjw commented Nov 7, 2022

Sounds good, I'll also update the README and rename/clean up old files.

@evannjw
Contributor

evannjw commented Nov 8, 2022

I don't seem to have access to push to this branch/repo. Here is a file containing aliases of current federal IDs.
aliases.csv

@jpmckinney
Member Author

@evannjw I don't have much time this week for this PR, so if you can help with the other items, please have a look.

@evannjw
Contributor

evannjw commented Nov 10, 2022

Sure, I can wrap up the remaining TODOs here.

@evannjw
Contributor

evannjw commented Nov 17, 2022

Hi, sorry, I haven't actually had time to work on this. I should be able to get to it tomorrow or early next week.

@jdmgoogle
Contributor

@jpmckinney All of this LGTM. Thanks to you and @evannjw for putting this together. I support this approach.

@jpmckinney jpmckinney mentioned this pull request Sep 20, 2023
@jpmckinney
Member Author

Replaced by #327

@jpmckinney jpmckinney closed this Jan 25, 2024
@jpmckinney jpmckinney added the CA label Jun 27, 2024
3 participants