add edition field to lwmdb #120
Comments
Tricky, and nice find... I think a deep dive into the source METS/ALTO is a good starting point. Can you find an occurrence in the HMD or LwM papers (i.e. public) and point us to the files that came from our partners, so we can see how it's been handled there?
@mcollardanuy I think you had an example of this from one of the collections? Could you share here? |
Hi @kmcdono2, no, I don't have an example: it was just an observation that we thought it was worth investigating at some point. So I think we need to understand whether this is really a problem (or could it be that morning and evening editions had different newspaper codes, for example?), and, if it is, whether this comes from the original data or from us, and how this is handled in the DB (i.e. are there duplicate item codes in the DB or were they removed?). |
Right then, @griff-rees has some ideas on how to test that hypothesis |
My approach is twofold:
Quick comments:
@griff-rees - do we have an example of this? You give us a potential solution, but it's unclear if it's actually a problem we are seeing |
@griff-rees can we look for any |
I noticed this on Slack and thought I'd chime in - FMP don't digitise more than one edition per day, as it's just not worth it for them. Newspaper scholars would prefer that they did, of course, but I can see why they don't. |
If you do have any examples of multiple digitised editions for the same day, I can ask about how they're distinguished in the BNA / BL catalogue (also how they came to exist). |
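One way to look for such examples is to group item records by the current identifier triple and list any triple that occurs more than once. This is only a minimal sketch: it assumes the records can be exported as plain dicts with `publication_code`, `issue_code` and `item_code` keys, which is a hypothetical layout and not confirmed against the actual lwmdb schema. The function name and the sample rows are made up for illustration.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return each (publication_code, issue_code, item_code) triple that
    appears more than once, mapped to the number of times it appears."""
    counts = Counter(
        (r["publication_code"], r["issue_code"], r["item_code"])
        for r in records
    )
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows, for illustration only (not real data).
sample = [
    {"publication_code": "0000001", "issue_code": "18881204", "item_code": "001"},
    {"publication_code": "0000001", "issue_code": "18881204", "item_code": "001"},
    {"publication_code": "0000001", "issue_code": "18881205", "item_code": "001"},
]

duplicates = find_duplicate_ids(sample)
for key, n in duplicates.items():
    print("-".join(key), "appears", n, "times")
```

An empty result would suggest that same-day editions are either absent from the data or already distinguished some other way (for example by different newspaper codes).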
Summary
Problem: existing combination of publication_code-issue_code-item_code is NOT unique.
Why? issue_code is based on date, e.g. 18881204 (Dec 4, 1888). But there can (sometimes) be multiple editions on the same day.
Currently there is no edition field in the newspaper db, which would solve this problem.
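To make the collision concrete, here is a small sketch of how a date-based id collides for two editions on the same day, and how an extra edition component would separate them. Everything here is an assumption for illustration: the helper `make_item_id`, the position of the edition component (placed after the issue code), and the sample code values are hypothetical and not taken from lwmdb.

```python
def make_item_id(publication_code, issue_code, item_code, edition_code=None):
    """Join the id components with '-'.

    If edition_code is given, it is placed after issue_code. That position
    is an assumption; the issue does not specify an ordering.
    """
    parts = [publication_code, issue_code]
    if edition_code is not None:
        parts.append(edition_code)
    parts.append(item_code)
    return "-".join(parts)


# Two hypothetical editions published on Dec 4, 1888.
# Without an edition component, the first item of each gets the same id.
first_edition = make_item_id("0000001", "18881204", "001")
second_edition = make_item_id("0000001", "18881204", "001")
collides = first_edition == second_edition

# With a (hypothetical) edition component, the ids differ.
first_with_edition = make_item_id("0000001", "18881204", "001", edition_code="1")
second_with_edition = make_item_id("0000001", "18881204", "001", edition_code="2")
distinct = first_with_edition != second_with_edition
```

The ids stay human-readable either way, and records without an edition keep their existing three-part form.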
Solution:
Add edition_code to lwmdb at issue level. Then, adding this to publication_code, issue_code, and item_code would ensure that we have human-understandable unique ids for all items.
Not important to order edition_code at this stage, as it's both infrequent and there are a limited number of editions (1-3 max?).
Actions
edition_code in issue table
edition_code to create unique ids for items in samples going forward
Related Issues and Pull Requests
Updates