Question on continued CV updates #997

Open
aradhakrishnanGFDL opened this issue Jan 11, 2021 · 10 comments

@aradhakrishnanGFDL

The ESGF data citation deadline is Jan 31st. Since CMIP is an ongoing effort, I wanted to check whether CMIP6 CV updates pertaining to (a) new CMIP6 source registrations and (b) adding new participating MIPs to existing GFDL models would continue past the Jan 31st deadline, to facilitate ESGF data publishing. Thanks!

@durack1
Member

durack1 commented Jan 12, 2021

@aradhakrishnanGFDL hi, we have no intention of ceasing support on this repo so expect to keep getting tweaks merged.

I do note that there are discussions underway to initialize a CMIP6Plus phase, which will allow additional MIPs/experiments/models to contribute to non-CMIP6 activities, however that is in the discussion phase.

I will close this question, happy to answer any others, just ping this thread (and reopen if it's been a while)

@durack1 durack1 closed this as completed Jan 12, 2021
@durack1
Member

durack1 commented Jan 12, 2021

@MartinaSt did you want to chime in here?

@matthew-mizielinski @taylor13 ping

@taylor13
Collaborator

taylor13 commented Jan 13, 2021

@durack1 @aradhakrishnanGFDL @matthew-mizielinski @MartinaSt: This was discussed at yesterday's WIP meeting. Here are some edited notes:

To keep data references stable for the IPCC AR6, it will not be possible to change the authors and titles recorded by the data services after the WGI literature and data cut-off date 31 January 2021. This will cause issues for citations (a) for models registered after this “freeze” and (b) for existing models that have additional MIPs added to their activity_participation list in the CVs after the deadline.

Martina sent an announcement to modeling groups informing them of the 31 January deadline. They are asked to meet the deadline and provide the authors and titles for all the potential datasets they plan to create and eventually publish.

Apparently, "the impact on ESGF publication should be minimal; where citations do not exist there will be broken links on the ESGF CoG interface for affected datasets." I would like some clarification about this. On the ESGF CMIP6 CoG interface, when you click on "citation" (one of the options for each dataset listed), you get (for example):

Data Citation (Landing Page)
Identifier DOI: http://doi.org/10.22033/ESGF/CMIP6.4230
Creators: Dix, Martin; Bi, Doahua; Dobrohotoff, Peter et al.
Title: CSIRO-ARCCSS ACCESS-CM2 model output prepared for CMIP6 CMIP 1pctCO2
Publisher: Earth System Grid Federation
Publication Year: 2019
License: Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0)

My question is: What would you get if

  1. the model had not been registered in the CMIP6_CVs (i.e., it wasn't registered by the January 31 deadline; before ESGF publication it would, of course, have to be subsequently registered, but would that propagate to the citation service?).
  2. the model was registered by the deadline but the group had not indicated in the CMIP6_CVs that they were going to participate in the CMIP activity (which is responsible for the 1pctCO2 run).
  3. they had not provided the list of Creators, a Title, a Publication Year, and/or a License to the data services. (Are any of these automatically generated or harvested from the file metadata?)
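
For readers who want to tell cases 1 and 2 apart programmatically, here is a minimal sketch. It assumes a source_id table shaped like a mapping from model name to a record carrying an `activity_participation` list (the attribute named earlier in this thread). The model name, the sample data, and the `registration_status` helper are hypothetical and for illustration only; they are not part of any CMIP6_CVs tooling.

```python
from typing import Mapping

# Hypothetical excerpt; the real CV content may differ in shape and values.
SAMPLE_SOURCE_IDS = {
    "EXAMPLE-MODEL": {"activity_participation": ["CMIP", "ScenarioMIP"]},
}


def registration_status(
    source_ids: Mapping[str, Mapping[str, list]],
    source_id: str,
    activity_id: str,
) -> str:
    """Classify a model/activity pair against a CV-like table."""
    entry = source_ids.get(source_id)
    if entry is None:
        # Case 1: the model itself is not registered.
        return "model not registered"
    if activity_id not in entry.get("activity_participation", []):
        # Case 2: the model is registered, but not for this activity.
        return "activity not registered"
    return "registered"


if __name__ == "__main__":
    for model, activity in [
        ("EXAMPLE-MODEL", "CMIP"),
        ("EXAMPLE-MODEL", "PMIP"),
        ("UNKNOWN-MODEL", "CMIP"),
    ]:
        print(model, activity, "->", registration_status(SAMPLE_SOURCE_IDS, model, activity))
```

Case 3 (missing creators, title, year, or license) concerns the citation database rather than the CVs, so it is deliberately not covered by this check.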

@taylor13 taylor13 reopened this Jan 13, 2021
@matthew-mizielinski
Collaborator

Hi @taylor13

My guess based on the WIP discussion would be that in cases 1 and 2 we'd end up with a blank entry when you click on "citation" (@sashakames -- could you confirm that this is what you'd expect?). If I had to guess for case 3, I would expect a few of the fields (DOI?), but no more. I don't believe that any of the information in the citation is in the files in this exact form (e.g. the Publication Year might come from the version datestamp and the License could be extracted from the license attribute).

@MartinaSt, do we have any remaining examples of published data without corresponding citation entries that we could have a look at?

From my recollection of Tuesday's discussion, I don't think anyone raised any wider implications of freezing the citations; there will likely be an increasing number of inconsistencies as modelling groups either introduce new models or extend the MIPs they are involved in, but other than missing citation information no mention was made of any further impact.

@matthew-mizielinski
Collaborator

matthew-mizielinski commented Jan 14, 2021

I think I can answer one of my questions above. I've been assisting some UK scientists (NERC) with the publication of PMIP data, and the citation information for these looks like this:
[image: screenshot of the citation information]

I don't think the PMIP NERC HadGEM3-GC31-LL combination has been registered with the citation service yet (I'll be chasing this tomorrow).

@MartinaSt, is it reasonable to expect data without citations set up to look similar after January 31st?

@sashakames

I recall examples where no information was shown, I suppose when the registration was completely missing. The information presented is resolved via a URL to the citation service that contains the full dataset DRS string, so it depends on what the citation database returns.

@taylor13
Collaborator

So in the NERC example above, there is no DOI assigned (but the URL takes you to some citation information), and there are no authors listed, but otherwise the ESGF display is complete. @MartinaSt Besides the authors, is there other information missing in the citation database in this case?

@MartinaSt

MartinaSt commented Jan 15, 2021

Sorry for the late reply, I am getting too many user requests.

Let me try to answer the questions starting at the top:

@taylor13

My question is: What would you get if

the model had not been registered in the CMIP6_CVs. (i.e., it wasn't registered by the January 31 deadline; before ESGF publication it would, of course, have to be subsequently registered, but would that propagate to the citation service?)

No. It does not make sense to propagate that, when the citation manager can no longer provide the author details for it.

the model was registered by the deadline but they had not indicated in the CMIP6_CVs that they were going to participate in the CMIP activity (which is responsible for the 1pctCO2 run).

CMIP is a special case, as I have always registered it for every model, so it is available in the citation database. Any other not-registered MIP is not available and will not be added (see answer to first question).

they had not provided the list of Creators, a Title, a Publication Year, and/or a License to the data services. (Are any of these automatically generated or harvested from the file metadata?)

In general, data without authors do not receive a DOI, so providing the creators/authors for any not-yet-published ESGF data is crucial; it is the only information that needs to be provided by 31 January 2021. The title will remain the default title, and the publication year is set by the citation service to the ESGF publication date at the time of DOI registration. The license for new contributions is spot-checked by me.

@taylor13 @matthew-mizielinski The 'show citation' link will show a default error message; in other words, broken links are caught.
By the way, I raised a similar issue with ES-DOC for the furtherInfoUrl page yesterday. We will follow up on it with Mark.

@durack1 There are few data published in the ESGF with missing author lists; coverage has been between 98 and 99% over roughly the last half year. I am attaching the list of entries without authors to my reminder emails to the citation managers. The first email went out on 13 August, the last one this Monday, and a final one is planned for 25 January 2021.
Example to look at (from 12:10 CET today): http://esgf-data.dkrz.de/search/cmip6-dkrz/?mip_era=CMIP6&activity_id=ScenarioMIP&institution_id=E3SM-Project&source_id=E3SM-1-1
(I just saw that @matthew-mizielinski has provided an example.)
This is the case where a link exists but no DOI has been registered. The case above, without registration and without an entry in the citation database, leads to a broken link, which is caught for the 'show citation' access (thanks @sashakames for mentioning this second case).
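
The three outcomes described in this thread (full citation with DOI, an entry without authors and therefore without DOI, and no entry at all) can be summarised in a small sketch. This is only an illustration of the described behaviour, not the citation service's actual code: the `CitationEntry` class, the `show_citation` function, the dotted lookup keys, and the "(default title)" placeholder are all hypothetical. The DOI, first creator, and title of the complete entry are copied from the ACCESS-CM2 example quoted earlier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CitationEntry:
    title: str
    creators: List[str] = field(default_factory=list)
    doi: Optional[str] = None  # per the thread, data without authors get no DOI


def show_citation(db: Dict[str, CitationEntry], key: str) -> str:
    """Return what a 'show citation' lookup might display for a dataset key."""
    entry = db.get(key)
    if entry is None:
        # No entry in the citation database: broken link, caught with a default error.
        return "error: no citation entry found"
    if not entry.creators or entry.doi is None:
        # Entry exists but no authors were provided, so no DOI was registered.
        return f"{entry.title} (no DOI registered)"
    return f"{'; '.join(entry.creators)}: {entry.title}. DOI: {entry.doi}"


# Hypothetical keys; the real service resolves the full dataset DRS string.
SAMPLE_DB = {
    "CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.1pctCO2": CitationEntry(
        title="CSIRO-ARCCSS ACCESS-CM2 model output prepared for CMIP6 CMIP 1pctCO2",
        creators=["Dix, Martin"],  # first listed creator only
        doi="10.22033/ESGF/CMIP6.4230",
    ),
    "CMIP6.ScenarioMIP.E3SM-Project.E3SM-1-1": CitationEntry(title="(default title)"),
}

if __name__ == "__main__":
    for key in list(SAMPLE_DB) + ["CMIP6.PMIP.unregistered"]:
        print(show_citation(SAMPLE_DB, key))
```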

I hope I have picked up all the questions. Sorry again for the late reply, but the citation managers have many questions as well...

@durack1
Member

durack1 commented Jan 15, 2021

@matthew-mizielinski thanks for the PR, merged.

@MartinaSt I have pinged the E3SM-1-1 folks and hopefully they can clean things up by the deadline. I'll make sure to get on top of the tweak requests before the 31st January 2021 cutoff - how about we leave this issue open until this time?

As far as I can see, there are no obvious additional steps required; have I missed something?

@MartinaSt

@durack1 Thanks for pinging the E3SM-1-1 folks. A direct request is sometimes more effective than a general one. No, you have not missed anything (that I am aware of).

Thanks for your support, Paul and @matthew-mizielinski!
