add citation/references for datasets #3

Open
faroit opened this issue Jan 29, 2019 · 10 comments

Comments

@faroit
Contributor

faroit commented Jan 29, 2019

Wow, this is great work!

Since we have the problem over at SiSEC that folks constantly cite the wrong paper for one of our datasets, it occurred to me that we could help the community by making this clearer.

What about adding a citation field to each dataset, possibly a DOI or a CSL string? This could be a Zenodo or paper DOI, or just a website when nothing else applies. Tell me what you think. I can help out if you like the idea...

@ejhumphrey
Collaborator

ooh, missed this issue! I think that's a great idea!!! might be a heavy lift, how would you propose tackling it?

@ejhumphrey
Collaborator

by the way it's all thanks to Alex Lerch!! He's been doing the hard work with this for so long 😄

@faroit
Contributor Author

faroit commented Mar 8, 2019

> ooh, missed this issue! I think that's a great idea!!! might be a heavy lift, how would you propose tackling it?

What about just starting with a DOI field for now to keep things homogeneous? I can start adding a few through pull requests...

@faroit
Contributor Author

faroit commented Mar 8, 2019

> by the way it's all thanks to Alex Lerch!! He's been doing the hard work with this for so long 😄

Yes, I have known his table for quite some time.
@alexanderlerch do you actually parse this JSON on your site?

@alexanderlerch
Collaborator

@faroit @ejhumphrey
I am all for adding additional info to this list. Happy to take the lead on this, although I am not entirely sure what the best info would be to include. I would probably handle this with a request to the ISMIR list to send in paper links (or better DOIs, although the majority might not have DOIs).

BTW, for now I am maintaining only my original list at https://gist.github.com/alexanderlerch/e3516bffc08ea77b429c419051ab793a#file-data-sets-md because I don't have write access to the repository...

@faroit
Contributor Author

faroit commented Mar 8, 2019

> BTW, for now I am maintaining only my original list at https://gist.github.com/alexanderlerch/e3516bffc08ea77b429c419051ab793a#file-data-sets-md because I don't have write access to the repository...

@ejhumphrey maybe you can add Alexander to the repo?

@faroit
Contributor Author

faroit commented Mar 8, 2019

> I am all for adding additional info to this list. Happy to take the lead on this, although I am not entirely sure what the best info would be to include. I would probably handle this with a request to the ISMIR list to send in paper links (or better DOIs, although the majority might not have DOIs).

@alexanderlerch Great, let's fill in the fields later via PRs.
I think we should avoid putting the BibTeX string in as a field, since that makes things more complicated to parse properly. So maybe DOI and paper_url?
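
To make the DOI / paper_url idea concrete, here is a minimal sketch of how a consumer of the list might turn those two fields into one citation link. It assumes each dataset record is a plain mapping with optional `doi` and `paper_url` keys; the helper names, the loose DOI pattern, and the example records are all hypothetical placeholders, not anything from the repository.

```python
import re

# Loose DOI shape check (an assumption, not a full validator):
# "10.", a 4-9 digit registrant prefix, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return the https://doi.org resolver URL for a bare DOI string."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


def citation_link(entry):
    """Prefer the DOI, fall back to paper_url, return None if neither is set."""
    if entry.get("doi"):
        return doi_to_url(entry["doi"])
    return entry.get("paper_url") or None


# Hypothetical records: the names, DOI, and URL are placeholders.
with_doi = {"name": "example-dataset-a", "doi": "10.1234/example.5678"}
with_url = {"name": "example-dataset-b", "paper_url": "https://example.org/paper.pdf"}

print(citation_link(with_doi))
print(citation_link(with_url))
```

Storing the bare DOI rather than a full URL keeps the field homogeneous, and the resolver link can always be derived from it.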

@ejhumphrey
Collaborator

hiya @alexanderlerch @faroit,

a few things from me.

access: given your past contributions, you've both been granted write access on this repository. Generally speaking though, I hope that not having write permissions won't dissuade folks from contributing via PRs ... if this is a high barrier, that'll be good to know.

outputs: Ideally, this repository would have a "build" step, per #5, that automatically produces updated HTML tables (as JSON) and MD files when the YAML is changed (and verified, per #1). In this world, only the YAML would be modified directly, and everything else is automatic, and life is easy and great. 😀
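
A rough sketch of what such a build step could look like, assuming the YAML has already been loaded into a mapping of dataset name to record (for example with a YAML parser) and that each record may carry a `url` key. The function names, the two-column layout, and the example records are illustrative assumptions, not the repository's actual build.

```python
import json


def render_markdown(datasets):
    """Render a name -> record mapping as a Markdown table, sorted by name."""
    lines = ["| dataset | url |", "| --- | --- |"]
    for name in sorted(datasets):
        url = datasets[name].get("url", "")
        lines.append(f"| {name} | {url} |")
    return "\n".join(lines) + "\n"


def render_json(datasets):
    """Serialize the same mapping as stable, sorted JSON."""
    return json.dumps(datasets, indent=2, sort_keys=True) + "\n"


# Hypothetical records standing in for the parsed YAML.
datasets = {
    "example-dataset-b": {"url": "https://example.org/b"},
    "example-dataset-a": {},
}

print(render_markdown(datasets))
print(render_json(datasets))
```

Because both outputs are derived from the same mapping and sorted deterministically, regenerating them after a YAML change yields a clean diff.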

Canonical Ref / DOIs: Two thoughts here.
1 - all ISMIR papers have DOIs now, which can be found in this repository. Not all of these datasets come from ISMIR though, and I don't know how much coverage we'll get.
2 - ISMIR metadata projects, in the past, have gotten quite far by senior faculty & advisors assigning tasks like this to new students as a mechanism for familiarizing themselves with the literature. I think it would probably be more effective to reach out to specific faculty who can recruit students to do this. I've found open-ended asks to large groups of people to have limited effectiveness.

@alexanderlerch
Collaborator

Just to verify how many references we will get, I went through the datasets starting with A through D and looked up the references (only added DOI links or permalinks). I added these as links to an additional 'reference' entry - let me know if that is how we can do it/what you think: https://github.com/ismir/mir-datasets/blob/lerch_updates/mir-datasets.yaml
Obviously this is work in progress, I just wanted your feedback.

@ejhumphrey I noticed that there are no DOIs yet for the 2018 ISMIR papers - how should we handle that in the datasets list?

@ejhumphrey
Collaborator

cool, thanks for this! Looks good to me, I think ... @faroit, any thoughts here? Only thing that comes to mind is that the tests would need to be updated to reflect the schema change, as well as the readme.
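
As a sketch of what an updated test for the schema change might check, the snippet below treats `reference` as an optional field that, when present, must be an http(s) link (a DOI link or permalink). The function name, the optional-field rule, and the example records are assumptions for illustration; the repository's real tests may be organized differently.

```python
from urllib.parse import urlparse


def reference_errors(datasets):
    """Return human-readable problems found in 'reference' fields."""
    errors = []
    for name, record in sorted(datasets.items()):
        if "reference" not in record:
            continue  # the field is treated as optional in this sketch
        ref = record["reference"]
        parsed = urlparse(ref) if isinstance(ref, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append(f"{name}: 'reference' must be an http(s) URL, got {ref!r}")
    return errors


def test_reference_fields():
    # Hypothetical records; the DOI link is a placeholder.
    datasets = {
        "example-dataset-a": {"reference": "https://doi.org/10.1234/example.5678"},
        "example-dataset-b": {},
    }
    assert reference_errors(datasets) == []


test_reference_fields()
```

Returning a list of messages rather than failing on the first problem lets one test run report every malformed entry at once.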

The 2018 papers do have DOIs, but I had forgotten that the PR is still outstanding (from November 😞). I'll see if I can get that over the goal line in the next 30 minutes.
