Feature suggestion: Add properties to wikidata items directly #9

Open
jneubert opened this issue Dec 22, 2016 · 8 comments

@jneubert
Collaborator

Perhaps the following feature is already planned; if so, just take this as confirmation that there is interest :)

Sometimes, authority properties for Wikidata can be derived from external sources. (E.g., the RePEc-ShortID property can be extracted from infoboxes in the English Wikipedia via DBpedia.)

I've not found a program/bot which can process a file to add these properties. The error-checking logic is very similar to wdmapper's, so it could make sense to extend the tool for this purpose, too.

Example input (in beacon format):

$ head -20 /opt/repec_ras/var/ras/latest/beacon/dbpedia_repec_wd.txt
#DESCRIPTION: RePEc-ShortID properties for wikidata from en.wikipedia via dbpedia
#CREATOR: ZBW - Leibniz Information Centre for Economics
#CONTACT: [email protected]
#HOMEPAGE: http://zbw.eu
#TIMESTAMP: 2016-12-22
#PREFIX: http://www.wikidata.org/entity/
#ANNOTATION: RePEC author name
#TARGET: https://authors.repec.org/pro/
#WDTARGETPROPERTY: P2428

Q353915|David D. Friedman|pfr16
Q312561|James Heckman|phe22
Q192592|Kenneth Arrow|par7
Q132489|Amartya Sen|pse23
Q107264|Robert Lucas|plu15
Q295647|Myron Scholes|psc29
Q219721|Robert Mundell|pmu18
Q157268|Robert Solow|pso18
Q295717|Vernon L. Smith|psm12
Q222541|George Akerlof|pak7
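
To make the structure of this input concrete, here is a minimal Python sketch of how such a file could be read. It is not wdmapper's parser; the function name `parse_beacon` and the returned structure are assumptions made for illustration. It assumes `#KEY: value` header lines and three-field `source|annotation|target` link lines, as in the example above, and expands the ids with the PREFIX and TARGET stubs.

```python
def parse_beacon(text):
    """Split BEACON-style text into a header dict and a list of link tuples.

    Hypothetical helper for illustration only, not part of wdmapper.
    """
    meta, links = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: split on the first colon only, so URLs stay intact.
            key, _, value = line[1:].partition(":")
            meta[key.strip().upper()] = value.strip()
            continue
        parts = line.split("|")
        if len(parts) != 3:
            continue  # this sketch only handles the three-field form
        source, annotation, target = (p.strip() for p in parts)
        links.append((
            meta.get("PREFIX", "") + source,
            annotation,
            meta.get("TARGET", "") + target,
        ))
    return meta, links


sample = """#PREFIX: http://www.wikidata.org/entity/
#TARGET: https://authors.repec.org/pro/
#WDTARGETPROPERTY: P2428

Q353915|David D. Friedman|pfr16
Q312561|James Heckman|phe22
"""

meta, links = parse_beacon(sample)
print(meta["WDTARGETPROPERTY"])
print(links[0])
```

With an explicit WDTARGETPROPERTY header, a consumer can read the property id from `meta` directly instead of inferring it from the TARGET stub.
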
@nichtich
Member

This is implemented in the dev branch for the get command: try wdmapper get P2428. The check command only works with two properties for now, and the add command (based on check) has not been started yet.

By the way: what would you call the two kinds of mappings (with two properties, or with a Wikidata item id and one property)? One-way vs. two-way mappings?

@jneubert
Collaborator Author

In my thinking about the tool, I would define its main purpose as adding authority properties (which normally means that these properties are of datatype external-id and should represent a 1:1 relationship).

The item to which the property should be added can be defined in two ways:

  • directly, by supplying an item identifier
  • indirectly, by supplying the value of a property which also represents a 1:1 relationship and therefore identifies an item unambiguously
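
The two ways of identifying an item could be sketched roughly as follows. This is only an illustration of the idea, not how wdmapper works: `resolve_item` and the `index` dict (property value to list of item ids) are hypothetical, and the values `id-1` and `id-2` are placeholders rather than real identifiers.

```python
def resolve_item(identifier, index=None):
    """Return the item id for a direct or an indirect identifier.

    index is a hypothetical dict mapping a property value to the list of
    item ids that carry it. With index=None the identifier is taken to be
    the item id itself (direct case).
    """
    if index is None:
        return identifier  # direct: the item identifier was supplied
    items = index.get(identifier, [])
    if len(items) != 1:
        # The indirect case only works if the value identifies exactly one item.
        raise LookupError(f"{identifier!r} matches {len(items)} items, expected 1")
    return items[0]


# Placeholder index: "id-2" deliberately violates the 1:1 assumption.
index = {"id-1": ["Q353915"], "id-2": ["Q312561", "Q192592"]}

print(resolve_item("Q353915"))      # direct
print(resolve_item("id-1", index))  # indirect, unambiguous
```

A value that maps to zero or several items raises `LookupError`, which is the kind of case the error checking would have to report.
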

I would prefer this terminology over one-way vs. two-way, because the latter could be confused with reversing the input for an indirect mapping (as outlined in #13).

@jneubert reopened this Dec 28, 2016
@jneubert
Collaborator Author

wdmapper get P2428 (and wdmapper get P2428 P227) work nicely!

@nichtich
Member

This should also work from the dev branch:

$ wdmapper.py get P2428 P227 -t csv > mappings.csv
# modify mappings.csv
$ wdmapper.py diff P2428 P227 -i mappings.csv
$ wdmapper.py check P2428 P227 -i mappings.csv

@nichtich
Member

With the terminology you suggested, the tool can be used

to manage direct mappings:

wdmapper $command $property

to manage indirect mappings:

wdmapper $command $source $target

The names "source" and "target" come from BEACON.

@jneubert
Collaborator Author

Makes sense to me, too!

@jneubert
Collaborator Author

I'd suggest that the BEACON output of the get command include not only the source/target URI stubs, but also explicit WDSOURCEPROPERTY/WDTARGETPROPERTY header statements. (They could be derived somehow from the URI stub, but subtle differences, e.g. an implied {id}, trailing slashes and the like, could make that brittle.)
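
A small Python sketch of why matching on URI stubs is fragile. The two stub spellings and the `normalize_stub` helper are assumptions made for illustration; they are not taken from wdmapper or from any particular property definition.

```python
def normalize_stub(stub):
    """Crude normalization for comparison only: drop an explicit {id}
    placeholder and any trailing slashes. Hypothetical helper."""
    if stub.endswith("{id}"):
        stub = stub[: -len("{id}")]
    return stub.rstrip("/")


# Two spellings that plausibly denote the same target.
a = "https://authors.repec.org/pro/"
b = "https://authors.repec.org/pro/{id}"

print(a == b)                                  # False
print(normalize_stub(a) == normalize_stub(b))  # True
```

Every such rule (placeholder, slashes, http vs. https) is one more guess; an explicit property header makes the guessing unnecessary.
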

@nichtich
Member

Metadata fields SOURCEPROPERTY and TARGETPROPERTY are used internally in 0.0.9, but with the raw property id such as "P227". I'd change this to the full property URI such as "http://www.wikidata.org/entity/P227" and make them public in the next release.
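
The conversion between the two forms is mechanical. A minimal sketch, assuming the entity URI prefix shown above; the helper names `property_uri` and `property_id` are hypothetical and not part of wdmapper:

```python
import re

ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def property_uri(pid):
    """Turn a raw property id such as "P227" into the full property URI."""
    if not re.fullmatch(r"P[1-9][0-9]*", pid):
        raise ValueError(f"not a property id: {pid!r}")
    return ENTITY_PREFIX + pid


def property_id(uri):
    """Turn a full property URI back into the raw property id."""
    if not uri.startswith(ENTITY_PREFIX):
        raise ValueError(f"not an entity URI: {uri!r}")
    pid = uri[len(ENTITY_PREFIX):]
    property_uri(pid)  # reuse the id validation; raises on e.g. "Q42"
    return pid


print(property_uri("P227"))
print(property_id("http://www.wikidata.org/entity/P227"))
```
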

@nichtich added this to the 0.1.0 milestone Apr 13, 2017