Look into why we cannot upload the RDF of "The Session" database. #2

Open
candlecao opened this issue Aug 5, 2024 · 6 comments

@candlecao

image
@candlecao candlecao self-assigned this Aug 5, 2024
@candlecao candlecao added the priority: high high priority label Aug 5, 2024
@candlecao

Hi Yueqiao, the likely cause is that some properties have both IRI objects and literal values (such as strings):

<https://thesession.org/members/1/sets/12750> a ns3:Q36161 ;
    ns1:P2308 <https://www.wikidata.org/wiki/Q1079270>,
        "jig" ;
    ns1:P2561 "Johnny Boyle's",
        "King Of The Pipers" ;
    ns1:P3440 <https://www.wikidata.org/wiki/Q50353378>,
        "6/8" ;
    ns1:P554 "Jeremy" ;
    ns1:P826 <https://www.wikidata.org/wiki/Q5728362>,
        "A dorian",
        "G major" ;
    ns2:setting_id "51",
        "54" ;
    ns2:tunes <https://thesession.org/tunes/51>,
        <https://thesession.org/tunes/54> .
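
To see how widespread this pattern is, a minimal sketch along the following lines could list every property whose values mix IRIs and literals. It assumes the triples are already available as `(subject, predicate, object)` strings with IRIs wrapped in angle brackets; the helper names are hypothetical and the sample rows are taken from the snippet above.

```python
from collections import defaultdict


def is_iri(term: str) -> bool:
    # Assumption: IRIs are written in angle brackets, as in N-Triples.
    return term.startswith("<") and term.endswith(">")


def predicates_with_mixed_objects(triples):
    """Return the predicates that have both IRI and literal objects."""
    kinds = defaultdict(set)
    for _subject, predicate, obj in triples:
        kinds[predicate].add("iri" if is_iri(obj) else "literal")
    return sorted(p for p, seen in kinds.items() if len(seen) == 2)


# Sample rows copied from the set shown above.
subject = "<https://thesession.org/members/1/sets/12750>"
triples = [
    (subject, "ns1:P2308", "<https://www.wikidata.org/wiki/Q1079270>"),
    (subject, "ns1:P2308", '"jig"'),
    (subject, "ns1:P554", '"Jeremy"'),
]

mixed = predicates_with_mixed_objects(triples)
print(mixed)
```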

@candlecao

Regrettably, I tested this and found that it is not the cause of the problem.

@candlecao

candlecao commented Aug 6, 2024

However, even though the error message shown above appeared, the graph http://sample/thesession/reconciled was still partially uploaded. I ran the query select distinct ?p where {?s ?p ?o} and got the following result:

p
--
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.wikidata.org/prop/direct/P136
http://www.wikidata.org/prop/direct/P2561
http://www.wikidata.org/prop/direct/P826
http://www.wikidata.org/prop/direct/P1114
http://www.wikidata.org/prop/direct/P1625
http://www.wikidata.org/prop/direct/P51
http://www.wikidata.org/prop/direct/P5489
http://www.wikidata.org/prop/direct/P658
http://www.wikidata.org/prop/direct/P131
http://www.wikidata.org/prop/direct/P17
http://www.wikidata.org/prop/direct/P2308
http://www.wikidata.org/prop/direct/P276
http://www.wikidata.org/prop/direct/P3440
http://www.wikidata.org/prop/direct/P554
http://www.wikidata.org/prop/direct/P580
http://www.wikidata.org/prop/direct/P582
http://www.wikidata.org/prop/direct/P6260
http://www.wikidata.org/prop/direct/P6261
http://www.wikidata.org/prop/direct/P6375
http://www.wikidata.org/prop/direct/P793
https://thesession.org/setting_id
https://thesession.org/tunes
http://www.wikidata.org/entity/Q66382988
http://www.wikidata.org/prop/direct/P3030
http://www.wikidata.org/prop/direct/P585
http://www.wikidata.org/prop/direct/P742

From this list we can check which properties are missing.
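
One way to do that check is a plain set difference between the properties expected from the source file and the properties returned by the query. This is only a sketch: it assumes the expected list has been collected from the Turtle file by some other means, and the `http://example.org/...` IRI is a made-up placeholder, not a property from the real data.

```python
def missing_predicates(expected, loaded):
    """Return the predicates that are expected but absent from the loaded graph."""
    return sorted(set(expected) - set(loaded))


# A few predicates copied from the query result above.
loaded = [
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://www.wikidata.org/prop/direct/P2561",
    "https://thesession.org/tunes",
]

# Hypothetical expected list; the example.org entry stands in for
# whatever property the source file has but the graph lacks.
expected = loaded + ["http://example.org/only-in-file"]

missing = missing_predicates(expected, loaded)
print(missing)
```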

@candlecao

I tried to open the regenerated Turtle file today, and got this prompt:
image
I chose Ignore this time.

@candlecao

candlecao commented Aug 8, 2024

One way to identify the error is to convert the RDF for each entity separately and then import the resulting RDF files into Virtuoso one by one to pinpoint where the error occurs.
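
A rough sketch of the splitting step, assuming the data has first been serialized as N-Triples (one triple per line), which is not the Turtle format used in this issue. The function name and the `http://example.org/...` sample triples are hypothetical. Each group could then be written to its own file and imported separately.

```python
from collections import defaultdict


def group_by_subject(ntriples_lines):
    """Group N-Triples lines by subject so each entity can be loaded alone."""
    groups = defaultdict(list)
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # In N-Triples the subject is the first whitespace-delimited token.
        subject = line.split(None, 1)[0]
        groups[subject].append(line)
    return dict(groups)


# Illustrative triples only; not taken from the real dataset.
sample = [
    '<http://example.org/tune/1> <http://example.org/title> "First" .',
    '<http://example.org/tune/2> <http://example.org/title> "Second" .',
    '<http://example.org/tune/1> <http://example.org/key> "G major" .',
]

groups = group_by_subject(sample)
for subject, lines in groups.items():
    print(subject, len(lines))
```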

@candlecao

Aug 8 2024:
We found the error: something is wrong with the http://www.wikidata.org/prop/direct/P3030 property of the tune entities.

To work around the problem, we can write a *.sql script containing triples of the form <any tune> <http://www.wikidata.org/prop/direct/P3030> """value of abc""", for example:

SPARQL
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
INSERT into GRAPH <http://sample/thesession/reconciled/oneByOne>
{
  <https://thesession.org/tunes/11085> wdt:P3030 "literal content".
};

Then put the *.sql file in the bin folder of Virtuoso and execute it from the terminal using load *.sql;.

However, the amount of content in a single script is still limited, so we can divide the data into several parts and upload them one by one.
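
The script generation and splitting could be sketched as follows. This is a hedged example, not the script we actually used: `escape_literal`, `build_scripts`, and the batch size are hypothetical, the `http://example.org/...` rows are invented test values, and the literals are written as single-line quoted strings with SPARQL escape sequences rather than triple-quoted strings.

```python
def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )


def build_scripts(rows, graph, batch_size):
    """Build one SPARQL INSERT script per batch of (tune IRI, literal) rows."""
    scripts = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        triples = "\n".join(
            f'  <{tune}> wdt:P3030 "{escape_literal(value)}" .'
            for tune, value in batch
        )
        scripts.append(
            "SPARQL\n"
            "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
            f"INSERT into GRAPH <{graph}>\n"
            "{\n" + triples + "\n};\n"
        )
    return scripts


rows = [
    ("https://thesession.org/tunes/11085", "literal content"),
    ("http://example.org/tune/a", 'line one\nsay "hi"'),  # invented value
    ("http://example.org/tune/b", "back\\slash"),  # invented value
]
scripts = build_scripts(rows, "http://sample/thesession/reconciled/oneByOne", 2)
for number, script in enumerate(scripts, start=1):
    print(f"-- part {number}")
    print(script)
```

Each returned string can be saved as a separate *.sql file, so a failure in one part does not block the others.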

Presumably we can ultimately upload all the data with this method.

@candlecao candlecao added medium and removed priority: high high priority labels Aug 19, 2024