Case 1: New data in the database
Case 2: New data in Wikidata, only for static databases
...
4. Already manually reconciled CSV_X (has venue_wiki)
5. New dump of the database: CSV_Y (no venue_wiki but more rows)
How do we update CSV_X?
Replace every row that differs between CSV_X and CSV_Y, keeping the venue_wiki column
Add any new rows
Use OpenRefine?
How do we check whether new rows can now be reconciled with Wikidata (because there are new entries in Wikidata)?
In OpenRefine, we only try to reconcile the entries that are blank in the venue_wiki column.
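Outside OpenRefine, the same selection can be sketched in pandas. This is only an illustration: the helper name `split_unreconciled` and the sample rows (venue names and the `Q0` value) are made-up placeholders, not data from the database.

```python
import io
import pandas as pd

def split_unreconciled(df, wiki_col="venue_wiki"):
    """Split a frame into (already reconciled, still blank) on wiki_col."""
    # A cell counts as blank if it is missing or contains only whitespace.
    blank = df[wiki_col].isna() | (df[wiki_col].astype(str).str.strip() == "")
    return df[~blank], df[blank]

# Hypothetical sample: one reconciled venue, one still blank.
SAMPLE = """venue,venue_wiki
Venue A,Q0
Venue B,
"""

df = pd.read_csv(io.StringIO(SAMPLE), dtype=str, keep_default_na=False)
done, pending = split_unreconciled(df)
print(len(done), "reconciled;", len(pending), "still to reconcile")
```

Only `pending` would need to go through reconciliation again; `done` stays untouched and the two parts can be concatenated afterwards.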
Problem: since almost all important columns (such as IDs) are transformed into URLs, there is no way to compare the old and new CSVs without handling each column's details individually, which makes the process hard to automate.
In our previous discussion, the approach was to compare and merge the newly updated raw CSV with the old reconciled CSV, then reconcile only the updated part of the merged CSV and leave the old data untouched. However, since my reconciliation process modifies the raw CSV so heavily, pandas.concat() treats the same row from the raw CSV and the reconciled CSV as two different rows.
For example, in the reconciled CSV:
recording_id,artist,recording,recording_wiki,track,number,tune,tune_id
https://thesession.org/recordings/3720,1651,Cast A Bell,,1,1,Kettledrum,https://thesession.org/tunes/14408
in the raw CSV:
id,artist,recording,track,number,tune,tune_id
3720,1651,"Cast A Bell",1,1,Kettledrum,14408
and if we update the raw CSV:
id,artist,recording,track,number,tune,tune_id
3720,1651,"Cast A Bell",1,1,Kettledrum,14408
3720,1651,"Cast A Bell",2,1,"Maiden Lane",13727
then when we compare and merge, we get:
recording_id,artist,recording,recording_wiki,track,number,tune,tune_id,id
https://thesession.org/recordings/3720,1651,Cast A Bell,,1,1,Kettledrum,https://thesession.org/tunes/14408,
,1651,Cast A Bell,,1,1,Kettledrum,14408,3720
,1651,Cast A Bell,,2,1,Maiden Lane,13727,3720
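One possible way around this is to bring the new raw dump into the same shape as the reconciled CSV before comparing. The sketch below assumes the only transformations are the ones visible in the example above (renaming `id` to `recording_id` and prefixing the two ID columns with their thesession.org URLs); the function names `to_url_form` and `merge_update` are hypothetical. Rows that match on every raw column keep their existing `recording_wiki` value, while new or changed rows come out with a blank `recording_wiki`.

```python
import io
import pandas as pd

RECONCILED = """recording_id,artist,recording,recording_wiki,track,number,tune,tune_id
https://thesession.org/recordings/3720,1651,Cast A Bell,,1,1,Kettledrum,https://thesession.org/tunes/14408
"""

RAW_NEW = """id,artist,recording,track,number,tune,tune_id
3720,1651,"Cast A Bell",1,1,Kettledrum,14408
3720,1651,"Cast A Bell",2,1,"Maiden Lane",13727
"""

def read(text):
    # Read everything as strings so IDs and blanks compare exactly.
    return pd.read_csv(io.StringIO(text), dtype=str, keep_default_na=False)

def to_url_form(raw):
    """Apply the assumed ID-to-URL transformation to a raw dump."""
    out = raw.rename(columns={"id": "recording_id"}).copy()
    out["recording_id"] = "https://thesession.org/recordings/" + out["recording_id"]
    out["tune_id"] = "https://thesession.org/tunes/" + out["tune_id"]
    return out

def merge_update(reconciled, raw_new):
    """Carry over recording_wiki for unchanged rows; leave it blank otherwise."""
    new = to_url_form(raw_new)
    keys = list(new.columns)  # every raw column must match for a row to count as unchanged
    merged = new.merge(reconciled, on=keys, how="left")
    merged["recording_wiki"] = merged["recording_wiki"].fillna("")
    return merged[list(reconciled.columns)]

updated = merge_update(read(RECONCILED), read(RAW_NEW))
print(updated.to_csv(index=False))
```

With this shape, only the rows whose `recording_wiki` is blank need to be reconciled again. Note that a row whose raw values changed is treated as new here, so any manual reconciliation on it would have to be redone.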