Commit
Merge branch 'main' of github.com:pkiraly/qa-catalogue
pkiraly committed Mar 12, 2024
2 parents 53e34fa + 0c1f28b commit 74c0a51
Showing 165 changed files with 8,000 additions and 50,833 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/avram.yml
@@ -0,0 +1,25 @@
name: Validate Avram Schemas

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install -g avram@0.9.6 ajv ajv-formats # checking Avram 0.9.6
      - name: Validate Avram Schema files
        run: |
          avram -s src/main/resources/pica-schema.json
          avram -s src/main/resources/pica/avram-k10plus-title.json
          avram -s src/main/resources/unimarc/avram-unimarc.json
          avram -s src/test/resources/pica/schema/pica-schema-extra.json
          avram -s src/test/resources/pica/schema/pica-schema.json
          avram -s src/test/resources/unimarc/avram-unimarc.json
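
Note: the same checks could be run locally before pushing. The following is a minimal sketch and not part of the commit. `validate_schemas` is a hypothetical helper, and the validator command is passed in as an argument, so nothing about the `avram` CLI is assumed beyond the `-s <file>` usage shown in the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): run a validator command
# against each schema file and report the ones that fail.
# Usage: validate_schemas <validator-cmd> <schema-file>...
# e.g.:  validate_schemas avram src/main/resources/pica-schema.json
validate_schemas() {
  local validator=$1
  shift
  local failed=0 schema
  for schema in "$@"; do
    # The workflow invokes the validator as `avram -s <file>`.
    if ! "$validator" -s "$schema"; then
      echo "invalid: $schema" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every schema file passed.
  [ "$failed" -eq 0 ]
}
```

Unlike the workflow step, which stops at the first failing command, this loop checks every file and only then reports overall failure.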
2 changes: 1 addition & 1 deletion .github/workflows/maven.yml
@@ -20,7 +20,7 @@ jobs:
fetch-depth: 0 # Shallow clones should be disabled for a better relevancy of analysis

- name: Set up JDK 17
uses: actions/setup-java@v3
uses: actions/setup-java@v4
with:
java-version: '17'
distribution: 'adopt'
356 changes: 197 additions & 159 deletions README.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions catalogues/K10plus.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# K10plus-Verbundkatalog (MARC records)
# https://opac.k10plus.de

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/bayern.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Verbundkatalog B3Kat des Bibliotheksverbundes Bayern (BVB) und des Kooperativen Bibliotheksverbundes Berlin-Brandenburg (KOBV)
# https://www.bib-bvb.de/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/bl.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# British Library
# https://www.bl.uk/

. ./setdir.sh

12 changes: 12 additions & 0 deletions catalogues/bnf.sh
@@ -0,0 +1,12 @@
#!/usr/bin/env bash
# Bibliothèque nationale de France
# https://www.bnf.fr

. ./setdir.sh

NAME=bnf
MARC_DIR=${BASE_INPUT_DIR}/bnf
TYPE_PARAMS="--emptyLargeCollectors --schemaType UNIMARC"
MASK=P174_*.UTF8

. ./common-script
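
Note on `MASK`: Bash does not perform pathname expansion in a variable assignment, so `MASK=P174_*.UTF8` stores the pattern itself. It only matches files once the variable is expanded unquoted. The sketch below illustrates this; `count_matches` is a hypothetical helper, and how `common-script` actually consumes `MASK` is not shown in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): count the files in a directory
# that match a glob pattern held in a variable.
count_matches() {
  local dir=$1 mask=$2
  local files
  # $mask is left unquoted on purpose: the stored pattern is expanded here.
  files=( "$dir"/$mask )
  # Without nullglob an unmatched pattern stays literal, so test existence.
  if [ ! -e "${files[0]}" ]; then
    echo 0
    return
  fi
  echo "${#files[@]}"
}
```

For a directory holding `P174_01.UTF8` and `P174_02.UTF8`, `count_matches "$MARC_DIR" "$MASK"` would report two files.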
4 changes: 3 additions & 1 deletion catalogues/bnpl.sh
@@ -1,10 +1,12 @@
#!/usr/bin/env bash
# Biblioteka Narodowa (Polish National Library)
# https://bn.org.pl/

. ./setdir.sh

NAME=bnpl
# TYPE_PARAMS="--marcVersion GENT"
TYPE_PARAMS=" --emptyLargeCollectors"
TYPE_PARAMS=" --emptyLargeCollectors --indexWithTokenizedField"
MARC_DIR=${BASE_INPUT_DIR}/bnpl
MASK=bibs-all.marc.gz

9 changes: 8 additions & 1 deletion catalogues/bnpt.sh
@@ -1,10 +1,17 @@
#!/usr/bin/env bash
# Biblioteca Nacional de Portugal (BNP)
# web: https://www.bnportugal.gov.pt/
# data: https://opendata.bnportugal.gov.pt/downloads.htm

. ./setdir.sh

NAME=bnpt
# TYPE_PARAMS="--marcVersion GENT"
TYPE_PARAMS=" --marcxml --emptyLargeCollectors"
TYPE_PARAMS="--schemaType UNIMARC --marcxml --emptyLargeCollectors"
TYPE_PARAMS="$TYPE_PARAMS --solrForScoresUrl http://localhost:8983/solr/bnpt_validation"
TYPE_PARAMS="$TYPE_PARAMS --indexWithTokenizedField"
TYPE_PARAMS="$TYPE_PARAMS --indexFieldCounts"

MARC_DIR=${BASE_INPUT_DIR}/bnpt
MASK=bibliographics_*.xml
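
Note: the new `TYPE_PARAMS` lines build one option string by repeated appending. The sketch below reproduces the same assignments inside a function to show the resulting value. `build_type_params` and its `core` argument are hypothetical; parameterizing the `<name>_validation` core name is an assumption based on the bnpt and nls scripts in this commit.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the assignments from catalogues/bnpt.sh.
build_type_params() {
  local core=$1
  local params="--schemaType UNIMARC --marcxml --emptyLargeCollectors"
  # Each append adds one option, separated by a single space.
  params="$params --solrForScoresUrl http://localhost:8983/solr/${core}_validation"
  params="$params --indexWithTokenizedField"
  params="$params --indexFieldCounts"
  echo "$params"
}

build_type_params bnpt
# prints: --schemaType UNIMARC --marcxml --emptyLargeCollectors --solrForScoresUrl http://localhost:8983/solr/bnpt_validation --indexWithTokenizedField --indexFieldCounts
```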

2 changes: 1 addition & 1 deletion catalogues/bnr.sh
@@ -1,6 +1,6 @@
#!/usr/bin/env bash

# Biblioteca Nationala a Romaniei
# https://www.bibnat.ro/

. ./setdir.sh
NAME=bnr
2 changes: 2 additions & 0 deletions catalogues/cerl.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# The Heritage of the Printed Book Database
# https://www.cerl.org/resources/hpb/main/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/clb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Česká literární bibliografie
# https://clb.ucl.cas.cz/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/ddb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Deutsche Digitale Bibliothek
# https://www.deutsche-digitale-bibliothek.de/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/dnb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Deutsche Nationalbibliothek
# https://www.dnb.de/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/firenze.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Biblioteca Nazionale Centrale di Firenze
# https://www.bncf.firenze.sbn.it/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/gbv.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Verbundzentrale des Gemeinsamen Bibliotheksverbundes
# http://www.gbv.de/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/gent.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Universiteitsbibliotheek Gent
# https://lib.ugent.be/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/harvard.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Harvard Library
# https://library.harvard.edu/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/k10plus_pica.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# K10plus-Verbundkatalog (PICA records)
# https://opac.k10plus.de

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/k10plus_pica_grouped.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# K10plus-Verbundkatalog (PICA records with library holdings)
# https://opac.k10plus.de

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/kb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# KB (Koninklijke Bibliotheek van Nederland)
# https://www.kb.nl

. ./setdir.sh
NAME=kb
2 changes: 2 additions & 0 deletions catalogues/kbr.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# KBR (Koninklijke Bibliotheek van België/Bibliothèque royale de Belgique)
# https://www.kbr.be/

. ./setdir.sh

1 change: 1 addition & 0 deletions catalogues/knihoveda.sh
@@ -1,5 +1,6 @@
#!/usr/bin/env bash
# Knihoveda
# https://www.zb.uzh.ch/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/libris.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Libris, the Swedish national union catalogue
# https://libris.kb.se/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/lnb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Latvijas Nacionālā bibliotēka
# https://lnb.lv/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/loc.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Library of Congress
# https://catalog.loc.gov/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/mek.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Magyar Elektronikus Könyvtár
# https://mek.oszk.hu/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/mokka.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Magyar Országos Közös Katalógus
# http://mokka.hu/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/mtak.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# A Magyar Tudományos Akadémia Könyvtára
# https://mtak.hu/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/nfi.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Kansallis Kirjasto/National Biblioteket (The National Library of Finland, Fennica catalogue)
# https://www.kansalliskirjasto.fi/en

. ./setdir.sh

1 change: 1 addition & 0 deletions catalogues/nkp.sh
@@ -1,5 +1,6 @@
#!/usr/bin/env bash
# Národní knihovna České republiky - National Library of the Czech Republic
# https://nkp.cz/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/nli.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# National Library of Israel
# https://www.nli.org.il/en

. ./setdir.sh

3 changes: 3 additions & 0 deletions catalogues/nls.sh
@@ -1,6 +1,9 @@
#!/usr/bin/env bash
# The National Bibliography of Scotland (from the National Library of Scotland)
# https://www.nls.uk/

. ./setdir.sh

NAME=nls
MARC_DIR=${BASE_INPUT_DIR}/nls
TYPE_PARAMS="--marcxml --emptyLargeCollectors --indexWithTokenizedField --indexFieldCounts --solrForScoresUrl http://localhost:8983/solr/nls_validation"
2 changes: 2 additions & 0 deletions catalogues/onb.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Österreichische Nationalbibliothek (Austrian National Library)
# https://www.onb.ac.at/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/oszk.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Országos Széchényi Könyvtár (Hungarian National Library)
# https://www.oszk.hu/

. ./setdir.sh
NAME=oszk
2 changes: 2 additions & 0 deletions catalogues/szte.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# A Szegedi Tudományegyetem Klebelsberg Kuno Könyvtára
# http://www.ek.szte.hu/

. ./setdir.sh

2 changes: 2 additions & 0 deletions catalogues/uva.sh
@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Bibliotheek Universiteit van Amsterdam/Hogeschool van Amsterdam
# https://uba.uva.nl/home

. ./setdir.sh

4 changes: 3 additions & 1 deletion catalogues/yale.sh
@@ -1,9 +1,11 @@
#!/usr/bin/env bash
# Yale Library
# https://library.yale.edu/

. ./setdir.sh

NAME=yale
MARC_DIR=${BASE_INPUT_DIR}/yale/2023-11-05
MARC_DIR=${BASE_INPUT_DIR}/yale/2024-02-11
TYPE_PARAMS="--emptyLargeCollectors --indexWithTokenizedField --indexFieldCounts"
MASK=bib_*.mrc.gz

1 change: 1 addition & 0 deletions catalogues/zb.sh
@@ -1,5 +1,6 @@
#!/usr/bin/env bash
# Zentralbibliothek Zürich
# https://www.zb.uzh.ch/

. ./setdir.sh

2 changes: 1 addition & 1 deletion common-script
@@ -628,7 +628,7 @@ for task in ${tasks//,/ }; do
validate-sqlite) do_validate_sqlite ;;
prepare-solr) do_prepare_solr ;;
index) do_index ;;
postprocess_solr) do_postprocess_solr ;;
postprocess-solr) do_postprocess_solr ;;
completeness) do_completeness ; do_completeness_sqlite ;;
completeness-sqlite) do_completeness_sqlite ;;
classifications) do_classifications ;;
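
Note: this hunk renames the task label from `postprocess_solr` to `postprocess-solr`, matching the hyphenated style of the other labels. The sketch below shows how the comma-separated task list is split and dispatched. It is a reduced illustration, not the real script: `run_tasks` is a hypothetical wrapper, `do_postprocess_solr` is a stand-in, and the `*)` fallback branch is an assumption, since the diff does not show how unknown tasks are handled.

```shell
#!/usr/bin/env bash
# Stand-in for the real task function; it only reports that it was called.
do_postprocess_solr() { echo "postprocess-solr ran"; }

# Hypothetical wrapper around the dispatch loop from common-script.
run_tasks() {
  local tasks=$1 task
  # ${tasks//,/ } replaces every comma with a space, so the unquoted
  # expansion is split into one word per task.
  for task in ${tasks//,/ }; do
    case "$task" in
      postprocess-solr) do_postprocess_solr ;;
      # Assumed fallback, for illustration only.
      *) echo "unknown task: $task" ;;
    esac
  done
}

run_tasks "postprocess-solr,postprocess_solr"
# prints:
#   postprocess-solr ran
#   unknown task: postprocess_solr
```

After the rename, a task list that still uses the old underscore spelling no longer reaches `do_postprocess_solr`.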
6 changes: 4 additions & 2 deletions index
@@ -56,6 +56,7 @@ options:
-C, --indexWithTokenizedField index data elements as tokenized field as well
-D, --commitAt <arg> commit index after this number of records
-E, --indexFieldCounts index the count of field instances
-F, --fieldPrefix <arg> field prefix
-Z, --core <arg> The index name (core)
-Y, --file-path <arg> File path
-X, --file-mask <arg> File mask
@@ -73,8 +74,8 @@ if [ $# -eq 0 ]; then
show_usage
fi

SHORT_OPTIONS="m:hnl:o:i:d:qabpxyt:rz:v:f:s:g:1:2:u:j:w:k:c:e:3:4:S:AT:BCD:EZ:Y:X:WVU"
LONG_OPTIONS="marcVersion:,help,nolog,limit:,offset:,id:,defaultRecordType:,fixAlephseq,fixAlma,fixKbr,alephseq,marcxml,lineSeparated,outputDir:,trimId,ignorableFields:,ignorableRecords:,marcFormat:,dataSource:,defaultEncoding:,alephseqLineType:,picaIdField:,picaSubfieldSeparator:,picaSchemaFile:,schemaType:,picaRecordType:,allowableRecords:,groupBy:,groupListFile:,solrForScoresUrl:,solrUrl:,doCommit,solrFieldType:,useEmbedded,indexWithTokenizedField,commitAt:,indexFieldCounts,core:,file-path:,file-mask:,purge,status,no-delete"
SHORT_OPTIONS="m:hnl:o:i:d:qabpxyt:rz:v:f:s:g:1:2:u:j:w:k:c:e:3:4:S:AT:BCD:EF:Z:Y:X:WVU"
LONG_OPTIONS="marcVersion:,help,nolog,limit:,offset:,id:,defaultRecordType:,fixAlephseq,fixAlma,fixKbr,alephseq,marcxml,lineSeparated,outputDir:,trimId,ignorableFields:,ignorableRecords:,marcFormat:,dataSource:,defaultEncoding:,alephseqLineType:,picaIdField:,picaSubfieldSeparator:,picaSchemaFile:,schemaType:,picaRecordType:,allowableRecords:,groupBy:,groupListFile:,solrForScoresUrl:,solrUrl:,doCommit,solrFieldType:,useEmbedded,indexWithTokenizedField,commitAt:,indexFieldCounts,fieldPrefix:,core:,file-path:,file-mask:,purge,status,no-delete"

GETOPT=$(getopt \
-o ${SHORT_OPTIONS} \
@@ -131,6 +132,7 @@ while true ; do
-C|--indexWithTokenizedField) PARAMS="$PARAMS --indexWithTokenizedField" ; shift ;;
-D|--commitAt) PARAMS="$PARAMS --commitAt $2" ; shift 2 ;;
-E|--indexFieldCounts) PARAMS="$PARAMS --indexFieldCounts" ; shift ;;
-F|--fieldPrefix) PARAMS="$PARAMS --fieldPrefix $2" ; shift 2 ;;
-Z|--core) CORE="$2" ; shift 2 ;;
-Y|--file-path) FILE_PATH="$2" ; shift 2 ;;
-X|--file-mask) FILE_MASK="$2" ; shift 2 ;;