Commit ce4f11e

Update README.md
1 parent 240d9fb commit ce4f11e

1 file changed: +12 -10 lines changed

Data model/README.md

@@ -1,19 +1,21 @@
 ## GallicaPix data model ##
 
 The model is document driven:
-- a GallicaPix database is composed of documents
-- a document is a list of pages
-- a pages may include illustrations
-- an illustration is characterised by several properties (size, caption, technique, function, etc.) and may include visual contents or textual contents
-- a visual content is related to an object, a concept or a color present in the illustration
-- a textual content describes the texts present in the illustration or arranged around the illustration and related to it
+- a GallicaPix database is composed of documents, described through their bibliographical metadata,
+- a document is an ordered list of pages,
+- a page may include illustrations,
+- an illustration is characterised by several descriptors of different types (technical: size, position, color mode...; iconographic: technique, function, genre...; semantic: caption, subject, theme...). An illustration may include visual contents or textual contents,
+- a visual content is related to an object, a concept or a color present in the illustration,
+- a textual content describes the texts present in the illustration or arranged around the illustration and linked to it,
 - some of these elements may have a geometric positioning in relation to the page or the illustration.
 
-Bibliographical metadata are extracted from the Gallica OAI are stored at the document level. The generally are Dublin Core like metadata.
+Bibliographical metadata are extracted from the Gallica OAI-PMH repository and stored at the document level. They are generally Dublin Core-like metadata.
 
 Illustration-related metadata are either surfaced by the BnF catalog, inferred from other metadata or inferred with trained ML models. They are stored at the illustration level:
-- technique used to produce the illustration (Intermarc: zone 285)
-- function of the illustration (Intermarc: zone 646)
-- genre of the illustration (Intermarc: zone 641)
+- technique used to produce the illustration (in the BnF catalog, Intermarc zone #285)
+- function of the illustration (#646)
+- genre of the illustration (#641)
 
 [Intermarc reference](https://www.bnf.fr/fr/referentiels-intermarc)
+
+The inferred metadata are also characterised by their source (human production, models or tools) and their confidence score.
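
The hierarchy described in this README (documents, pages, illustrations, visual and textual contents, with a source and confidence score on inferred values) could be sketched roughly as follows. This is only an illustrative sketch, not the actual GallicaPix schema: every class name, field name and sample value below is hypothetical, and the confidence score is assumed to lie between 0 and 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Box:
    """Geometric position relative to the page or the illustration (hypothetical fields)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class Inferred:
    """A metadata value together with its provenance."""
    value: str
    source: str        # assumed vocabulary: "human", "model" or "tool"
    confidence: float  # assumed to lie in [0, 1]


@dataclass
class VisualContent:
    label: Inferred            # an object, a concept or a color
    box: Optional[Box] = None


@dataclass
class TextualContent:
    text: str                  # text inside or around the illustration
    box: Optional[Box] = None


@dataclass
class Illustration:
    technique: Optional[Inferred] = None  # Intermarc zone 285
    function: Optional[Inferred] = None   # Intermarc zone 646
    genre: Optional[Inferred] = None      # Intermarc zone 641
    caption: Optional[str] = None
    box: Optional[Box] = None
    visual_contents: List[VisualContent] = field(default_factory=list)
    textual_contents: List[TextualContent] = field(default_factory=list)


@dataclass
class Page:
    number: int
    illustrations: List[Illustration] = field(default_factory=list)


@dataclass
class Document:
    metadata: Dict[str, str]                        # Dublin Core-like key/value pairs
    pages: List[Page] = field(default_factory=list)  # ordered


def confident_labels(doc: Document, threshold: float) -> List[str]:
    """Collect visual-content labels whose confidence reaches the threshold."""
    return [
        vc.label.value
        for page in doc.pages
        for ill in page.illustrations
        for vc in ill.visual_contents
        if vc.label.confidence >= threshold
    ]


# Made-up sample data, for illustration only.
sample = Document(
    metadata={"title": "Sample document"},
    pages=[Page(number=1, illustrations=[Illustration(
        genre=Inferred("map", "model", 0.8),
        visual_contents=[
            VisualContent(Inferred("horse", "model", 0.9)),
            VisualContent(Inferred("boat", "model", 0.3)),
        ],
    )])],
)
print(confident_labels(sample, 0.5))
```

Keeping the source and confidence next to each inferred value, rather than on the illustration as a whole, would let a consumer filter catalog-derived and model-derived descriptors independently.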
