Include information about DDC rules that led to a composed number #22

Open
5 tasks
nichtich opened this issue Mar 17, 2021 · 5 comments
Labels
feature Additional functionality
Milestone

Comments

@nichtich
Member

nichtich commented Mar 17, 2021

Requires

  • a list of rules in machine-readable format (JSON and/or Markdown)
  • a vue component to display a rule
  • a place to display all rules at the web interface
  • reference to rules in the analyze result in JSKOS
  • an icon or indicator to display rules as part of the analyze result
@nichtich nichtich added feature Additional functionality question Further discussion needed labels Mar 17, 2021
@nichtich nichtich removed the question Further discussion needed label Jul 19, 2021
@nichtich nichtich changed the title Include information about DDC rules that let to a composed number Include information about DDC rules that led to a composed number Jul 19, 2021
@nichtich
Member Author

Each rule has a short text (derived from MARC). There should be more details (name, description, examples...) but we can start with this short string for each rule:

{
  "r1": { "short": "Unless it is redundant, add to base number<|[Aa]dd to base number" },
  "r2": { "short": "the numbers following" },
  ...
}

The list of rules can be served directly as a static JSON file, e.g. at /rules.json. The "place to display all rules at the web interface" is secondary; maybe postpone it until we have more understandable information about each rule.

Encoding of rules in JSKOS: I'd not make rules part of standard JSKOS (unless we have experience with other faceted classifications that make use of their own rules). So each member of memberList gets an optional field RULE with an array of strings that each identify a rule, e.g. "RULE": ["p9"] or "RULE":["p20","p5"]. The prefix p is used because we already use it and because purely numerical identifiers have the disadvantage that you cannot add rules in between if needed (e.g. p20a). We can later switch to URIs as identifiers but that should be coordinated with OCLC/Pansoft.

@stefandesu
Member

stefandesu commented Jul 19, 2021

Just to clarify:

  • Rule string p20_2_7 would be turned into "RULE":["p20","p2","p7"]?
  • How to deal with something like p9->781.2-781.8?
  • Do those p-numbers correspond directly to the r-numbers of the rules? If yes, can we just use r instead so that we don't need to substitute p for r when looking up the rules?

Edit regarding the last point:

p1_2_3_17 is a rule pattern (hence "p"): the rules (or rule parts) r1 -> r2 -> r3 -> r17 follow one after another

So "p" stands for pattern and "r" for rule. But I think if we split up the pattern, we can then use the r prefix.
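To make the splitting concrete, here is a minimal sketch in Python. It assumes that a pattern identifier is always `p` followed by rule numbers joined with underscores, and that an optional parameter follows after `->`, as in the two strings quoted above. The function names `split_pattern` and `parse_rule_string` are hypothetical and not part of the project.

```python
import re


def split_pattern(pattern: str) -> list[str]:
    """Split a pattern identifier such as "p20_2_7" into rule identifiers."""
    # Assumption: only digits separated by underscores follow the "p" prefix.
    match = re.fullmatch(r"p(\d+(?:_\d+)*)", pattern)
    if match is None:
        raise ValueError(f"not a pattern identifier: {pattern!r}")
    return ["r" + number for number in match.group(1).split("_")]


def parse_rule_string(value: str) -> dict:
    """Separate the pattern from an optional parameter given after "->"."""
    pattern, _, parameter = value.partition("->")
    return {"rules": split_pattern(pattern), "parameter": parameter or None}


print(split_pattern("p20_2_7"))
print(parse_rule_string("p9->781.2-781.8"))
```

Identifiers with a letter suffix such as p20a would be rejected by this sketch; the regular expression would need to be relaxed if such identifiers are introduced.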

@nichtich
Member Author

To handle additional parameters of rule application, such as p9->781.2-781.8, we need to use another format, e.g. there is this rule pattern:

"p9": { "short": "Add as instructed under" }

The full "rule" (sorry, we are using muddy terminology, this needs to be clarified) could for instance be:

Add as instructed under 781.2-781.8

So better keep prefix p and use a more complex format for key RULE, e.g.

"RULE":[{"pattern":"p20"},{"pattern":"p2"},{"pattern":"p7"}]
"RULE":[{"pattern":"p9","parameter": "781.2-781.8"}]

I am sure there are also rule patterns with multiple parameters. If we know which rules have parameters, better store rule short text like this:

"p9": { "short": "Add as instructed under %s" }

@stefandesu
Member

As far as I understand it, pattern p20_2_7 does not consist of patterns p20, p2, and p7 (none of those exist), but rather of rules r20, r2, and r7 applied in sequence. So technically we should use the r prefix which would also make it easier to match them to the rule definitions, as stated above.

Also, as far as I can see, patterns can have parameters but don't have to (and I'm not sure whether every pattern can have them). It seems like p9 can also be used without a parameter, so substitutions like the one you are suggesting might not work as well.
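One way to cope with optional parameters, sketched here in Python, is to keep the short text free of placeholders and append the parameter only when an entry has one. The `RULES` table reuses the p9 short text quoted earlier in the thread; the `render` function is a hypothetical helper, and the assumption that a parameter always follows the short text will not hold for patterns with several parameters.

```python
# Short text as quoted in the thread; the table layout is an assumption.
RULES = {"p9": {"short": "Add as instructed under"}}


def render(entry: dict, rules: dict = RULES) -> str:
    """Render one RULE entry, with or without a parameter."""
    short = rules[entry["pattern"]]["short"]
    parameter = entry.get("parameter")
    # Append the parameter only if present, so the same short text
    # works for both uses of the pattern.
    return f"{short} {parameter}" if parameter else short


print(render({"pattern": "p9", "parameter": "781.2-781.8"}))
print(render({"pattern": "p9"}))
```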

nichtich added a commit that referenced this issue Dec 22, 2021
@nichtich
Member Author

Initial information about rules is included in the rules branch. Each rule has a regular expression pattern to match subfield values in MARC21 classification 6XX and/or 7XX fields.

An example: p20_2_7 (r20, r2, r7) is used in DDC class 789.57 Hybrid styles with MARC 761 field (excluding the examples to simplify the data):

761  0 $i Add to $b 789.57 $i the numbers following $r 781.6 $i in $d 781.62 $c 781.69

Or with reference to the list of rule elements (highly normalized):

r20, 789.57, r2, 781.6, r7, 781.62-781.69.

Or as textual building instruction:

Add to 789.57 the numbers following 781.6 in 781.62-781.69

The latter is most likely what to show in the user interface but we are not allowed to do so for license reasons (not even in German). I am not sure whether this form would be ok:

Add to ... the numbers following ... in ...
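As a rough illustration of how the 761 field above could be turned into the normalized list of rule elements, here is a Python sketch. The regular expressions for r20 and r7 are guesses read off this single example (only the r2 text appears in the rule list earlier in the thread), and treating `$d` followed by `$c` as one span is likewise an assumption based on the example, not on the MARC specification. The names `RULE_PATTERNS` and `to_elements` are hypothetical.

```python
import re

# Hypothetical rule table: rule number -> regex for the $i subfield value.
RULE_PATTERNS = {
    20: re.compile(r"Add to"),
    2: re.compile(r"the numbers following"),
    7: re.compile(r"in"),
}

FIELD = "761  0 $i Add to $b 789.57 $i the numbers following $r 781.6 $i in $d 781.62 $c 781.69"


def to_elements(field: str) -> list:
    """Turn a 761 field string into a list of rule numbers and notations."""
    elements = []
    previous = None
    for code, raw in re.findall(r"\$(\w)([^$]*)", field):
        value = raw.strip()
        if code == "i":
            # Explanatory text: look up which rule it matches completely.
            number = next(
                (n for n, rx in RULE_PATTERNS.items() if rx.fullmatch(value)), None
            )
            if number is None:
                raise ValueError(f"no rule matches {value!r}")
            elements.append(number)
        elif code == "c" and previous == "d":
            # Assumption: $c closes the span opened by the preceding $d.
            elements[-1] = f"{elements[-1]}-{value}"
        else:
            elements.append(value)
        previous = code
    return elements


print(to_elements(FIELD))
```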

The textual instruction could be passed as plain string in JSKOS. We might also include the list of rule elements in RULE as well, e.g. with numbers for rule patterns and strings for notations:

{
  "RULE": {
    "elements": [ 20, "789.57", 2, "781.6", 7, "781.62-781.69" ],
    "pattern": "Add to ... the numbers following ... in ...",
    "instruction":  "Add to 789.57 the numbers following 781.6 in 781.62-781.69"
  }
}
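The pattern and instruction strings do not need to be stored separately, since both can be derived from the element list. The Python sketch below shows one way to do that; the `SHORT` table is an assumption (the r20 and r7 texts are inferred from the example, only r2 is given in the rule list above), and `build_rule` is a hypothetical helper.

```python
# Hypothetical short texts per rule number.
SHORT = {20: "Add to", 2: "the numbers following", 7: "in"}


def build_rule(elements: list) -> dict:
    """Derive the pattern and instruction strings from rule elements."""
    pattern_parts = []
    instruction_parts = []
    for element in elements:
        if isinstance(element, int):
            # Numbers refer to rules: their text appears in both strings.
            pattern_parts.append(SHORT[element])
            instruction_parts.append(SHORT[element])
        else:
            # Strings are notations: elided in the pattern, kept in the instruction.
            pattern_parts.append("...")
            instruction_parts.append(element)
    return {
        "elements": list(elements),
        "pattern": " ".join(pattern_parts),
        "instruction": " ".join(instruction_parts),
    }


rule = build_rule([20, "789.57", 2, "781.6", 7, "781.62-781.69"])
print(rule["pattern"])
print(rule["instruction"])
```

If the full instruction may not be shown for license reasons, the "instruction" key could simply be left out and only "elements" and "pattern" delivered.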

@nichtich nichtich added this to the 0.4.0 milestone Jan 21, 2022
2 participants