(stoa0022a) by AlisonBabeu · Pull Request #387 · OpenGreekAndLatin/csel-dev

AlisonBabeu · 2022-04-12T17:31:44Z

Converted stoa0040a (Pseudo-Augustine) to stoa0022a (Ambrosiaster), per the long-standing issue #305.
I had created catalog_data for this edition at the time but had no way to change the data.
Renumbered the file and combined the two files, since this is only one work.

lcerrato · 2022-04-12T19:15:20Z

@AlisonBabeu
It looks like you added books to stoa0022a.stoa001.opp-lat1.xml, is that correct? There are some oddly named divs in this file.
This isn't going to work, as you've added another level (book) without editing the refsDecl or making the old file book 1.
It's not just the naming of the sections that is causing the problems.

The comments are also unclear. I'm not sure what files were added or when.
This should probably be a detailed note in the header or in the change log rather than comments in the body.

<!-- End of Work 1-previous TEI-XML files -->
  <!-- Beginnin of SEcond XML file -->

AlisonBabeu · 2022-04-12T19:34:25Z

hi @lcerrato those comments were just for myself: I was combining two XML files and had planned to do further updating.
I hadn't asked for your review yet, as I always do, because I wanted to see if it worked and if I could figure out fixing it myself before bothering you! :)

This is all one work, but there were two XML files: one had the main body and the other had what I guess, for lack of a better term, are appendices. It's very complicated. When it first failed I tried changing the names of the sections; the two new sections were originally named "neu" and "old" or something like that. I spent the better part of an hour trying to think about how to name the next two divs or how best to combine this file.

lcerrato · 2022-04-12T19:39:51Z

@AlisonBabeu
Let me know if you want to revisit the editing.
For instance, you can name the divs appendix1 or appendix_A or app_1 etc.

Something that reflects that these are appendices is better than making them books 2 and 3, if they are not books 2 and 3. And would be much better than neu or old. (They could be appendix_new etc.)

You cannot have a . in an n attribute. That was part of the initial issue.

Added change log.

AlisonBabeu · 2022-04-13T14:00:59Z

hi @lcerrato so I changed the divs back to appendix1 and appendix2 and added a change log, but at this point I'm not sure how to fix it. In the Travis message for the broken build I see: Forbidden characters found: 'appendix2.29 '. Forgive my ignorance, but I can't find what is meant by this anywhere in the file. I know this exact string is not in the TEI-XML file, so what am I missing?

lcerrato · 2022-04-13T18:12:35Z

@AlisonBabeu
It won't pass until the refsDecl reflects the correct file structure.

So, first, you need to enclose the original file in a book-level div. You've added two new books, but the initial work now needs to be at the same level. So you need an n="main" or n="0" or something similar to designate the text you had there before.

Then the refsDecl needs to add this level.
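
To make the wrapping step concrete, here is a minimal sketch in Python using only the standard library. The sample markup, the `wrap_main_text` helper, and the choice of n="main" are all hypothetical; the real file has far more content and may need the wrapper placed by hand.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Hypothetical miniature of the combined file: two chapters of the original
# work sit beside the new book-level appendix divs.
SAMPLE = f"""<div xmlns="{TEI_NS}" type="edition">
  <div type="textpart" subtype="chapter" n="1"><div type="textpart" subtype="section" n="1"><p>a</p></div></div>
  <div type="textpart" subtype="chapter" n="2"><div type="textpart" subtype="section" n="1"><p>b</p></div></div>
  <div type="textpart" subtype="book" n="appendix1"/>
  <div type="textpart" subtype="book" n="appendix2"/>
</div>"""


def wrap_main_text(edition, n="main"):
    """Move top-level chapter divs into one new book-level div placed first."""
    tag = f"{{{TEI_NS}}}div"
    chapters = [d for d in edition.findall(tag) if d.get("subtype") == "chapter"]
    book = ET.Element(tag, {"type": "textpart", "subtype": "book", "n": n})
    for chapter in chapters:
        edition.remove(chapter)
        book.append(chapter)
    edition.insert(0, book)
    return edition


edition = wrap_main_text(ET.fromstring(SAMPLE))
top_level = [d.get("n") for d in edition.findall(f"{{{TEI_NS}}}div")]
print(top_level)
```

After the move, every top-level div under the edition is a book, so a three-level refsDecl (book, chapter, section) can describe the whole file.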

lcerrato · 2022-04-13T18:17:05Z

@AlisonBabeu
You also have checked in a /.directory file that should be deleted.

lcerrato · 2022-04-13T18:18:57Z

@AlisonBabeu
Forbidden characters found: 215 stoa0022a.stoa001.opp-lat1.xml 'appendix2.29 '

This means there is something in the markup that is not allowed. In this case, there is a space after 29, as in <div n="29 ".

This may be because the sections have a b and the b was deleted, leaving a space?
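
The string in the error message appears to be the book and chapter n values joined with a dot, which would explain why searching the file for it finds nothing. A rough way to hunt for such values is to scan every n attribute directly. This is only a sketch: the sample markup and the `bad_n_values` helper are hypothetical, and the set of forbidden characters is an assumption based on the two cases mentioned in this thread (whitespace and "."); the actual build check may reject more.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: whitespace and "." are the characters to reject. The thread
# only establishes these two; the real check may forbid others as well.
FORBIDDEN = re.compile(r"[\s.]")

# Hypothetical excerpt with one clean value and two problem values.
SAMPLE = """<body>
  <div n="appendix2" subtype="book" type="textpart">
    <div n="28" subtype="chapter" type="textpart"/>
    <div n="29 " subtype="chapter" type="textpart"/>
    <div n="1.2" subtype="chapter" type="textpart"/>
  </div>
</body>"""


def bad_n_values(xml_text):
    """Return every n attribute value containing a forbidden character."""
    root = ET.fromstring(xml_text)
    return [el.get("n") for el in root.iter() if FORBIDDEN.search(el.get("n", ""))]


# repr() makes a trailing space visible in the output.
print([repr(value) for value in bad_n_values(SAMPLE)])
```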

lcerrato · 2022-04-13T18:32:31Z

This is an example of a three level refsDecl. Some of these older texts use (.+) instead of (\w+). For our purposes, they are interchangeable.

It always has to be written from smallest chunk to largest.

 <refsDecl n="CTS">
            <cRefPattern n="part" matchPattern="(\w+).(\w+).(\w+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1']/tei:div[@n='$2']/tei:div[@n='$3'])">
                <p>This pointer pattern extracts poem, line and part.</p></cRefPattern>
            <cRefPattern n="line" matchPattern="(\w+).(\w+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1']/tei:div[@n='$2'])">
                <p>This pointer pattern extracts poem and line.</p></cRefPattern>
            <cRefPattern n="poem" matchPattern="(\w+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1'])">
                <p>This pointer pattern extracts poem.</p></cRefPattern>
        </refsDecl>
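
To illustrate why the order matters, here is a small Python sketch of how a reference could be expanded by trying the patterns in sequence and using the first one that matches. It is an illustration of the idea, not the code the build actually runs. The `resolve` helper is hypothetical, the XPath prefix is shortened, and the separator dots are escaped so that each group maps to exactly one level, which is a choice made for this sketch.

```python
import re

# (matchPattern, replacementPattern) pairs, deepest level first, with the
# XPath prefix shortened for readability. Dots are escaped in this sketch.
PATTERNS = [
    (r"(\w+)\.(\w+)\.(\w+)", "div[@n='$1']/div[@n='$2']/div[@n='$3']"),
    (r"(\w+)\.(\w+)", "div[@n='$1']/div[@n='$2']"),
    (r"(\w+)", "div[@n='$1']"),
]


def resolve(reference, patterns=PATTERNS):
    """Expand a reference with the first pattern that matches all of it."""
    for match_pattern, replacement in patterns:
        match = re.fullmatch(match_pattern, reference)
        if match is None:
            continue
        # Substitute $1, $2, ... with the captured pieces of the reference.
        for index, value in enumerate(match.groups(), start=1):
            replacement = replacement.replace(f"${index}", value)
        return replacement
    return None


print(resolve("1.2.3"))
print(resolve("appendix1.4"))
```

With the more permissive (.+) form, a one-group pattern would also match a whole string like "1.2.3", so listing the deepest pattern first keeps a short pattern from swallowing a longer reference.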

A simple version for your file would be

<refsDecl n="CTS">
        <cRefPattern n="section" matchPattern="(.+).(.+).(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div[@type='edition']/tei:div[@n='$1']/tei:div[@n='$2'])/tei:div[@n='$3'])"/>
        <cRefPattern n="chapter" matchPattern="(.+).(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div[@type='edition']/tei:div[@n='$1']/tei:div[@n='$2'])"/>
        <cRefPattern n="book" matchPattern="(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div[@type='edition']/tei:div[@n='$1'])"/>
      </refsDecl>
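
Hand-typed replacementPattern values are easy to get subtly wrong, and the section pattern above does close #xpath( early, right after the $2 div (the later commit titled "bad close parethensis in new refsDecl" refers to this). A minimal Python sketch of a sanity check follows. The `check_pattern` helper is hypothetical and only looks at parenthesis balance and at whether the capture groups line up with the $N placeholders; it does not validate the XPath itself.

```python
import re


def check_pattern(match_pattern, replacement_pattern):
    """Return a list of problems found in one cRefPattern's attributes."""
    problems = []

    # Walk the replacement and track parenthesis depth; going negative means
    # a ")" closed something that was never opened.
    depth = 0
    for char in replacement_pattern:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses in replacementPattern")

    # Every capture group should be used exactly as $1..$N and vice versa.
    groups = re.compile(match_pattern).groups
    used = {int(n) for n in re.findall(r"\$(\d+)", replacement_pattern)}
    if used != set(range(1, groups + 1)):
        problems.append("capture groups and $N placeholders do not line up")
    return problems


PREFIX = "#xpath(/tei:TEI/tei:text/tei:body/tei:div[@type='edition']"
section = PREFIX + "/tei:div[@n='$1']/tei:div[@n='$2'])/tei:div[@n='$3'])"
chapter = PREFIX + "/tei:div[@n='$1']/tei:div[@n='$2'])"

print(check_pattern("(.+).(.+).(.+)", section))
print(check_pattern("(.+).(.+)", chapter))
```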

Fixed forbidden character. Fire in the hole.

AlisonBabeu · 2022-04-20T15:14:52Z

So @lcerrato I updated the refsDecl and enclosed the first text in a book level div. This has graduated me to a new set of errors:

Unique nodes found by XPath
Word Counting
Passage level parsing
Empty References

At this point I must admit I'm stumped.

lcerrato · 2022-04-20T17:21:36Z

@AlisonBabeu
Unfortunately, this is an unhelpful set of error messages.

Because the count is only picking up book and chapter (note the Nodes are 3;246 when you want three digits there), the problem is in the section level. Although I no longer use this refsDecl format, I think that is ok (that's always a place to look, and it's sometimes impossible to spot issues there). It might not hurt to copy/paste a refsDecl from a working three-level text in case mine was wrong.

There may be chapters that do not have sections, but this would mean that the appendices should have been passing previously. (I would have to look back at that structure to see how that was handled.)

This can be spotted by using the outline view in oxygen.

At first glance, it looks like there are a lot of things in the appendices that aren't chapters at all but rather chapter headers? Again, I would need to see what the previous version looked like to know if this is legit markup or just an oversight.

lcerrato · 2022-04-20T17:37:24Z

@AlisonBabeu

For instance,

<div n="appendix1" subtype="book" type="textpart">
 <head>
  <title type="main">QVAESTIONES [SANCTI AUGUSTINI] DE UETERI ET NOUO TESTAMENTO. </title>
 </head> 
<div n="1a" subtype="chapter" type="textpart">
<ab> <title>I. </title> </ab>
          <p>I huius recensionis = I recensionis quaestionum numero CXXVII. 
</p>
</div>

here https://archive.org/details/corpusscriptoru16wiengoog/page/419/mode/2up?view=theater

That is not a chapter. There is a "book/appendix" title, then that's just an explanation of the title, so something like a subtitle. But because that has no section, it's not going to pass.

On the top of this page https://archive.org/details/corpusscriptoru16wiengoog/page/421/mode/2up?view=theater

there is
III—XII = II-XI
XIII = XXXV
XIIII-XXXVI = XII-XXXIIII

which has also been encoded as chapters. That's probably more of a notation, just telling the users that there are no differences here from the main text in these sections. So it's debatable that that would be made "chapters" for markup purposes. It's a can of worms, really.

I would guess that's what's breaking this.
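
One way to test this guess without reading the whole outline by eye is to list the chapter divs that contain no section div. The Python sketch below does that on a hypothetical excerpt: the chapter "2a" and its section are invented for contrast, the TEI namespace is omitted for brevity, and the `chapters_without_sections` helper is not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt: chapter "1a" mirrors the markup quoted above,
# chapter "2a" is invented to show a chapter that does have a section.
SAMPLE = """<div n="appendix1" subtype="book" type="textpart">
  <div n="1a" subtype="chapter" type="textpart">
    <ab><title>I. </title></ab>
    <p>I huius recensionis = I recensionis quaestionum numero CXXVII.</p>
  </div>
  <div n="2a" subtype="chapter" type="textpart">
    <div n="1" subtype="section" type="textpart"><p>text</p></div>
  </div>
</div>"""


def chapters_without_sections(book):
    """List n values of chapter divs that contain no section-level div."""
    return [
        chapter.get("n")
        for chapter in book.findall("div[@subtype='chapter']")
        if chapter.find("div[@subtype='section']") is None
    ]


print(chapters_without_sections(ET.fromstring(SAMPLE)))
```

Any value it reports is a chapter that a three-level reference cannot reach, so those are the places to inspect first.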

AlisonBabeu · 2022-04-20T17:52:41Z

@lcerrato In all honesty I give up at this point. I think I'm going to split the files back up and just make them two works rather than try to fight this out. I didn't change any of the encoding that you are referencing; I just added letters to the divs to avoid the duplicate node problem. Thanks for all the time you've spent.

lcerrato · 2022-04-20T17:55:03Z

@AlisonBabeu
Give me a chance to take a closer look — I can't test anything until I pull it all offline.

I didn't think you changed that encoding (it was there). I was just looking at the original files to see if they passed like this.

lcerrato · 2022-04-20T18:05:00Z

@AlisonBabeu
I think it's just the refsDecl. The old one from the deleted file is working.

AlisonBabeu

Thank you thank you for finding and fixing this @lcerrato

AlisonBabeu added 4 commits April 12, 2022 13:31

(stoa0022a)

23cc25f

Some changes.

f8cacb6

Trying something else.

f9adbc7

And one last try before bothering Lisa.

2607a95

Changed divs to appendix1 and appendix2 again.

b716d13

Added change log.

Changed refsDecl added in another level.

75728c6

Fixed forbidden character. Fire in the hole.

(stoa0022a) bad close parethensis in new refsDecl #387

710b127

lcerrato self-requested a review April 20, 2022 18:09

lcerrato approved these changes Apr 20, 2022

AlisonBabeu commented Apr 25, 2022

AlisonBabeu merged commit 6ff80a5 into master Apr 25, 2022
