
Training with half an image produces different results than with a selection of that same image #1203

Open
JoyfulGen opened this issue Sep 5, 2024 · 3 comments


@JoyfulGen
Contributor

There are two ways to use only part of a folio image in Pixel: either you crop the image beforehand and give Pixel the pre-cropped image, or you give Pixel the whole image and select only a part of it to be submitted back to Rodan. I had a theory that these two processes produced different results; I set out to test that theory and have emerged even more bamboozled than before! Oh well. Here are the results of my test:

I first did a Pixel run using a cropped image of only the two lower staves of folio 288. I then did a Pixel run using the full image of 288, but selected the exact same region as the cropped image to submit to Rodan. I then tested the models produced by each Pixel run on both the full image and the cropped one. In both cases, the result of each model was the same for both images.

  • The cropped image models produced perfect layers.
  • The full image models produced an impeccable layer 2 (staff lines), but then put all of the text (which should be layer 3) in layer 1 with the neumes. Layer 3 was completely empty (hence the bamboozlement).

Important point: I used the Salzinnes models to separate the layers in all four Pixel runs. They do fairly well! However, the Salzinnes models do not have a layer 3; instead, the text is included in the background layer. So maybe this affected the results? But why would the models then put the text in layer 1, instead of in the background? And why did it only happen in one of the two cases? Help.

These are the cropped image models:
Background Model Salzinnes model 4of4.hdf5.zip
Model 1 Salzinnes model 4of4.hdf5.zip
Model 2 Salzinnes model 4of4.hdf5.zip
Model 3 Salzinnes model 4of4.hdf5.zip

And these are the full image models:
Background Model 288 w_ Salzinnes models.hdf5.zip
Model 1 288 w_ Salzinnes models.hdf5.zip
Model 2 288 w_ Salzinnes models.hdf5.zip
Model 3 288 w_ Salzinnes models.hdf5.zip

@kyrieb-ekat

This is curious: I've increasingly noticed that the models showing the #1204 layer issue include the text in the background layer, or don't recognize it at all. The result is that the third layer ends up empty, or doesn't open at all, reminiscent of the original issue #1196, where going to open a layer to view it gets you a server error. We may need to reopen that issue or open a specific one for this. However, I think it's connected to whatever layer separation shenanigans are going on here...

@JoyfulGen
Contributor Author

Update! I did the same test again without using the Salzinnes models, to see whether those had indeed influenced the results.

So I produced two new models, one trained with a cropped part of folio 288 and one trained with a selection of the full image of folio 288. To be clear, the selection I made of the full image is exactly the same as the cropped section of the image; the only difference is that one image was cropped before arriving in Pixel, and the other was cropped in Pixel.
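For what it's worth, the "exactly the same" claim can be checked mechanically before training. This is just a sketch of mine, not anything in Pixel or Rodan: `crops_match` and the `(top, left, height, width)` box convention are my own assumptions, and Pixel's actual selection coordinates may be expressed differently.

```python
import numpy as np

def crops_match(full_image: np.ndarray, pre_cropped: np.ndarray, box) -> bool:
    """Check that a pre-cropped image is pixel-identical to the same
    region selected out of the full image.

    `box` is a hypothetical (top, left, height, width) tuple.
    """
    top, left, h, w = box
    selection = full_image[top:top + h, left:left + w]
    return selection.shape == pre_cropped.shape and np.array_equal(selection, pre_cropped)

# Tiny synthetic stand-in for folio 288: the "lower half" of a 10x10 page.
full = np.arange(100).reshape(10, 10)
crop = full[6:10, 0:10]
print(crops_match(full, crop, (6, 0, 4, 10)))  # True
```

If this ever returned False for a real pair (say, because one export resampled or re-encoded the image), that alone could explain the two runs diverging.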

My results this time are the opposite of my first results (boo). This time, the model trained on the selection of the full image did much better than the one trained on the cropped image.

For example, here are two layer 1's of folio 288. The one on the left was separated with the full image model, and the one on the right was separated with the cropped image model:

[Image: layer 1 comparison, folio 288]

You can see that the one on the left is correct but speckly, whereas the one on the right is cleaner, but it contains the text and the staff lines too (!?).

Here's another example with layer 1's of folio 195:
[Image: layer 1 comparison, folio 195]

Ok so these are both pretty bad, but the one on the right has more elements that should be in other layers.

Also, all tests I've done with the cropped image model so far have produced completely empty text layers.
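"Completely empty" is also easy to quantify if it helps with triage. This is a hypothetical helper of my own, not a Pixel/Rodan function, and the 0.1% threshold is an arbitrary assumption:

```python
import numpy as np

def layer_is_empty(layer: np.ndarray, threshold: float = 0.001) -> bool:
    """Treat a classified layer as effectively empty when fewer than
    `threshold` (as a fraction) of its pixels are foreground (nonzero)."""
    return np.count_nonzero(layer) / layer.size < threshold

# A 100x100 layer with a single stray pixel counts as empty:
stray = np.zeros((100, 100), dtype=np.uint8)
stray[50, 50] = 255
print(layer_is_empty(stray))  # True: 1/10000 < 0.001
```

Running something like this over each output layer would give hard numbers for "empty text layer" instead of eyeballing it.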

This is all extremely mysterious, because a) it contradicts my first results and b) @kyrieb-ekat has managed to make very friendly models for the end of MS73 that include cropped images.

@kyrieb-ekat

Okay, so I've run into similar things. The most common weird thing that shows up is where the edge of a patch (or something) lifts up a chunk of whatever is around it, which is when you end up with missing text lines like that (this can also affect music etc.). Otherwise the most common error I've seen is the 'combined layer' effect, where text gets pulled through as well; in the other layers where only the divisio lines are there, I can confidently say that's only because the page has red divisio, not black, and the model doesn't know how to handle that.

Otherwise: weeeeeeird! And maddening; that the same run replicated from 0 produced differing results is Not Good. I wonder if my successful End Models are partially because I only trained on two cropped images of the lot..? Similarly, we might need to check with @homework36 what container health or space looked like at the time you ran the second group. Usually after Rodan "wakes up" it does better, so it's odd the latter of the pair did worse!! Gaaaah....
