pixOrientDetect with .bmp file #733

Open
saikumarkavali1 opened this issue Feb 7, 2024 · 6 comments

Comments

@saikumarkavali1

I'm using the Leptonica API below (version 1.81.1) to detect the orientation and rotate the file according to the text.

pixOrientDetect(new HandleRef(pix, pixconv), out pupconf, out pleftconf, 0, 0);

For the particular BMP file shared below, it returns wrong outputs, so the resulting rotation is not the expected one.

Input_File.zip

@saikumarkavali1
Author

To get the pixconv value I used the API below; 130 is an arbitrary threshold value.
var pixconv = Tesseract.Interop.LeptonicaApi.Native.pixConvertTo1(pix.Handle, 130);

With a threshold of 255 my issue was resolved, but I have doubts about the threshold value.
How can I determine the right threshold value for any given file?

@DanBloomberg
Owner

This is working as expected. There is a note in pixUpDownDetect() that the image should have a resolution between 150 and 300 ppi. This note is easy to miss and should have been put in a more obvious place, and I will do that.

Your image was made at a high resolution of 600 ppi, so I scaled it down with a scale factor of 0.35, binarized it with a threshold of 128 (a reasonable value to use for a clean scan such as yours), and ran pixOrientDetect() on it:

    pix1 = pixRead("/tmp/input.bmp");   // resolution 600 ppi
    pix2 = pixScale(pix1, 0.35, 0.35);   // reduces resolution to about 210 ppi
    pix3 = pixConvertTo1(pix2, 128);
    pixOrientDetect(pix3, &upconf, &leftconf, 0, 0);
    lept_stderr("upconf = %f, leftconf = %f\n", upconf, leftconf);

with the result:

   upconf = 14.350754, leftconf = 1.302541

which says there is a very high confidence that it is rightside up.
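
The scale factor of 0.35 follows from the source resolution and the 150 to 300 ppi range mentioned above. A minimal sketch of that arithmetic in plain C is given here; the helper names `scale_for_target_ppi` and `ppi_in_detector_range` are hypothetical and are not part of Leptonica, and the target of 210 ppi is simply the value used in this thread:

```c
/* Hypothetical helpers (not Leptonica functions): pick a scale factor that
 * brings a scan into the 150-300 ppi range recommended for orientation
 * detection. */

/* Returns the factor to pass to a scaling routine so that an image at
 * src_ppi ends up at target_ppi. Falls back to 1.0 on invalid input. */
static double scale_for_target_ppi(double src_ppi, double target_ppi)
{
    if (src_ppi <= 0.0 || target_ppi <= 0.0)
        return 1.0;
    return target_ppi / src_ppi;
}

/* Returns 1 if ppi lies in the recommended 150-300 range, else 0. */
static int ppi_in_detector_range(double ppi)
{
    return ppi >= 150.0 && ppi <= 300.0;
}
```

For the 600 ppi scan in this issue, `scale_for_target_ppi(600.0, 210.0)` gives the 0.35 factor used above, and the resulting resolution of about 210 ppi passes `ppi_in_detector_range`, while the original 600 ppi does not.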

@saikumarkavali1
Author

Thank you, DanBloomberg, for sorting this out.

Your input helped me solve my issue.
Is there an alternative approach for finding the threshold value for a file, rather than using some reasonable value like 128?

@DanBloomberg
Owner

There are many functions that adapt the local threshold based on a measurement of the background value.
This is done by first normalizing the background to a constant value.
For the full set of normalizing functions, see adaptmap.c.

Adaptive binarization is done in two steps:
(1) Background normalization by some method
(2) Global thresholding with a value appropriate to the normalization.
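
To make the two steps concrete, here is a toy sketch in plain C on a single row of grayscale pixels. It is not Leptonica's implementation: the background estimate (the brightest pixel in each fixed-size tile) is a deliberately simple stand-in, and the function names, tile size, and target background value are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Step 1 (toy version): estimate the background of each tile as its
 * brightest pixel, then rescale the tile so that background maps to
 * bg_target. Real implementations use more robust estimates. */
static void normalize_background(const uint8_t *src, uint8_t *dst, size_t n,
                                 size_t tile, uint8_t bg_target)
{
    if (tile == 0)
        tile = n;  /* treat the whole row as one tile */
    for (size_t start = 0; start < n; start += tile) {
        size_t end = (start + tile < n) ? start + tile : n;
        uint8_t bg = 1;  /* floor of 1 avoids division by zero */
        for (size_t i = start; i < end; i++)
            if (src[i] > bg)
                bg = src[i];
        for (size_t i = start; i < end; i++) {
            unsigned v = (unsigned)src[i] * bg_target / bg;
            dst[i] = (uint8_t)(v > 255 ? 255 : v);
        }
    }
}

/* Step 2: one global threshold. Output is 1 for foreground (dark) pixels
 * and 0 for background pixels. */
static void threshold_global(const uint8_t *src, uint8_t *dst, size_t n,
                             uint8_t thresh)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (src[i] < thresh) ? 1 : 0;
}
```

With the row `{200, 60, 200, 200, 100, 30, 100, 100}`, thresholding the raw values at 128 marks the whole dimly lit second half as foreground. Normalizing with a tile size of 4 and a target of 200 first, then thresholding at 128, marks only the two dark pixels.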

There are several high-level functions in Leptonica for doing adaptive binarization on grayscale and color images, such as:

  pixAdaptThresholdToBinary(pix, NULL, 1.0)   (in grayquant.c)
  pixConvertTo1Adaptive(pix)                           (in pixconv.c)
  pixCleanImage(pix, 1, 0, 1, 0)                        (in pageseg.c)

@saikumarkavali1
Author

Thank you @DanBloomberg for the information.

I have a question about the APIs below:

 pixOrientDetect(pix, &upconf, &leftconf, 0, 0);
 pixConvertTo1(pix, 128);
  1. I applied a red stamp and used the above APIs for rotation. Result: the stamp color changed to blue.
  2. I then applied another red stamp and used the same APIs. Result: stamp 1 changed to red and stamp 2 changed to blue.

File:
Input_File.zip

@DanBloomberg
Owner

I don't know what you are doing. Before you can use the orientation detector, you must convert to 1 bpp with resolution between 150 and 300, as I have previously described. This gets rid of the color.

So if you do this:

    pix1 = pixRead("input.bmp");   // 600 ppi, 8 bpp, 256 colors
    pix2 = pixScale(pix1, 0.35, 0.35);    // down to about 210 ppi, 32 bpp rgb  (colormap is removed)
    pix3 = pixConvertTo1(pix2, 128);     // 1 bpp
    pixOrientDetect(pix3, &upconf, &leftconf, 0, 0);   // only works on 1 bpp images
    lept_stderr("upconf = %f, leftconf = %f\n", upconf, leftconf);

the result is:

upconf = 14.350754, leftconf = 1.302541

which says that the orientation is correct as it is. This is finding a global angle for rotation; it is essentially ignoring the stamp, which the algorithm sees as a blob of pixels without any small text orientation.
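
One way to turn the two confidence values into a decision is sketched below in plain C. Everything in it is an assumption made for illustration: the enum and function names are hypothetical, the sign conventions for the left and right cases are assumed, and the cutoffs (a minimum confidence and a minimum ratio between the two values) are caller-supplied rather than values confirmed in this thread.

```c
/* Hypothetical decision helper; not a Leptonica function. */
enum text_orient {
    ORIENT_UNKNOWN,
    ORIENT_UP,
    ORIENT_LEFT,
    ORIENT_DOWN,
    ORIENT_RIGHT
};

static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* A direction is accepted only if its confidence exceeds min_conf in
 * magnitude AND dominates the other axis by at least min_ratio.
 * Assumed convention: positive upconf means rightside up, negative
 * means upside down; likewise positive/negative leftconf for the two
 * 90-degree cases. */
static enum text_orient decide_orientation(float upconf, float leftconf,
                                           float min_conf, float min_ratio)
{
    float absup = abs_f(upconf);
    float absleft = abs_f(leftconf);

    if (upconf > min_conf && absup > min_ratio * absleft)
        return ORIENT_UP;
    if (leftconf > min_conf && absleft > min_ratio * absup)
        return ORIENT_LEFT;
    if (upconf < -min_conf && absup > min_ratio * absleft)
        return ORIENT_DOWN;
    if (leftconf < -min_conf && absleft > min_ratio * absup)
        return ORIENT_RIGHT;
    return ORIENT_UNKNOWN;
}
```

With the values reported above (upconf = 14.350754, leftconf = 1.302541) and illustrative cutoffs of 8.0 and 2.5, this returns `ORIENT_UP`, matching the conclusion that no rotation is needed.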
