Processing Results from PDF Scanning #2

Open
pecknigel opened this issue Sep 7, 2020 · 2 comments

@pecknigel
Contributor

pecknigel commented Sep 7, 2020

I'm trying to make use of this and may be missing something. I get:

TypeError: Cannot read property '0' of undefined
    at init (.../node_modules/line-segmentation-gcp-vision-ocr/index.js:12:37)

It seems that it is looking for data.textAnnotations[0] but the data I'm getting back from GC Vision is structured as:

{ fullTextAnnotation: [Object], context: [Object] }

Has the data format changed? Am I missing something here?

Thanks.

@ionistrati
Contributor

ionistrati commented Sep 7, 2020

Hi @nigelbpeck, Google Vision OCR provides this list of keys in its response:

[
  'faceAnnotations',
  'landmarkAnnotations',
  'logoAnnotations',
  'labelAnnotations',
  'textAnnotations',
  'localizedObjectAnnotations',
  'safeSearchAnnotation',
  'imagePropertiesAnnotation',
  'error',
  'cropHintsAnnotation',
  'fullTextAnnotation',
  'webDetection',
  'productSearchResults',
  'context'
]

fullTextAnnotation contains only the text response produced by Google's algorithm. Both textDetection and documentTextDetection in GC Vision return the full response, with all of the keys listed above.
You should initialize your client like this:

const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient({
    // Project ID from Google Cloud Vision
    projectId: 'your-project-id',
    // Keyfile from Google Cloud Vision
    keyFilename: './google-vision-api-keyfile.json',
});

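and call textDetection on the client you defined:
const [data] = await client.textDetection(filePath);
or
const [data] = await client.documentTextDetection(filePath);
Both methods return a promise, so the call has to be awaited. The promise resolves to an array whose first element is the full response, which is what data holds after the destructuring above.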
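
As a rough illustration of that last point, a small guard can unwrap the result and fail with a readable message when textAnnotations is missing, instead of the TypeError reported above. This is only a sketch: unwrapVisionResult is a hypothetical helper, not part of @google-cloud/vision or of this library, and it assumes the result is either the response object itself or an array whose first element is the response.

```javascript
// Hypothetical helper (not part of any library): normalise the value that
// textDetection() / documentTextDetection() resolved to.
function unwrapVisionResult(result) {
  // Assumption: the result is either [response] or the response itself.
  const response = Array.isArray(result) ? result[0] : result;

  const annotations = response && response.textAnnotations;
  if (!Array.isArray(annotations) || annotations.length === 0) {
    // Listing the keys makes a PDF-style response easy to recognise.
    const keys = Object.keys(response || {}).join(', ');
    throw new Error(`Response has no textAnnotations (keys found: ${keys})`);
  }
  return response;
}
```

With this in place, a response shaped like { fullTextAnnotation, context } is rejected with a message naming the keys it does contain, which makes the mismatch visible before the data reaches the segmentation code.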

Please tell me if you initialized GC Vision OCR differently, or if you get a different response after following the same steps, so I can help you.

@pecknigel
Contributor Author

pecknigel commented Sep 8, 2020

Hi @ionistrati ,

Thanks for getting back to me.

I'm following the approach recommended for detecting text in files in the Google Cloud Vision API documentation (link below), which uses asyncBatchAnnotateFiles() rather than client.textDetection(filePath) or client.documentTextDetection(filePath).

This seems to be because my source is a PDF rather than an image.

Detect text in files (PDF/TIFF)
https://cloud.google.com/vision/docs/pdf

Submitting a one-page PDF with:

    const [operation] = await gcVisionClient.asyncBatchAnnotateFiles({
        requests: [
            {
                inputConfig: {
                    // Supported mime_types are: 'application/pdf' and 'image/tiff'
                    mimeType: 'application/pdf',
                    gcsSource: {
                        uri: `gs://${bucketName}/${fileName}`,
                    },
                },
                features: [{type: 'DOCUMENT_TEXT_DETECTION'}],
                outputConfig: {
                    gcsDestination: {
                        uri: `gs://${bucketName}/${resultsOutputFolder}`,
                    },
                },
            },
        ],
    });

I get back the following data:

{
  fullTextAnnotation: {
    pages: [
             {
               property: { detectedLanguages: [] },
               width: 746,
               height: 829,
               blocks: [ {}, {}, {}, ... ]
             }
           ],
    text: 'Many lines of text\n' +
      'Many lines of text\n' +
      '...\n'
  },
  context: {
    uri: 'gs://bucket-name/test.pdf',
    pageNumber: 1
  }
}

So it only has fullTextAnnotation and no textAnnotations.
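
One possible workaround, sketched here under several assumptions, is to build a textAnnotations-like array from fullTextAnnotation before handing the data on. The helper name toTextAnnotations is hypothetical. The sketch assumes the nested structure elided as {} above follows the Vision API layout of blocks, paragraphs, words and symbols, and that PDF results report normalizedVertices (values between 0 and 1) rather than pixel vertices. Whether line-segmentation-gcp-vision-ocr accepts the converted data is untested.

```javascript
// Hypothetical helper, not part of line-segmentation-gcp-vision-ocr.
// Builds a textAnnotations-like array from a fullTextAnnotation object.
function toTextAnnotations(fullTextAnnotation) {
  const pages = fullTextAnnotation.pages || [];
  const words = [];

  for (const page of pages) {
    const width = page.width || 1;
    const height = page.height || 1;
    for (const block of page.blocks || []) {
      for (const paragraph of block.paragraphs || []) {
        for (const word of paragraph.words || []) {
          // A word's text is the concatenation of its symbols.
          const description = (word.symbols || []).map((s) => s.text).join('');
          const box = word.boundingBox || {};
          // Prefer absolute vertices; otherwise scale normalized ones to page units.
          const vertices = box.vertices && box.vertices.length
            ? box.vertices.map((v) => ({ x: v.x || 0, y: v.y || 0 }))
            : (box.normalizedVertices || []).map((v) => ({
                x: Math.round((v.x || 0) * width),
                y: Math.round((v.y || 0) * height),
              }));
          words.push({ description, boundingPoly: { vertices } });
        }
      }
    }
  }

  // In image responses, entry 0 holds the whole text. Assumption: the first
  // page's rectangle is used as a placeholder for its bounding polygon.
  const w = pages.length ? pages[0].width || 0 : 0;
  const h = pages.length ? pages[0].height || 0 : 0;
  const whole = {
    description: fullTextAnnotation.text || '',
    boundingPoly: {
      vertices: [{ x: 0, y: 0 }, { x: w, y: 0 }, { x: w, y: h }, { x: 0, y: h }],
    },
  };
  return [whole, ...words];
}
```

The result could then be attached as data.textAnnotations = toTextAnnotations(data.fullTextAnnotation). For a multi-page PDF each page's response would probably need converting separately, since the coordinates are relative to that page.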

@pecknigel changed the title from "Change in Data Format?" to "Processing Results from PDF Scanning" on Sep 8, 2020