AzureAIDocumentIntelligence : Analyze request Parameter in begin_analyze_document throwing error #36434

sakshi1989 · 2024-07-11T14:37:03Z

Package Name: azure-ai-documentintelligence
Package Version: 1.0.0b3
Operating System: Windows
Python Version: 3.11

Describe the bug
The begin_analyze_document method of DocumentIntelligenceClient throws the error below:
HttpResponseError: (InvalidArgument) Invalid argument. Code: InvalidArgument Message: Invalid argument. Inner error: { "code": "ParameterMissing", "message": "The parameter urlSource or base64Source is required." }
I am calling this method as shown in the linked example - Example

To Reproduce
Steps to reproduce the behavior:

Install azure-ai-documentintelligence library
Import the libraries -
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult
Use the endpoint and the api key to use the azure-ai-documentintelligence resource.
Use any local PDF file
Write the below code -
document_intelligence_client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(api_key))

with open(path_to_sample_documents, "rb") as f:
    print(type(f))

    poller = document_intelligence_client.begin_analyze_document(
        model_id="prebuilt-layout",
        analyze_request=f,
        output_content_format="markdown",
    )

result: AnalyzeResult = poller.result()

Expected behavior
The document should have been processed.

Additional context
The same issue was opened here, but it was closed as no response was received from the person who opened the issue.
Existing Issue

After debugging the internal code of the library it is failing at line 507
pipeline_response: PipelineResponse = self._client._pipeline.run(  # pylint: disable=protected-access
    _request, stream=_stream, **kwargs
)
at path Internal Code


github-actions · 2024-07-11T14:37:42Z

Thank you for your feedback. Tagging and routing to the team member best able to assist.

YalinLi0312 · 2024-07-12T19:27:26Z

Hi @sakshi1989, can you try passing the content_type parameter?

with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        model_id="prebuilt-layout",
        analyze_request=f,
        output_content_format="markdown",
        content_type="application/octet-stream",
    )

Or pass in an AnalyzeDocumentRequest:

from azure.ai.documentintelligence.models import AnalyzeDocumentRequest

with open(path_to_sample_documents, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        model_id="prebuilt-layout",
        analyze_request=AnalyzeDocumentRequest(bytes_source=f.read()),
        output_content_format="markdown",
    )

Thanks
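
For readers wondering why the error mentions urlSource or base64Source: the two suggestions above correspond to two ways of sending the document, either as raw bytes labeled application/octet-stream or as a JSON body that carries the file base64-encoded. The sketch below uses only the standard library to illustrate those two shapes. The helper names and the dict layout are hypothetical and are not the SDK's internals; the idea that a file sent without content_type is treated as JSON is inferred from the error message, not confirmed in this thread.

```python
import base64
import json


def build_binary_request(data: bytes) -> dict:
    """Raw-bytes shape: the body is the file content, labeled as octet-stream."""
    return {
        "headers": {"Content-Type": "application/octet-stream"},
        "body": data,
    }


def build_json_request(data: bytes) -> dict:
    """JSON shape: the file content travels base64-encoded under base64Source."""
    payload = {"base64Source": base64.b64encode(data).decode("ascii")}
    return {
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload).encode("utf-8"),
    }


# Placeholder bytes standing in for the contents of a local PDF.
sample = b"%PDF-1.4 example bytes"

binary_req = build_binary_request(sample)
json_req = build_json_request(sample)

print(binary_req["headers"]["Content-Type"])
print(json_req["headers"]["Content-Type"])
print(sorted(json.loads(json_req["body"]).keys()))
```

Under this reading, a body labeled as JSON that contains neither urlSource nor base64Source would produce exactly the ParameterMissing error reported above.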

sakshi1989 · 2024-07-12T21:27:39Z

Thanks @YalinLi0312 for your reply. Your first suggestion works, but the second one does not. With the second one I receive the error below:
TypeError: cannot pickle '_io.BufferedReader' object

Also, I would like to point out that the documentation nowhere mentions the parameter content_type="application/octet-stream". It would be good to mention it there so that users don't run into the same error. The example does include this parameter, but the documentation does not.

github-actions · 2024-07-12T21:43:14Z

Hi @sakshi1989. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

YalinLi0312 · 2024-07-12T22:14:28Z

Hi @sakshi1989, thanks for the feedback! We'll mention the content type in our documentation.
For the second one, you should pass f.read() instead of f. I updated the previous comment; please try it again.
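
The TypeError reported above is the message Python raises whenever an open file handle is copied or pickled, while the bytes returned by f.read() are plain data and copy without trouble. The sketch below reproduces that difference with the standard library only. Whether the SDK copies or serializes the bytes_source value internally is an assumption here, not something confirmed in this thread, and the temporary file merely stands in for a local PDF.

```python
import copy
import os
import tempfile

# Create a small temporary file to stand in for the local PDF.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
    tmp.write(b"%PDF-1.4 example bytes")
    path = tmp.name

try:
    with open(path, "rb") as f:
        try:
            copy.deepcopy(f)  # an open file handle cannot be copied or pickled
            handle_error = None
        except TypeError as exc:
            handle_error = str(exc)

        data = f.read()  # bytes are plain data
        data_copy = copy.deepcopy(data)  # copying bytes succeeds
finally:
    os.remove(path)

print(handle_error)
print(data_copy == data)
```

The exact wording of the TypeError message can differ between Python versions.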

sakshi1989 · 2024-07-15T12:27:01Z

model_id="prebuilt-layout",
                                                             AnalyzeDocumentRequest(bytes_source=f.read()),
                                                             output_content_format="markdown",

Hi @YalinLi0312, thank you for your quick response on the issue. Yes, both options are working now.

NPap0 · 2024-07-29T13:19:42Z

Just encountered the same problem, thankfully I came across this issue, used the f.read() solution and it worked great!

github-actions bot assigned YalinLi0312 Jul 11, 2024

YalinLi0312 added the needs-author-feedback More information is needed from author to address the issue. label Jul 12, 2024

github-actions bot removed the needs-team-attention This issue needs attention from Azure service team or SDK team label Jul 12, 2024

sakshi1989 closed this as completed Jul 15, 2024

github-actions bot added needs-team-attention This issue needs attention from Azure service team or SDK team and removed needs-author-feedback More information is needed from author to address the issue. labels Jul 15, 2024
