[Question]: AzStorageBlobReader Integration with LlamaParse #14395

Open
1 task done
galvangoh opened this issue Jun 26, 2024 · 1 comment
Labels
question: Further information is requested
stale: Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed

Comments

@galvangoh

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

I am currently using LlamaParse as a file_extractor together with AzStorageBlobReader to read PDFs from blob storage and parse them in one step. With this integration, only .load_data() is available, and it returns the parsing result as either text or markdown. Is there a way to also get the parsing result as JSON, similar to get_json_result() as mentioned in this issue from LlamaParse?

Thank you.

galvangoh added the question label Jun 26, 2024

dosubot bot commented Jun 26, 2024

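Yes, the parsing result can be returned as JSON by calling get_json_result() on the parser. Note that the implementation quoted here is the one defined in the DashScopeParse class, which is part of the LlamaIndex framework; LlamaParse is a separate parser that, per the LlamaParse issue you mention, exposes a method of the same name. As you observed, using the parser as a file_extractor with AzStorageBlobReader only goes through .load_data(), so get_json_result() has to be called on the parser itself with a file path.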

Here is the relevant code snippet for the get_json_result() method:

def get_json_result(
    self, file_path: Union[List[str], str], extra_info: Optional[dict] = None
) -> List[dict]:
    """Parse the input path."""
    # Note: the extra_info argument passed by the caller is overwritten here.
    extra_info = {"parse_fmt_type": ResultType.DASHSCOPE_DOCMIND.value}
    try:
        return asyncio.run(self.aget_json(file_path, extra_info))
    except RuntimeError as e:
        # asyncio.run() fails inside an already running event loop;
        # surface the hint about nest_asyncio in that case.
        if nest_asyncio_err in str(e):
            raise RuntimeError(nest_asyncio_msg)
        else:
            raise

This method can be used to parse the input file path and return the result in JSON format [1].
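Since get_json_result() takes a file path rather than a blob, one possible workaround is to download the blob to a temporary file first and pass that path to the parser. The following is only a minimal sketch under stated assumptions: parse_blob_to_json and the download callable are hypothetical names introduced for illustration, and the parser is assumed to expose get_json_result(file_path) returning a list of dicts, as in the snippet above. It has not been tested against AzStorageBlobReader or LlamaParse.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Protocol


class JsonParser(Protocol):
    """Anything exposing get_json_result(file_path), as assumed from the snippet above."""

    def get_json_result(self, file_path: str) -> List[dict]: ...


def parse_blob_to_json(
    blob_name: str,
    download: Callable[[str], bytes],
    parser: JsonParser,
) -> List[dict]:
    """Hypothetical helper: fetch one blob, write it to disk, parse it to JSON.

    `download` is an assumption: any callable that returns the blob's bytes,
    for example a thin wrapper around an Azure container client.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep the original file name so the parser can see the extension.
        local_path = Path(tmp_dir) / Path(blob_name).name
        local_path.write_bytes(download(blob_name))
        # The temporary file is removed when the with-block exits,
        # so parsing has to finish inside it.
        return parser.get_json_result(str(local_path))
```

With real services, download could be something like lambda name: container_client.download_blob(name).readall() and parser a LlamaParse instance; both are assumptions about your setup rather than a documented integration.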


dosubot bot added the stale label Sep 25, 2024