Layout Document AI

Description

Layout Document AI converts the JSON output of Google Cloud's Document AI into plain text while preserving the original document layout. It reads the JSON files generated by Document AI, reconstructs the text with its layout intact, and writes the result to .txt files.
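As a rough illustration of what preserving the layout can mean, the sketch below places each detected line on a fixed character grid according to its normalized bounding box. It is not taken from this repository: `anchorText`, `renderPage`, and the grid dimensions are hypothetical names and values chosen for the example. The field names (`text`, `pages`, `lines`, `layout.textAnchor.textSegments`, `layout.boundingPoly.normalizedVertices`) follow the Document AI `Document` JSON format, where offsets are typically serialized as strings and a `startIndex` of 0 may be omitted.

```typescript
// Minimal subset of the Document AI JSON shape used by this sketch.
interface TextSegment { startIndex?: string | number; endIndex?: string | number }
interface Vertex { x?: number; y?: number }
interface Layout {
  textAnchor?: { textSegments?: TextSegment[] };
  boundingPoly?: { normalizedVertices?: Vertex[] };
}
interface Page { lines?: { layout?: Layout }[] }
interface DocumentJson { text?: string; pages?: Page[] }

// Resolve a layout element's text by slicing the document-level text
// with the offsets stored in its text anchor.
function anchorText(fullText: string, layout?: Layout): string {
  const segments = layout?.textAnchor?.textSegments ?? [];
  return segments
    .map((s) => fullText.slice(Number(s.startIndex ?? 0), Number(s.endIndex ?? 0)))
    .join("");
}

// Hypothetical helper: write each line onto a rows x cols character grid,
// positioned by the top-left corner of its normalized bounding box.
function renderPage(doc: DocumentJson, page: Page, cols = 80, rows = 40): string {
  const grid: string[][] = Array.from({ length: rows }, () => Array(cols).fill(" "));
  for (const line of page.lines ?? []) {
    // Collapse internal whitespace so a stray newline cannot break the grid.
    const text = anchorText(doc.text ?? "", line.layout).replace(/\s+/g, " ").trim();
    const vertices = line.layout?.boundingPoly?.normalizedVertices ?? [];
    const x = vertices.length ? Math.min(...vertices.map((v) => v.x ?? 0)) : 0;
    const y = vertices.length ? Math.min(...vertices.map((v) => v.y ?? 0)) : 0;
    const row = Math.min(rows - 1, Math.max(0, Math.floor(y * rows)));
    const col = Math.min(cols - 1, Math.max(0, Math.floor(x * cols)));
    // Text that would run past the right edge is truncated in this sketch.
    for (let i = 0; i < text.length && col + i < cols; i++) {
      grid[row][col + i] = text[i];
    }
  }
  // Drop trailing spaces on each row and trailing empty rows.
  return grid.map((r) => r.join("").trimEnd()).join("\n").replace(/\n+$/, "");
}
```

A grid this coarse can overwrite lines that land on the same row and column, so a real implementation would likely need a finer grid or collision handling.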

Features

  • Processes Document AI JSON output
  • Preserves the original layout of the document
  • Generates formatted text files

Installation

  1. Install dependencies:
pnpm install

Usage

  1. Place your Document AI JSON files in the document-ai-json directory.

  2. Run the main script to process the documents:

pnpm run dev
  3. The processed text files are saved in the document-ai-text directory.
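The steps above amount to a directory-to-directory conversion. The following is a minimal sketch of that flow under stated assumptions, not the project's actual script: `processDirectory` and its `toText` parameter are hypothetical, and the default `toText` simply returns the document's raw `text` field rather than a layout-aware rendering.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical batch helper: convert every .json file in inputDir into a
// .txt file with the same base name in outputDir.
// `toText` stands in for whatever layout-aware conversion is applied.
function processDirectory(
  inputDir: string,
  outputDir: string,
  toText: (doc: { text?: string }) => string = (doc) => doc.text ?? "",
): string[] {
  fs.mkdirSync(outputDir, { recursive: true });
  const written: string[] = [];
  for (const name of fs.readdirSync(inputDir).sort()) {
    // Skip anything that is not a JSON file.
    if (path.extname(name).toLowerCase() !== ".json") continue;
    const doc = JSON.parse(fs.readFileSync(path.join(inputDir, name), "utf8"));
    const base = path.basename(name, path.extname(name));
    const outPath = path.join(outputDir, `${base}.txt`);
    fs.writeFileSync(outPath, toText(doc), "utf8");
    written.push(outPath);
  }
  return written;
}
```

With the directory names used in this README, a call would look like `processDirectory("document-ai-json", "document-ai-text")`.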

Contributing

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add some feature').
  5. Push to the branch (git push origin feature-branch).
  6. Open a pull request.

License

This project is licensed under the ISC License. See the LICENSE file for details.

Author

Thierry Santos - [email protected]
