magic_pdf-0.8.0-released
myhloli released this 10 Sep 12:20
What's Changed
feat:
- Add RAG API
- Integrate RAG into the llama_index project
- Update Dockerfile
- Fine-grained model singletons, reducing memory usage and speeding up initialization
- Add page-range parameters to the CLI and API, allowing custom start and end pages for parsing
- Support image footnotes
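
The fine-grained model singleton item above can be illustrated with a minimal sketch. This is not the project's actual code: `ModelSingleton`, `get_model`, and the `loader` callback are hypothetical names, and the sketch assumes that config values are hashable. The idea is to cache one instance per model name and configuration, so each model is loaded only when first requested and reused afterwards.

```python
import threading


class ModelSingleton:
    """Hypothetical process-wide cache: one model per (name, config) key."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Create the single shared cache object on first use.
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._models = {}
        return cls._instance

    def get_model(self, name, loader, **config):
        # Key on the model name plus its config, so two different
        # configurations of the same model do not share an instance.
        key = (name, tuple(sorted(config.items())))
        with self._lock:
            if key not in self._models:
                # Load lazily: only models that are requested use memory.
                self._models[key] = loader(**config)
            return self._models[key]
```

Keying the cache per model rather than per pipeline means a caller that needs only one model does not pay the cost of initializing all of them.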
bugfix:
- When removing the smaller overlapping block, retain the boundary information of that block
- Lower the threshold for filling spans into blocks from 0.6 to 0.3
- Fix loss of low-score lines in the OCR detection (DET) stage
- Merge multiple spans of a single line in the OCR detection (DET) stage
- Optimize the segmentation logic for English words that are stuck together
- Fix inaccurate layout boxes
- Fix incorrect merging of words that were split across line breaks
- Fix certain special characters appearing in the final output
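
As a rough illustration of the span fill threshold change listed above, the sketch below assumes the threshold is compared against the fraction of a span's area that lies inside a block. That interpretation, the `(x0, y0, x1, y1)` box format, and the names `overlap_ratio` and `fill_spans` are assumptions made for this example, not the project's actual implementation.

```python
def overlap_ratio(span, block):
    """Fraction of the span's area that lies inside the block.

    Both boxes are (x0, y0, x1, y1) tuples (assumed format).
    """
    sx0, sy0, sx1, sy1 = span
    bx0, by0, bx1, by1 = block
    width = min(sx1, bx1) - max(sx0, bx0)
    height = min(sy1, by1) - max(sy0, by0)
    span_area = (sx1 - sx0) * (sy1 - sy0)
    if width <= 0 or height <= 0 or span_area <= 0:
        return 0.0
    return (width * height) / span_area


def fill_spans(spans, block, threshold=0.3):
    """Keep the spans whose overlap with the block exceeds the threshold."""
    return [s for s in spans if overlap_ratio(s, block) > threshold]
```

Under these assumptions, a span that is half inside a block has a ratio of 0.5, so it would be dropped at a threshold of 0.6 and kept at 0.3.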
Full Changelog: magic_pdf-0.7.1-released...magic_pdf-0.8.0-released