magic_pdf-0.8.0-released

@myhloli released this 10 Sep 12:20
· 558 commits to master since this release
9f352df

What's Changed

feat:

  • Add RAG API
  • Integrate RAG into the llama_index project
  • Update Dockerfile
  • Fine-grained model singletons, reducing memory usage and speeding up initialization
  • Add page-range parameters to the CLI and API, allowing custom start and end pages for parsing
  • Support image footnotes
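
As an illustration of the fine-grained model singleton idea, here is a minimal sketch that caches one instance per model name and configuration, so only the models actually requested are loaded, and each is loaded once. The names `ModelRegistry`, `get`, and `loader` are hypothetical and are not taken from the magic_pdf code base; the real implementation may differ.

```python
from typing import Any, Callable, Dict, Tuple


class ModelRegistry:
    """Hypothetical per-model singleton cache (not magic_pdf's actual class)."""

    _instance = None

    def __new__(cls):
        # Only one registry exists per process.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = {}
        return cls._instance

    def get(self, name: str, loader: Callable[..., Any], **config: Any) -> Any:
        # Cache per (model name, configuration) instead of per whole pipeline,
        # so an unused model is never initialized.
        key: Tuple[str, Tuple[Tuple[str, Any], ...]] = (
            name,
            tuple(sorted(config.items())),
        )
        models: Dict[Any, Any] = self._models
        if key not in models:
            models[key] = loader(**config)
        return models[key]
```

Repeated calls with the same name and configuration return the cached object; a different configuration triggers one additional load.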

bugfix:

  • When removing the smaller overlapping block, retain the boundary information of that block
  • Lower the threshold for filling spans into a block from 0.6 to 0.3
  • Fix loss of low-score lines in the OCR detection (DET) stage
  • Merge multiple spans of a single line in the OCR DET stage
  • Improve the segmentation logic for English words that are stuck together
  • Fix inaccurate layout boxes
  • Fix merging of words that were split across line breaks
  • Fix certain special characters appearing in the final output
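
To make the span threshold change concrete, the sketch below assumes the threshold is compared against the fraction of a span's area that falls inside a block. That interpretation, the `(x0, y0, x1, y1)` box format, and the function names are assumptions for illustration, not the library's confirmed logic.

```python
def overlap_ratio(span, block):
    """Fraction of the span's area inside the block; boxes are (x0, y0, x1, y1)."""
    sx0, sy0, sx1, sy1 = span
    bx0, by0, bx1, by1 = block
    inter_w = min(sx1, bx1) - max(sx0, bx0)
    inter_h = min(sy1, by1) - max(sy0, by0)
    span_area = (sx1 - sx0) * (sy1 - sy0)
    if inter_w <= 0 or inter_h <= 0 or span_area <= 0:
        return 0.0
    return (inter_w * inter_h) / span_area


def span_belongs_to_block(span, block, threshold=0.3):
    # Assumed rule: keep the span when enough of it overlaps the block.
    return overlap_ratio(span, block) > threshold
```

Under this assumption, a span with 40% of its area inside a block would be dropped at a threshold of 0.6 but kept at 0.3.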

Full Changelog: magic_pdf-0.7.1-released...magic_pdf-0.8.0-released