PDF2

A package for inspecting PDF files.

It is at an early stage of development.

Goal

The current aim of this package is to implement the following features:

Parse PDF files
Validate PDF files
Extract metadata
Extract text, images, tables, links, annotations...
Check for potential security vulnerabilities
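As a small illustration of what the parsing and validation goals involve, the sketch below reads the version from a PDF file header such as `%PDF-2.0` (the header is described in section 7.5.2, File header, of the standard). It is not taken from this package: the function name `pdf_header_version` is hypothetical, and the sketch assumes a single-digit major and minor version located at the very start of the input.

```rust
/// Extracts the `(major, minor)` version from a PDF file header such as
/// `%PDF-2.0`.
///
/// NOTE: hypothetical helper for illustration only; it is not part of this
/// package's API. It assumes the header starts at byte 0 and that both
/// version components are single ASCII digits.
pub fn pdf_header_version(bytes: &[u8]) -> Option<(u8, u8)> {
    // The header begins with the literal marker `%PDF-`.
    let rest = bytes.strip_prefix(b"%PDF-")?;
    // Expect `<digit>.<digit>` immediately after the marker.
    match rest {
        [major, b'.', minor, ..] if major.is_ascii_digit() && minor.is_ascii_digit() => {
            Some((major - b'0', minor - b'0'))
        }
        _ => None,
    }
}
```

Calling `pdf_header_version` on the first bytes of a file returns `None` when the marker or version is missing, which a validator could report as a malformed header.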

References

Comments and documentation include references to the International Standard ISO 32000-2:2020, Portable document format – Part 2: PDF 2.0. Each reference gives the section number, name, and page number(s) in square brackets, e.g. [7.3.10 Indirect objects, p33-34]. Nested square brackets indicate references to other sources, e.g. [[https://www.w3.org/TR/png/#4Concepts.EncodingScanlineAbs] 4.6.2 Scanline serialization].
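The reference convention could look as follows in a Rust doc comment. This is a minimal sketch, not code from this package: `ObjectId` and `parse_object_header` are hypothetical names, and the parser is deliberately simplified (it splits on whitespace and only handles a header of the form `12 0 obj`).

```rust
/// Identifier of an indirect object: an object number and a generation
/// number. [7.3.10 Indirect objects, p33-34]
///
/// NOTE: hypothetical type for illustration; not part of this package.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ObjectId {
    pub number: u32,
    pub generation: u16,
}

/// Parses the opening of an indirect object definition, e.g. `12 0 obj`.
/// [7.3.10 Indirect objects, p33-34]
///
/// NOTE: simplified sketch; tokens must be separated by whitespace.
pub fn parse_object_header(input: &str) -> Option<ObjectId> {
    let mut tokens = input.split_whitespace();
    // First token: object number; second token: generation number.
    let number = tokens.next()?.parse().ok()?;
    let generation = tokens.next()?.parse().ok()?;
    // Third token must be the `obj` keyword.
    if tokens.next()? != "obj" {
        return None;
    }
    Some(ObjectId { number, generation })
}
```

Keeping the bracketed reference on the last line of the doc comment makes it easy to search the code base for every item tied to a given section of the standard.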

Help Needed

If you are interested in contributing, please check the TODO list (TODO.md). Test contributions containing extracts of PDF files that do not open correctly are highly appreciated, provided they do not require a change to the LICENSE.

