Add image extraction to PDF. Polish code #37

Open
wants to merge 12 commits into
base: main

Conversation

dSupertramp
Contributor

Hi everyone! Danilo here.

I added image extraction from PDFs (the images are saved to a local folder).

I also cleaned up the code a little.

Thanks, everyone!

@StanGirard
Contributor

Hey @dSupertramp, this is awesome! Could you try using something other than fitz (PyMuPDF)? It is licensed under the AGPL, meaning it can't be used in closed-source production code unless you have a commercial license with them.

@dSupertramp
Contributor Author

> Hey @dSupertramp, this is awesome! Could you try using something other than fitz (PyMuPDF)? It is licensed under the AGPL, meaning it can't be used in closed-source production code unless you have a commercial license with them.

Done! I used pypdf
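
For reference, a minimal sketch of what image extraction with pypdf can look like. This is not the code from this PR: the function names `save_page_images` and `extract_images`, the default `images` folder, and the `page<N>_` file-name prefix are all assumptions made for illustration. It relies on pypdf pages exposing `.images`, whose items carry `.name` and `.data`.

```python
from pathlib import Path
from typing import Iterable, List


def save_page_images(pages: Iterable, out_dir: str) -> List[Path]:
    """Write every image found on the given pages into out_dir.

    Each page is expected to expose `.images`, whose items carry `.name`
    and `.data` (raw bytes), as pypdf page objects do.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    for page_number, page in enumerate(pages, start=1):
        for image in page.images:
            # Prefix with the page number so that identical image names
            # on different pages do not overwrite each other.
            path = target / f"page{page_number}_{image.name}"
            path.write_bytes(image.data)
            saved.append(path)
    return saved


def extract_images(pdf_path: str, out_dir: str = "images") -> List[Path]:
    """Hypothetical helper: extract all images of a PDF using pypdf."""
    # Imported lazily so save_page_images stays usable without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return save_page_images(reader.pages, out_dir)
```

Calling `extract_images("document.pdf")` would write the images into a local `images` folder and return the written paths. Depending on the image encoding, pypdf may need Pillow installed to decode the image data.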

@StanGirard
Contributor

@chloedia @AmineDiro can you test this? :)

@StanGirard
Contributor

@dSupertramp it seems you forgot to add it to the dependencies ;)

Collaborator

@AmineDiro left a comment

Thanks for the PR! Nice work. Just a few small remarks on some minor changes.

megaparse/Converter.py (outdated, resolved)
@@ -19,6 +19,7 @@ poppler-utils = "*"
langchain-openai = "*"
langchain-core = "*"
python-dotenv = "*"
pypdf = "*"
Collaborator

We should probably pin the version instead of using "*".

megaparse/Converter.py (outdated, resolved)
@dSupertramp
Contributor Author

Everything's resolved!

3 participants