Feature Request: Allow extraction of plain text body from HTML body #278

Open
TheElementalOfDestruction opened this issue Jul 11, 2022 · 3 comments
Labels: Accepted (this feature request has been accepted and will be developed), enhancement

Comments

@TheElementalOfDestruction
Collaborator

Code should attempt to extract the plain text body from the HTML body if the HTML body exists but the plain text body does not.
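
To make the request concrete, here is a minimal sketch of the fallback, using only the standard library's `html.parser`. This is not extract-msg's implementation: the names `plain_body_or_fallback` and `_TextExtractor` are hypothetical, and the sketch assumes the HTML body has already been decoded to a `str`. The choice of which tags to skip or treat as line breaks is also an assumption.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes from an HTML document."""

    # Content of these tags is never shown to the reader, so it is dropped.
    _SKIP = {"script", "style", "title"}
    # These tags start a new line in the plain text output (an assumption).
    _BLOCK = {"br", "p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1
        elif tag in self._BLOCK:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self._BLOCK and tag != "br":
            self._chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace inside each line and drop empty lines.
        raw = "".join(self._chunks)
        lines = [" ".join(line.split()) for line in raw.splitlines()]
        return "\n".join(line for line in lines if line)


def plain_body_or_fallback(plain_body: Optional[str],
                           html_body: Optional[str]) -> Optional[str]:
    """Return the plain body if present, else text derived from the HTML body."""
    if plain_body:
        return plain_body
    if not html_body:
        return None
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()
    return parser.text()
```

The existing plain text body always wins; the HTML is only parsed when there is nothing else to return, which matches the condition described above.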

@TheElementalOfDestruction TheElementalOfDestruction added enhancement Accepted This feature request has been accepted and will be developed labels Jul 11, 2022
@grahamperrin

This comment was marked as resolved.

@TheElementalOfDestruction
Collaborator Author

I actually looked specifically at textract and noted major issues with it on the Discord (the issues there partially explain them) that would prevent it from being used. The major one is that its output is not plain text as it claims but rather formatted text, which is not what this module does.

@TheElementalOfDestruction
Collaborator Author

TheElementalOfDestruction commented Aug 11, 2022

Correction: sorry, I mixed this up with another module I was looking at, because I had been looking at this one for other reasons. The biggest problem with trying to use it is that it depends on extract-msg, specifically a very old version of it, which could cause severe conflicts. It uses 0.29.*, which is the last version to support Python 2.

https://discord.com/channels/769278425835372555/769278426262536215/986885672071733249
https://github.com/deanmalmgren/textract/blob/master/requirements/python
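
To illustrate why a pin like `0.29.*` conflicts with a current install, here is a rough sketch of trailing-wildcard matching. The helper `matches_wildcard_pin` is hypothetical and only approximates how a real resolver compares versions; the version numbers in the usage lines are made up for illustration.

```python
def matches_wildcard_pin(version: str, pin: str) -> bool:
    """Return True if `version` falls under a trailing-wildcard pin such as '0.29.*'."""
    if not pin.endswith(".*"):
        # No wildcard: only an exact string match satisfies the pin.
        return version == pin
    prefix = pin[:-2].split(".")
    parts = version.split(".")
    # The version satisfies the pin when its leading components equal the prefix.
    return parts[:len(prefix)] == prefix


# Hypothetical version numbers, for illustration only.
print(matches_wildcard_pin("0.29.3", "0.29.*"))
print(matches_wildcard_pin("0.36.1", "0.29.*"))
```

Any release outside the 0.29 series fails the check, so a package carrying that pin could not be installed alongside the module that depends on it.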


2 participants