Hi @alexey-pelykh,
There hasn't been any activity on this pull request in the past 4 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days.
alexey-pelykh left a comment
Thanks for tackling this — banks exporting XLS files that are actually HTML is a common pain point. The detection via lstrip().startswith("<html>") and conversion through lxml is a pragmatic solution.
One minor thing: is_HTML doesn't follow Python naming — should be is_html. Not blocking.
Code review LGTM.
    _("No valid encoding was found for the attached file")
    ) from None
    decoded_file = data_file.decode(detected_encoding)
    is_HTML = decoded_file.lower().lstrip().startswith("<html>")
Nit: is_HTML → is_html per PEP 8 (snake_case for local variables).
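To make the nit concrete, here is a minimal sketch of the same check with a snake_case local. The helper name looks_like_html and its encoding parameter are hypothetical and only for illustration; the module itself detects the encoding before decoding, which this sketch does not reproduce.

```python
def looks_like_html(data_file: bytes, encoding: str = "utf-8") -> bool:
    """Return True when the decoded payload starts with an <html> tag.

    Hypothetical helper: the name and the `encoding` parameter are
    assumptions, not part of the module under review.
    """
    decoded_file = data_file.decode(encoding)
    # Same expression as in the diff, with the local renamed per PEP 8.
    is_html = decoded_file.lower().lstrip().startswith("<html>")
    return is_html
```

Because the check compares against the literal "<html>", a file that opens with a doctype declaration or with attributes on the tag (for example <html lang="en">) would not match; whether that matters depends on what the banks actually export.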
I'm no longer working on this one because I have moved to
I recently downloaded an XLS file that was actually an HTML file (they exist, check the file in tests/fixtures!) and the module account_statement_import_sheet_file wasn't able to import it.
With this PR, it is possible to import such files.
Most of the README edits were generated automatically by running pre-commit.
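As a rough illustration of what importing an HTML file disguised as XLS involves, the sketch below pulls table rows out of an HTML document. According to the review, the PR does the conversion through lxml; this sketch swaps in the standard library's html.parser so that it runs without extra dependencies. The names TableRows and html_table_to_rows are hypothetical and do not come from the PR.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_rows(html_text: str) -> list:
    """Hypothetical helper: return table rows as lists of cell strings."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows
```

The resulting list of rows has the same shape a spreadsheet reader would produce, so it could in principle be handed to the existing sheet-parsing logic; how the PR actually wires this in is not shown here.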