Error on reading files where <c> element misses the "r" attribute #465
Comments
Yes, I have the exact same problem.
You are right, the r attribute is optional. We should check how this is handled in Apache POI. For example, what happens if some elements miss the r attribute, but not all?
We have such a case as well. Interestingly, opening the seemingly broken file with Excel or LibreOffice Calc and saving it again results in fastexcel successfully processing the file. It looks like those programs add the optional r attribute when saving.
Tracks both row and column indices so the reader can fall back to them to determine the row and/or cell address for Excel files that do not provide the (optional) reference attribute 'r'. Inspired by the Apache POI approach.
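To illustrate the fallback idea described above, here is a minimal sketch in Java. It is not the code from the PR or from Apache POI: the class `CellPositionTracker` and its methods are hypothetical names, and it assumes cell references use plain uppercase column letters (for example "C7"). It also shows one possible answer to the mixed case, where only some elements carry 'r': an explicit reference wins and resets the running counter.

```java
// Hypothetical helper for illustration only; not fastexcel's actual implementation.
final class CellPositionTracker {
    private int nextRow = 0;    // zero-based index assumed for the next <row> without 'r'
    private int nextColumn = 0; // zero-based index assumed for the next <c> without 'r'

    /** Call once per <row>; rAttr is the raw 'r' value (1-based row number) or null. */
    int startRow(String rAttr) {
        int row = (rAttr != null) ? Integer.parseInt(rAttr) - 1 : nextRow;
        nextRow = row + 1;
        nextColumn = 0; // columns restart in every row
        return row;
    }

    /** Call once per <c>; rAttr is a reference such as "C7" or null. Returns the zero-based column. */
    int nextCell(String rAttr) {
        int column = (rAttr != null) ? columnOf(rAttr) : nextColumn;
        nextColumn = column + 1;
        return column;
    }

    /** Converts the leading uppercase letters of a reference ("A1" -> 0, "AA1" -> 26). */
    static int columnOf(String ref) {
        int column = 0;
        for (int i = 0; i < ref.length(); i++) {
            char ch = ref.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // the row digits start here
            }
            column = column * 26 + (ch - 'A' + 1);
        }
        return column - 1;
    }
}
```

With this approach, a file where no element has 'r' is simply numbered in document order, while a file that mixes both styles keeps the explicit addresses and continues counting from the last one seen.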
Opening the sample file provided by @fabiospiga and saving it also seems to fix the file in a way that lets fastexcel process it successfully (same as with our file). The FastExcelReaderTest compares certain attributes of a file parsed with fastexcel against the same file read with Apache POI. PR #514 fully fixes the issue we have experienced with our file: adding it to the FastExcelReaderTest passes (after increasing the byte array max override enough). However, the file provided by Fabio does not pass that test, even with the changes in PR #514. Even there, though, the initial exception is gone:
I investigated to some extent whether further fixes could let fastexcel process that file with the same outcome as Apache POI, but I stopped. I wanted to add a file to the FastExcelReaderTest but could not: our file contains confidential customer data, and Fabio's file still does not pass the test. Maybe somebody else could contribute a file that can be added to the PR as a test case.
I need to read a file where the <row> and <c> elements miss the "r" attribute, which is apparently optional in the OpenXML structure. Here's an example:
https://www.atih.sante.fr/sites/default/files/public/content/3968/fichier_complementaire_ccam_descriptive_a_usage_pmsi_2021_v2.xlsx
The raised exception is:
Cannot invoke "java.lang.Integer.intValue()" because the return value of "org.dhatim.fastexcel.reader.SimpleXmlReader.getIntAttribute(String)" is null
because int rowIndex cannot be unboxed at org.dhatim.fastexcel.reader.RowSpliterator#next.
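The failure mode can be reproduced outside fastexcel. The sketch below is only an illustration of the mechanism: `RowIndexDemo` and both of its methods are hypothetical stand-ins (the real `SimpleXmlReader.getIntAttribute(String)` has a different signature), and the attribute map is an assumed simplification of the parsed element. It shows that assigning a null `Integer` to an `int` throws a `NullPointerException`, and that a null check with a fallback avoids it.

```java
import java.util.Map;

// Hypothetical stand-in for illustration only; not fastexcel's actual code.
final class RowIndexDemo {
    /** Returns the attribute as an Integer, or null when the attribute is absent. */
    static Integer getIntAttribute(Map<String, String> attrs, String name) {
        String value = attrs.get(name);
        return (value == null) ? null : Integer.valueOf(value);
    }

    /** Reads 'r' if present; otherwise returns the caller-supplied fallback index. */
    static int rowIndexOrDefault(Map<String, String> attrs, int fallback) {
        Integer r = getIntAttribute(attrs, "r");
        // Checking for null first avoids the implicit Integer.intValue() call on null.
        return (r != null) ? r : fallback;
    }
}
```

Writing `int rowIndex = RowIndexDemo.getIntAttribute(attrs, "r");` for an element without "r" fails in the same way as the reported exception, whereas `rowIndexOrDefault(attrs, fallback)` returns the fallback value.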
Could you please provide support for this use case?
Thanks and best regards,
Fabio