Skip to content

Conversation

@agius
Copy link

@agius agius commented Apr 4, 2023

I found a spreadsheet in the wild with a bunch of empty-value cells.

<c r="K58" s="2" t="s">
  <v>65</v>
</c>
<c r="L58">
  <v/>
</c>
<c r="M58" t="s">
  <v>158</v>
</c>

Because there was no format specified for the cell, it had at least one child XML node, and there was no value, the default behavior was to treat the cell as a number type and try to parse the value (which was converted to empty-string). Using the kernel-level method Integer() , this threw an exception. The exception was also surprisingly hard to trace; it didn't have a full stacktrace in the wild and didn't appear in $! in IRB; only got a trace once I got it into RSpec.

Apple's "Numbers" program converts the cell to zero; Google Sheets leaves it blank. I'm not sure what Excel itself does with this kind of XML. This sheet was produced by KendoUI, some angular framework with tables, so it's probably not very common.

This change adds a nil check to the typecast, and returns 0 or 0.0 on empty-string values. I've included a spec and an example file. The file from the wild contains sensitive data, but I reproduced the empty-cell XML by hand-editing an export from Google Sheets.

I'm not 100% sure this is the right behavior, and I'm open to alternatives. Maybe we could check in SheetDoc and return an Empty cell type instead? Or add an option for "throw exception" vs "default to zero?" But it feels like since Numbers and Sheets can open these files, Roo probably shouldn't throw an exception here.

I found a spreadsheet in the wild with a bunch of empty-value cells.

```
<c r="K58" s="2" t="s">
  <v>65</v>
</c>
<c r="L58">
  <v/>
</c>
<c r="M58" t="s">
  <v>158</v>
</c>
```

Because there was no `format` specified for the cell, it was not an empty cell, and there was no value, the default behavior was to treat the cell as a `number` type and try to parse the value (which was converted to empty-string). Using the kernel-level method `Integer()` , this threw an exception. The exception was also surprisingly hard to trace; it didn't have a full stacktrace in the wild and didn't appear in `$!` in IRB.

Apple's "Numbers" program converts the cell to zero; Google Sheets leaves it blank. I'm not sure what Excel itself does with this kind of XML. This sheet was [produced by KendoUI](https://www.telerik.com/kendo-angular-ui/components/excel-export/), some angular framework with tables, so it's probably not very common.

This change adds a nil check to the typecast, and returns `0` or `0.0` on empty-string values. I'm not 100% sure this is the right behavior, and I'm open to alternatives. Maybe we could check in `SheetDoc` and return an `Empty` cell type instead? Or add an option for "throw exception" vs "default to zero?"

But it feels like since Numbers and Sheets can open these files, Roo probably shouldn't throw an exception here.
@agius
Copy link
Author

agius commented Apr 12, 2023

cc @patrickkulling ? Not sure who to ask for review on this. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant