Text to Columns: Fix dtype for empty arrays #275
Merged
Issue
Fixes #274.
The speed-up in a74ea06 fails if some variable doesn't appear in any row. The reason is that the table of row indices that contain this value will be empty, and empty arrays have dtype `float`. In #274, this happened because tables are converted in batches of 5000 rows, and there were batches that didn't contain some values.
In another scenario, this would happen if the domain conversion were applied to data different from the data on which it was created.
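The dtype behaviour described above can be reproduced with a minimal sketch. This is not the code from the pull request: `rows_containing` and the sample `column` are hypothetical names used only for illustration, and the sketch assumes nothing beyond NumPy's default of `float64` for an array built from an empty list.

```python
import numpy as np


def rows_containing(column, value):
    """Hypothetical helper: indices of the rows whose entry equals `value`.

    Returns the index array twice: once built with NumPy's default dtype
    and once with an explicit integer dtype.
    """
    hits = [i for i, v in enumerate(column) if v == value]
    return np.array(hits), np.array(hits, dtype=int)


column = ["a", "b", "a"]

# "c" does not occur in this batch, so the list of hits is empty.
default_idx, int_idx = rows_containing(column, "c")

print(default_idx.dtype)  # float64: an empty list gives a float array
print(np.issubdtype(int_idx.dtype, np.integer))  # True

# An empty integer index array is a valid index and selects nothing.
selected = np.arange(3)[int_idx]
print(selected.size)  # 0
```

When the value does occur, the list of hits contains Python integers and NumPy infers an integer dtype by itself, which is why the problem only shows up for values missing from a batch.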
Description of changes
Add `dtype=int`.
Includes