Stop treating `list`s as typed arrays #2

LTLA · 2023-11-07T07:43:07Z

We should stop considering lists to be typed arrays, because they're not.

Currently, a list of strings is treated as a typed array of strings. This is problematic because:

- Every function needs to scan the list to check that it really does contain only strings.
- Every function also needs to scan the list to check whether it contains `None` values that represent missing strings.
- It introduces ambiguity: is a list of strings to be interpreted as a typed array that can only ever contain strings, or as an unstructured list of arbitrary objects? What should we guess `[]` to be? (This has consequences for `singledispatch`.)
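To make the problems above concrete, here is a minimal sketch. The names `is_list_of_strings` and `describe` are hypothetical and only for illustration; they are not functions from any of the packages. The point is that `singledispatch` can only dispatch on `list`, so the element type has to be recovered by a full scan, and `[]` cannot be classified at all.

```python
from functools import singledispatch


def is_list_of_strings(x: list) -> bool:
    # O(n) scan on every call; None is tolerated as a missing-string placeholder.
    return all(isinstance(v, str) or v is None for v in x)


@singledispatch
def describe(x) -> str:
    return "unknown"


@describe.register(list)
def _describe_list(x: list) -> str:
    # Dispatch only sees `list`; the element type is invisible to it.
    if len(x) == 0:
        return "ambiguous: empty list"
    if is_list_of_strings(x):
        return "string array (after full scan)"
    return "generic list"
```

Any function that accepts such a list has to repeat the same scan, and the empty case stays a guess.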

So I propose that all arrays of strings should now use `numpy.array` with the string type, which is closer to R's character vectors than a Python `list` is. This includes, e.g., the row and column names of the `BiocFrame`, the levels of the `Factor`, and so on.
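A rough sketch of what this buys us, assuming NumPy's default Unicode string dtype (kind `"U"`). The helper `is_string_array` is a hypothetical name, not an existing function. The type check becomes a constant-time look at the dtype, and an empty array still knows what it is.

```python
import numpy as np


def is_string_array(x) -> bool:
    # Constant-time check: the dtype carries the element type, no scan needed.
    return isinstance(x, np.ndarray) and x.dtype.kind == "U"


row_names = np.array(["gene1", "gene2", "gene3"])

# Unlike [], an empty string array is unambiguous about its element type.
empty_names = np.array([], dtype=str)
```

This sketch does not address how missing strings would be represented in such an array.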

The case is even easier to make for numeric types.
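For instance, a NumPy array settles on a single numeric dtype at construction, so there is no per-element checking to do. This is a small sketch; `is_numeric_array` is a hypothetical helper.

```python
import numpy as np


def is_numeric_array(x) -> bool:
    # np.number covers integer, floating-point and complex dtypes, but not bool or str.
    return isinstance(x, np.ndarray) and np.issubdtype(x.dtype, np.number)


ints = np.array([1, 2, 3])       # integer dtype
mixed = np.array([1, 2.5, 3])    # promoted to a single floating-point dtype
```

A plain list like `[1, 2.5, True]` would instead need each element inspected, with the added wrinkle that `bool` is a subclass of `int` in Python.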
