CSVDataset behaves unexpectedly if src is a DataFrame with an unexpected index #8201

**Describe the bug**
`CSVDataset` accepts pandas DataFrames as input for `src`, but it assumes that the index labels match the row positions (0, 1, 2, ...).

This is because `convert_tables_to_dicts` uses `.loc` instead of `.iloc`. It generates ordinal (positional) row indices to subset on but treats them as index labels.

https://github.com/Project-MONAI/MONAI/blob/0bb20a88ec7869f6453aa58890df50ad6b2b6271/monai/data/utils.py#L1494

**To Reproduce**
```
import numpy
import pandas
import monai

df = pandas.DataFrame(numpy.random.random((50, 3)))
df_subset = df.iloc[numpy.arange(0, 50, 5)]
print(df_subset.shape)  # (10, 3)

ds = monai.data.CSVDataset(df_subset)
print(len(ds))  # 3
```

**Expected behavior**
`print(len(ds))` should print 10.
It prints 3 because the rows are looked up with `.loc` and `slice(10)`, which is label-based and includes the stop label, so it matches only the index labels 0, 5 and 10 from the subset.
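
The mismatch can be reproduced with pandas alone, without MONAI. The following is a minimal sketch, assuming the rows are selected with `slice(10)` as described above; the variable names are illustrative and not taken from the MONAI source.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((50, 3)))
df_subset = df.iloc[np.arange(0, 50, 5)]  # index labels: 0, 5, 10, ..., 45

rows = slice(df_subset.shape[0])  # slice(None, 10, None)

# .loc treats the slice as index labels and includes the stop label
by_label = df_subset.loc[rows]
# .iloc treats the slice as ordinal positions and excludes the stop
by_position = df_subset.iloc[rows]

print(list(by_label.index))  # [0, 5, 10]
print(len(by_position))      # 10
```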

**Environment**
Shouldn't be relevant?

**Additional context**
Simple fix: https://github.com/Project-MONAI/MONAI/blob/0bb20a88ec7869f6453aa58890df50ad6b2b6271/monai/data/utils.py#L1494

The first `.loc` should be `.iloc`, and the second should be `.iloc[rows][col_names]`.
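
A rough sketch of the proposed selection logic, written as a standalone function. `select_rows` and the column names `a`, `b`, `c` are hypothetical and only for illustration; this is not the MONAI implementation, just the positional-rows-then-label-columns pattern suggested above.

```python
import numpy as np
import pandas as pd


def select_rows(df, rows, col_names=None):
    """Hypothetical helper: rows are ordinal positions, columns are labels."""
    subset = df.iloc[rows]  # positional row lookup, independent of the index labels
    return subset if col_names is None else subset[col_names]


df = pd.DataFrame(np.random.random((50, 3)), columns=["a", "b", "c"])
df_subset = df.iloc[np.arange(0, 50, 5)]  # index labels: 0, 5, 10, ..., 45

all_rows = select_rows(df_subset, slice(df_subset.shape[0]))
some_cols = select_rows(df_subset, [0, 2, 4], col_names=["a", "c"])

print(all_rows.shape)         # (10, 3)
print(some_cols.shape)        # (3, 2)
print(list(some_cols.index))  # [0, 10, 20]
```

Selecting rows by position first and columns by label second keeps the two lookups separate, so a non-default index on the input DataFrame no longer affects which rows are returned.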

