Add tests for load_data to robustly handle typical csv issues. #434

Closed
djinnome opened this issue Dec 18, 2023 · 2 comments

@djinnome
Contributor

djinnome commented Dec 18, 2023

The new interface for calibrate requires CSV files, but there are many ways a CSV file can fail to provide data in the right format. We should test the most common failure modes, such as:

  • missing data
  • incorrectly typed columns
  • mislabeled columns
  • header row has one fewer column than the data rows
  • alignment issues
  • escaped commas
  • missing-value tokens: Na, NaN, None, '', and empty fields (,,)

All of these issues will make it difficult to convert a dataframe to a correctly typed tensor.
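
As a rough illustration of what such tests could look like, here is a minimal sketch using only the Python standard library. It does not use the project's load_data, whose signature is not shown in this issue. The helper parse_numeric_csv, the column names Timestamp and Infected, and the set of missing-value tokens are all hypothetical and chosen only for the example.

```python
import csv
import io
import math

# Tokens treated as missing. This set is an assumption for illustration.
MISSING_TOKENS = {"", "na", "nan", "none", "null"}


def parse_numeric_csv(text, expected_columns):
    """Parse CSV text into {column name: list of floats}.

    Hypothetical stand-in for a real loader. Missing values become NaN;
    structural or typing problems raise ValueError.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        raise ValueError("empty CSV")
    header = [name.strip() for name in rows[0]]
    if header != list(expected_columns):
        raise ValueError(f"unexpected header: {header}")
    columns = {name: [] for name in header}
    for line_no, row in enumerate(rows[1:], start=2):
        # Catches header/data width mismatches and unquoted stray commas.
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        for name, cell in zip(header, row):
            cell = cell.strip()
            if cell.lower() in MISSING_TOKENS:
                columns[name].append(math.nan)
                continue
            try:
                columns[name].append(float(cell))
            except ValueError:
                raise ValueError(
                    f"line {line_no}: column {name!r} has "
                    f"non-numeric value {cell!r}"
                ) from None
    return columns


def error_message(text, expected_columns):
    """Return the ValueError message raised for `text`, or None if it parses."""
    try:
        parse_numeric_csv(text, expected_columns)
    except ValueError as exc:
        return str(exc)
    return None


def test_missing_tokens_become_nan():
    text = "Timestamp,Infected\n0,1.5\n1,NaN\n2,\n3,None\n"
    cols = parse_numeric_csv(text, ["Timestamp", "Infected"])
    assert cols["Timestamp"] == [0.0, 1.0, 2.0, 3.0]
    assert cols["Infected"][0] == 1.5
    assert all(math.isnan(value) for value in cols["Infected"][1:])


def test_structural_and_typing_errors():
    expected = ["Timestamp", "Infected"]
    # Mislabeled column.
    assert "unexpected header" in error_message("Time,Infected\n0,1\n", expected)
    # Header has one fewer column than the data.
    assert "expected 1 fields, got 2" in error_message("Infected\n0,1.5\n", ["Infected"])
    # Incorrectly typed column.
    assert "non-numeric value 'abc'" in error_message("Timestamp,Infected\n0,abc\n", expected)
    # A quoted comma stays inside one field, so this is a typing error,
    # not a field-count error.
    assert "non-numeric value '1,5'" in error_message('Timestamp,Infected\n0,"1,5"\n', expected)
```

The same cases could be written against a dataframe-based loader. The point of the sketch is only that each failure mode in the list gets its own small input and an explicit expectation, either NaN in the parsed output or a descriptive error.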

@djinnome djinnome self-assigned this Dec 18, 2023
@sabinala sabinala self-assigned this Dec 19, 2023
@sabinala
Contributor

I'm going to work off the branch sa-load-data-tests instead of the branch 434-add-tests-... linked to this issue, because something looks off with the linked branch: it still has a notebook folder containing some, but not all, of the old notebooks from before.

@SamWitty
Contributor

Closed, as this has been offloaded to our CSV data loader.
