Data monitoring and analysis codes #4

Open
gnperdue opened this issue Oct 8, 2019 · 1 comment

Comments

@gnperdue
Contributor

gnperdue commented Oct 8, 2019

We need to build a suite of scripts to analyze the training data, both to be sure it is appropriate for training and to make sure we understand the inputs. What sort of data would we exclude?

@gnperdue
Contributor Author

News from @jasonstjohn

Automated data quality checks are next. Other than a reasonable file size trend,
we don't have any assurance that we're getting all the data we want, and that it's
free of junk values, duplicates, etc. I expect to fill in that 'etc' only with some
careful thought and experimentation (mostly plot-making).
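
As a starting point for those checks, here is a minimal sketch of what an automated pass might look like. It counts rows with junk values (missing or non-finite), exact duplicate rows, and gaps in the time series larger than an expected sampling step. Everything named here is an assumption for illustration: the thread does not describe the data layout, so the `check_quality` function, the `timestamp` and `reading` fields, and the `expected_step` parameter are all hypothetical.

```python
import math
from collections import Counter


def check_quality(rows, time_key="timestamp", expected_step=None):
    """Return simple quality counts for a list of dict records.

    The record layout (dicts with a numeric time field) is an assumption,
    not something specified in this thread.
    """
    # Junk: a row with any missing (None) or non-finite (NaN/inf) value.
    junk_rows = 0
    for row in rows:
        for value in row.values():
            if value is None or (isinstance(value, float) and not math.isfinite(value)):
                junk_rows += 1
                break

    # Duplicates: rows whose fields all match. repr() is used so that two
    # NaN values compare as equal for the purpose of this count.
    counts = Counter(tuple(sorted((k, repr(v)) for k, v in row.items())) for row in rows)
    duplicate_rows = sum(c - 1 for c in counts.values())

    # Gaps: consecutive valid timestamps further apart than expected_step,
    # a rough proxy for "are we getting all the data we want".
    times = sorted(
        row[time_key]
        for row in rows
        if isinstance(row.get(time_key), (int, float)) and math.isfinite(row[time_key])
    )
    gaps = 0
    if expected_step is not None:
        gaps = sum(1 for a, b in zip(times, times[1:]) if b - a > expected_step)

    return {
        "rows": len(rows),
        "junk_rows": junk_rows,
        "duplicate_rows": duplicate_rows,
        "gaps": gaps,
    }


# Hypothetical sample records, made up purely to exercise the checks.
sample = [
    {"timestamp": 0.0, "reading": 1.2},
    {"timestamp": 1.0, "reading": float("nan")},
    {"timestamp": 1.0, "reading": float("nan")},
    {"timestamp": 2.0, "reading": 1.3},
    {"timestamp": 5.0, "reading": None},
]
report = check_quality(sample, expected_step=1.0)
print(report)
```

These counts only cover the cases named above. The remaining 'etc' would still need the plots and experimentation mentioned in the comment before it can be turned into checks.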

@gnperdue gnperdue added this to the first-on-board-model milestone Dec 19, 2019