Inventory data used and select biomedical datasets #9

k8hertweck · 2020-07-23T15:31:09Z

Data inventory from @lakikowolfe 👍 Notes on the datasets used and their characteristics:

Class 1

Commute Time Dataset
- Feature engineering and EDA
- No missing data
- Generic dataset with both categorical and numeric data

Commute Time Dataset
- Visualization of single variables and relationships, linear regression, mean squared error, random forests

Dummy dataset of 0 and 1 as an example of categorical data
Dummy dataset of two random clouds of points to illustrate decision boundaries
Tennis dataset
- All categorical variables; target variable is whether tennis was played (yes/no)
Iris dataset
- All numeric variables except for target variable (categorical: species)
Dummy dataset for random forest

Dummy data to show the curse of dimensionality
Iris dataset to show the benefits of PCA
- Pair plot
- PCA
Dummy data to superimpose the first component line over a series of random points
Dummy data and custom code to illustrate eigenvectors
Centered faces dataset: "Eigenfaces"
Dummy dataset of clusters to show k-means
Arrests data
- Four numeric variables
NCI60 for PCA and hierarchical clustering
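
If it helps when comparing candidates for replacement, the inventory above could also be kept in a small structured form. The following is only a sketch: the `DatasetEntry` class, its field names, and the `datasets_for_topic` helper are hypothetical and not part of the course materials, and only characteristics stated in the notes above are filled in (anything not stated is left empty).

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One inventory row (hypothetical structure, not from the course)."""
    name: str
    feature_types: set          # subset of {"categorical", "numeric"}; empty if not stated
    target_type: str = ""       # "" when the notes do not say
    topics: list = field(default_factory=list)


# Only characteristics written down in the inventory are recorded here.
inventory = [
    DatasetEntry("Commute Time", {"categorical", "numeric"},
                 topics=["feature engineering", "EDA",
                         "linear regression", "random forests"]),
    DatasetEntry("Tennis", {"categorical"}, target_type="categorical"),
    DatasetEntry("Iris", {"numeric"}, target_type="categorical",
                 topics=["pair plot", "PCA"]),
    DatasetEntry("Arrests", {"numeric"}),
    DatasetEntry("NCI60", set(), topics=["PCA", "hierarchical clustering"]),
]


def datasets_for_topic(entries, topic):
    """Return the names of datasets whose notes mention the given topic."""
    return [e.name for e in entries if topic in e.topics]


# Example queries: which datasets cover PCA, and which have only numeric features.
pca_datasets = datasets_for_topic(inventory, "PCA")
numeric_only = [e.name for e in inventory if e.feature_types == {"numeric"}]
print(pca_datasets)
print(numeric_only)
```

A biomedical replacement could then be checked against the same fields (feature types, target type, topics covered) before it is swapped in.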

k8hertweck · 2021-01-25T17:26:11Z

Streamlining approach relative to R ML course.

Class 1: glaucoma data

Class 2: glaucoma data

Class 3: genomic data?

Class 4: genomic data?

k8hertweck self-assigned this Jul 23, 2020