data_preprocess_helper

Steps through each feature to determine what preprocessing it needs, then automatically creates and fits a pipeline. The pipeline is built on sklearn and includes transformers from the tubular package.
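As a rough illustration of the idea (not this package's actual API), the sketch below inspects each column's dtype, picks a treatment, and fits an sklearn-only pipeline. The `build_preprocessor` helper, the sample columns, and the chosen imputation and encoding strategies are all assumptions made for the example; tubular transformers are left out.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    """Hypothetical helper: choose a treatment for each column from its dtype."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

    # Numeric features: fill gaps with the median, then standardize.
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical features: fill gaps with the mode, then one-hot encode.
    categorical_steps = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ])


# Made-up sample data with missing values in every column.
df = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 51.0],
    "income": [40_000.0, 52_000.0, 61_000.0, np.nan],
    "city": ["Austin", "Boston", np.nan, "Austin"],
})

preprocessor = build_preprocessor(df)
features = preprocessor.fit_transform(df)
```

The dataframe is passed in when fitting rather than stored on the object, so the fitted preprocessor keeps only the learned parameters.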

TO-DO:

- ~~Simplify the class so it does not retain a copy of the dataframe~~ Done 5/12/2023. The user now passes the dataframe in as a parameter, which makes a pickled preprocessor less bloated.
- ~~Allow the user to specify capping~~ Updated 5/12/2023. The user can step through each feature to compare different capping values, or select quantiles to apply to all features.
- Update the response variable so it can be used for categorical encoding (one-hot, ordinal encoding, etc.).
- Create a version that uses only sklearn preprocessing methods (in case tubular is unavailable).
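The two completed items above can be pictured with a small pandas sketch: learn capping bounds from chosen quantiles, keep only those bounds (not the dataframe), and pickle them. The `fit_caps` and `apply_caps` names and the sample data are hypothetical and are not taken from `preprocess_help.py`.

```python
import pickle

import pandas as pd


def fit_caps(df: pd.DataFrame, lower: float = 0.01, upper: float = 0.99) -> dict:
    """Hypothetical helper: learn (low, high) capping bounds per numeric column."""
    return {
        col: (float(df[col].quantile(lower)), float(df[col].quantile(upper)))
        for col in df.select_dtypes(include="number").columns
    }


def apply_caps(df: pd.DataFrame, caps: dict) -> pd.DataFrame:
    """Clip each column to its learned bounds; the input frame is left untouched."""
    capped = df.copy()
    for col, (low, high) in caps.items():
        capped[col] = capped[col].clip(lower=low, upper=high)
    return capped


# Made-up data with one obvious outlier.
train = pd.DataFrame({"income": [10.0, 20.0, 30.0, 40.0, 1000.0]})

# Same quantiles for every column; per-feature values could be set by editing the dict.
caps = fit_caps(train, lower=0.0, upper=0.75)

# Only the small dict of bounds is serialized, not the training data.
restored_caps = pickle.loads(pickle.dumps(caps))
capped = apply_caps(train, restored_caps)
```

Because the fitted state is just a dictionary of bounds, the pickle stays small regardless of how large the training dataframe is.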
