missing_values_summary seems to return only a count and a percentage, which could easily be a tuple or, at most, a Series.
Updated the return type to a Series and added more edge-case tests: 905315e, a039574
There are cases of functions that have very obscure names and/or arguments that make it difficult to understand their purpose
Updated the example in each function's docstring to improve clarity: 2037e7e, 449c218, cf1cca9
One or more functions are only partially understood from reading their documentation. Documentation of the input object is not specific enough for the user to understand what it can be used on
Kindly review and correct typos in the README file, e.g. "pyead provided to users a practical toolkit for data preprocessing and exploration, enabling users to work more effectively with csv datasets in various projects."
Currently, the code checks if a file is in CSV format but doesn’t handle other potential issues, such as file access permissions or data inconsistencies. Implementing comprehensive error handling would make the package more robust.
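The kind of error handling suggested above might look like the following minimal sketch. It uses only the standard library; the function name `load_csv_rows` and the specific checks are assumptions made for illustration and are not taken from the package.

```python
import csv
from pathlib import Path


def load_csv_rows(path):
    """Hypothetical helper: read a CSV file and fail with a specific error.

    Not part of the package; the name and the checks are illustrative only.
    """
    p = Path(path)
    if p.suffix.lower() != ".csv":
        raise ValueError(f"Expected a .csv file, got: {p.name}")
    try:
        with p.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
    except FileNotFoundError:
        raise FileNotFoundError(f"File does not exist: {p}") from None
    except PermissionError:
        raise PermissionError(f"No read permission for: {p}") from None
    if not rows:
        raise ValueError(f"File is empty: {p}")
    # Treat the first row as the header and require every row to match its width.
    width = len(rows[0])
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != width:
            raise ValueError(
                f"Row {row_no} has {len(row)} fields, expected {width}"
            )
    return rows
```

Raising a distinct exception per failure mode lets callers tell a wrong file type apart from an unreadable file or a malformed row.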
If the numerical results from the get_summary_statistics function could be rounded to a consistent number of decimal places, it would make the output more standardized and easier to read.
Added a default of 3 decimal places to get_summary_statistics(): 449c218
Allow users to customize key parameters (such as how many decimals to keep in the results) through configuration files or command-line arguments. This enhances the tool’s flexibility for various use cases.
Added the number of decimal places as an argument to get_summary_statistics(): 449c218
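For illustration, a rounding argument with a default of 3 could be sketched as below. This is not the package's actual get_summary_statistics() implementation; the function name `summary_statistics_sketch`, the parameter name `decimals`, and the use of pandas `describe()` are all assumptions.

```python
import pandas as pd


def summary_statistics_sketch(df, decimals=3):
    """Illustrative stand-in, not the package's get_summary_statistics()."""
    # Keep numeric columns only, since the statistics below need numbers.
    numeric = df.select_dtypes(include="number")
    # describe() yields count, mean, std, min, quartiles and max per column.
    return numeric.describe().round(decimals)


# Usage: the default rounds to 3 decimal places; callers can override it.
example = pd.DataFrame({"x": [1, 2, 4], "label": ["a", "b", "c"]})
print(summary_statistics_sketch(example))
print(summary_statistics_sketch(example, decimals=1))
```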
It would be great if the function results included more explanation of the output. For example, the missing_values_summary function could add a sentence summarizing which column has how many missing values and what percentage that represents, rather than just presenting a simple table. This functionality could also be achieved by adding optional parameters to the function. Please check the comment below for this suggestion.
The summary statistics returned by get_summary_statistics() should probably have fewer significant digits (round to 2 or 3 decimal places instead)
Added a decimal places argument to get_summary_statistics(): 449c218
There are some minor typos in the README, e.g. "for in_depth research" (should be "in-depth"). It would be a good idea to go through all of the documentation carefully
Response to the review comment "It would be great if the function results included more explanations about the output. For example, the missing_values_summary function could add a sentence summarizing which column has how many missing values and what percentage that represents, rather than just presenting a simple table. This functionality can also be achieved by adding optional parameters to the function.":
Thank you for the suggestion! Our goal for missing_values_summary is to provide a structured output that is easy to interpret and use programmatically. We believe that keeping this function concise allows for better integration into data pipelines where a Pandas Series format is preferable. If there's a need for a more detailed explanation, we can consider adding a separate function for that, but we want to keep missing_values_summary simple and focused.
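A separate explanatory function of the kind mentioned in the response could be sketched roughly as follows. The name `describe_missing_values` and its output wording are hypothetical; the sketch takes a DataFrame directly and does not depend on how missing_values_summary is implemented.

```python
import pandas as pd


def describe_missing_values(df):
    """Hypothetical companion function; the name is an assumption."""
    total_rows = len(df)
    sentences = []
    # isna().sum() gives the number of missing entries per column.
    for column, count in df.isna().sum().items():
        if count == 0:
            continue  # Only report columns that actually have gaps.
        pct = count / total_rows * 100
        noun = "value" if count == 1 else "values"
        sentences.append(
            f"Column '{column}' has {count} missing {noun} "
            f"({pct:.1f}% of {total_rows} rows)."
        )
    return sentences
```

Keeping the sentences in a separate function leaves the structured return value available for pipelines while still offering a readable summary on request.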
Feedback from Milestone 1
missing_values_summary seems to return only count and percentage which can be easily a tuple or a Series at most.
get_summary_statistics: cc2dafc
missing_values_summary is missing majority of what it does. Examples show very little to what to expect
missing_values_summary function: 9187c8d, cf1cca9

Feedback from Milestone 2
get_summary_statistics: 5c05645, d932bfb
check_csv: b7d40b8, 0ca5ad8
check_missing_value_summary: a039574

Feedback from Peer Review - Forgive Agbesi

Feedback from Peer Review - Zhiwei Zhang
If the numerical results from the get_summary_statistics function could be rounded to a consistent number of decimal places, it would make the output more standardized and easier to read.
get_summary_statistics(): 449c218
get_summary_statistics(): 449c218

Feedback from Peer Review - Derek Rodgers
The summary statistics returned by get_summary_statistics() should probably have fewer significant digits (round to 2 or 3 decimal places instead)
get_summary_statistics(): 449c218