- Remove all camel-case function/variable names; change to lowercase with underscores.
- Reorder amino acids alphabetically.
- Add normalize to Moran and Geary autocorrelation.
- For quasi-sequence order, you should be able to pass in the name of the distance matrix file with or without .json.
- For each descriptor, check for valid amino acids in the sequence; if any are invalid, raise a custom error.
- Add dimensions of each descriptor to readme and docs.
- Round SOCN to 3 d.p.
- Unit test the data type for each column in all descriptors.
- Remove .empty tests from unit tests as validating shape of DF will test for emptiness.
- Zero-pad single-digit descriptor column numbers, e.g. polarizability_CTD_C_1 -> polarizability_CTD_C_01.
- Change sec_struct to secondary_struct.
- Reduce number of tests by iterating over list of protein seqs.
- Test dtypes of output dataframe -> test_autocorrelation
- Add shape to comment on testing shape unittests.
- Mention lag is similar to gap between 2 amino acids.
- Go through test_quasi file, double checking correct values.
- Append distance matrix to SOCN & Quasi columns, SW or G.
- Change quasi sequence order -> sequence_order.
- Calculate all SOCN, for both matrices, append to single output df.
- SOCN done, quasi done.
- Reread descriptor comments and explanations.
- Change SOCNUm to SOCN.
- Pseudo AAC has to explicitly use hydrophobicity, hydrophilicity and side-chain/residue mass values. Can't find corresponding values in aaindex, so just hard-code them in.
- If no properties are input to the pseudo or amphiphilic composition funcs, then use hydrophobicity, hydrophilicity and residue mass by default. Accept a list of aaindex1 codes; if a str is input, cast it to a list.
- Uppercase sequence on input, remove whitespace.
- Move references to top of each module.
- Input property in CTD funcs can be used with closeness function.
- Double check functions that use aa_composition values, aa_comp func returns series rather than dict.
- Rather than iterate over range of lags, use different lag in each sequence test.
- Change max_lag to lag.
- Create demo on Notebook.
- Add descriptor abbreviations to each function's comments, change abbreviation of Pseudo AAComp -> PAAComp.
- Add references to readme text.
- In readme, add output of each function below its usage.
- Add reference numbers to comments in descriptor functions - double check existing ones are correct.
- Add lag and weight param validation to sequence order module.
- Change QSOrder to QSO.
- Rewrite APAAComp descriptor comments to mention its dimensions change with lamda.
- For all functions that take a lag, raise a ValueError if an int can't be parsed from the input lag: try: lag = int(lag) except (TypeError, ValueError): raise ValueError("Invalid lag value input, integer cannot be parsed from {}".format(lag))
- Add logo/image to main readme.
- Add emojis to readme.
- Add releases.
- Change hydrophobicity_CTD_T_13 to CTD_T_13_hydrophobicity.
- Python unit tests using ctd with 1 property, and using all properties, check dimensions - 21 vs 147 (147/21=7). 21 dimensions per property. 3 C, 3 T, 15 D.
- Add output dimensions to SOCN functions.
- def sequence_order_coupling_number() - dimension (1,lag). def sequence_order_coupling_number_all() - dimension (1,lag*2)
- def quasi_sequence_order() - dimension (1,lag). def quasi_sequence_order_all() - dimension (1,lag*2)
- Fix CircleCI and add CircleCI badge to readme, double check workflow.
- Add codecov, use pySAR repo as an example.
- Add references to each descriptor comments.
- Change all comment underlining from "------" to "=======".
- In 'Parameters' and 'Returns', remove the extra space around the colon.
- Read over code.
- Create demo
- Add equations of descriptors to markdown file. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0554-8#Sec10 - https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-8-300
- Remove python 3.7 references, 3.8 minimum.
- In descriptor comments, dimensions of output should be 1 x N rather than N x 1 (N=# of features).
- Remove biopython from requirements & setup.py - required for testing.
- https://github.com/gadsbyfly/PyBioMed/blob/master/PyBioMed/doc/Descriptor/PyBioMed%20Protein.pdf
- Sequence order can accept just Schneider-Wrede or Grantham.
- Add link to medium article.
- readthedocs(https://github.com/MartinThoma/propy3/tree/master).
- https://chem.libretexts.org/Bookshelves/Organic_Chemistry/Organic_Chemistry_%28OpenStax%29/26%3A_Biomolecules-_Amino_Acids_Peptides_and_Proteins/26.10%3A_Protein_Structure
- Change physiochemical to physicochemical.
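
A few of the input-handling items above (uppercasing and stripping whitespace from the sequence, raising a custom error on invalid amino acids, parsing the lag, zero-padding single-digit column numbers) could be sketched roughly as below. This is only a minimal sketch: the names `AMINO_ACIDS`, `InvalidSequenceError`, `clean_sequence`, `parse_lag` and `pad_column_index` are hypothetical and are not taken from the package itself.

```python
import re

# The 20 canonical amino acid one-letter codes, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class InvalidSequenceError(ValueError):
    """Hypothetical custom error for sequences with non-canonical residues."""


def clean_sequence(sequence):
    """Uppercase the sequence, remove all whitespace and validate its residues."""
    cleaned = re.sub(r"\s+", "", str(sequence)).upper()
    invalid = sorted(set(cleaned) - set(AMINO_ACIDS))
    if invalid:
        raise InvalidSequenceError(
            "Invalid amino acids found in sequence: {}".format(", ".join(invalid))
        )
    return cleaned


def parse_lag(lag):
    """Return lag as an int, raising ValueError if no int can be parsed."""
    try:
        return int(lag)
    except (TypeError, ValueError):
        raise ValueError(
            "Invalid lag value input, integer cannot be parsed from {}".format(lag)
        )


def pad_column_index(column):
    """Zero-pad a trailing single-digit index, e.g. ..._C_1 -> ..._C_01."""
    return re.sub(r"_(\d)$", r"_0\1", column)
```

Catching only `TypeError` and `ValueError` (rather than a bare `except`) keeps unrelated failures such as `KeyboardInterrupt` from being swallowed by the lag check.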