Tm/synthetics #22
Draft
TimOliverMaier wants to merge 42 commits into main from tm/synthetics
+ This adds a new class `ProteomicsExperimentSample` to represent a sample being pushed through the proteomics setup
1. `ProteomicsExperimentSampleSlice` is considered the working bulk of data that is loaded as a dataframe for processing
2. `ProteomicsExperimentDatabaseHandle` is a wrapper class for SQL database management
1. python/proteolizardalgo/feature.py:
   + Class for charge distribution `ChargeProfile`
2. python/proteolizardalgo/hardware_models.py:
   + `LiquidChromatography`
     * support for `irt_to_rt` method
     * methods returning time interval (start, end) and center of frames
   + implemented `EMGChromatographyProfileModel`
   + implemented `NormalIonMobilityProfileModel`
3. python/proteolizardalgo/proteome.py:
   + method to make columns with `Profile` data types SQL compatible

TODO:
+ IonMobilityModel must support multiple charge states.
+ realistic parameter sampling
+ realistic `irt_to_rt` method -> must be provided by user
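To illustrate how an EMG (exponentially modified Gaussian) elution profile can be combined with per-frame (start, end) time intervals, here is a minimal sketch. It is not the code of `EMGChromatographyProfileModel`; the function names `emg_pdf` and `frame_abundances` and the parameters `mu`, `sigma`, `lam` are hypothetical and chosen for illustration only. The density is the standard EMG formula, and the per-frame fraction is approximated with a midpoint rule.

```python
import math


def emg_pdf(x: float, mu: float, sigma: float, lam: float) -> float:
    """Density of a Gaussian (mu, sigma) convolved with an exponential decay (rate lam)."""
    if sigma <= 0 or lam <= 0:
        raise ValueError("sigma and lam must be positive")
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    tail = math.erfc((mu + lam * sigma**2 - x) / (math.sqrt(2.0) * sigma))
    return 0.5 * lam * math.exp(exponent) * tail


def frame_abundances(frame_intervals, mu, sigma, lam, steps=200):
    """Approximate the fraction of the elution profile inside each (start, end) interval."""
    fractions = []
    for start, end in frame_intervals:
        width = (end - start) / steps
        # Midpoint rule: sample the density at the center of each sub-interval.
        area = width * sum(
            emg_pdf(start + (i + 0.5) * width, mu, sigma, lam) for i in range(steps)
        )
        fractions.append(area)
    return fractions


# Hypothetical example: 40 frames of 1 time unit each, peak apex near t = 10.
frames = [(float(t), float(t + 1)) for t in range(40)]
fractions = frame_abundances(frames, mu=10.0, sigma=1.0, lam=1.0)
```

The fractions sum to roughly one when the frames cover the whole peak, so they can be read as the relative abundance of a peptide in each frame.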
1. model params were null in sql table
   + This was due to np datatypes (not serializable)
   + now stored as python built-ins
2. charge profile is stored in peptides table
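The conversion from np datatypes to Python built-ins described above could look roughly like the following sketch. The helper name `to_builtin` and the example parameter dictionary are hypothetical, and JSON is assumed as the serialization format purely for illustration; the actual storage code in this PR may differ.

```python
import json

import numpy as np


def to_builtin(value):
    """Recursively convert NumPy scalars and arrays into plain Python objects."""
    if isinstance(value, np.generic):  # NumPy scalar, e.g. np.float32, np.int64
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, dict):
        return {str(key): to_builtin(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_builtin(item) for item in value]
    return value


# Hypothetical model parameters as they might come out of NumPy-based sampling.
params = {
    "sigma": np.float32(1.5),
    "k": np.int64(3),
    "weights": np.array([0.25, 0.75]),
}

# json.dumps(params) would raise TypeError here; the converted dict serializes cleanly.
encoded = json.dumps(to_builtin(params))
```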
1. changed hyphen - to underscore _ in sql columns
2. In experiment.py, added structure for TOF spectra assembly
3. replaced assertion with in averagine_generator concerning proper masses for averagine model
This adds a prototype end-to-end synthetics generator that returns a dictionary (frames) of dictionaries (scans) of `MzSpectrum` objects
orienting synthetics workflow on experiment
1. In chemistry.py new class `ChemicalCompound` with subclass `BufferGas` for handling of e.g. ion mobility gas properties.
   + ChemicalCompound gets elemental properties from new dependency ['mendeleev'](https://github.com/lmmentel/mendeleev)
2. CCS to ion mobility / reduced ion mobility is now handled within device class `IonMobilitySeparation`
3. Scan to ion mobility and reverse are now relying on converters defined by user
1. check for empty spectra
2. use `__repr__` for file output instead of json
1. Instead of repeatedly adding spectra, `.push` just copies the data and lets `to_resolution` do the sorting and adding once at the end.
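The deferred-merge idea can be sketched as follows. This is not the proteolizard implementation: the class `SpectrumAccumulator` and its signatures are hypothetical, and binning by rounding m/z to a fixed number of decimals is an assumption made for illustration.

```python
from collections import defaultdict


class SpectrumAccumulator:
    """Collects raw peaks cheaply and merges them only once, on demand."""

    def __init__(self):
        self._mz = []
        self._intensity = []

    def push(self, mz, intensity):
        """Copy the peaks into the buffer; no sorting or summing happens here."""
        if len(mz) != len(intensity):
            raise ValueError("mz and intensity must have the same length")
        self._mz.extend(mz)
        self._intensity.extend(intensity)

    def to_resolution(self, decimals):
        """Bin m/z by rounding, sum intensities per bin, return sorted lists."""
        bins = defaultdict(float)
        for mz, intensity in zip(self._mz, self._intensity):
            bins[round(mz, decimals)] += intensity
        merged_mz = sorted(bins)
        return merged_mz, [bins[mz] for mz in merged_mz]


acc = SpectrumAccumulator()
acc.push([100.001, 100.004], [1.0, 2.0])
acc.push([100.002, 200.0], [3.0, 4.0])
merged_mz, merged_intensity = acc.to_resolution(2)
```

Each `push` is a plain append, so the cost of sorting and summing is paid once per output spectrum rather than once per added spectrum.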
1. tf models use a lot of RAM, and a variety of approaches to free the RAM did not work. Running the model inference inside a child process worked.
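A minimal sketch of running inference in a child process, so that all memory held by the model is returned to the operating system when the child exits. The commit only states that a child process worked; the use of `multiprocessing`, the helper name `run_in_child`, and the stand-in function `fake_predict` are assumptions for illustration. In real use the inference function would import TensorFlow and load the model inside the child.

```python
import multiprocessing as mp


def _worker(conn, fn, args):
    """Run fn in the child and send back either the result or the error text."""
    try:
        conn.send(("ok", fn(*args)))
    except Exception as exc:  # report failures to the parent instead of hanging
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_in_child(fn, *args, start_method="spawn"):
    """Call fn(*args) in a separate process and return its (picklable) result."""
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child_conn, fn, args))
    proc.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        status, payload = parent_conn.recv()  # receive before join to avoid blocking
    finally:
        proc.join()
    if status == "error":
        raise RuntimeError(f"child process failed: {payload}")
    return payload


def fake_predict(values):
    """Stand-in for model inference; a real version would load the model here."""
    return [2 * v for v in values]
```

With the default `"spawn"` start method, the function passed in must be defined at module level, and the call should sit under an `if __name__ == "__main__":` guard, for example `run_in_child(fake_predict, [1, 2, 3])`.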
1. fixed bug in which sequence tokens were read as string
This pull request implements a platform for synthetic data generation in proteolizard.