Pre-computed files need to be regenerated for each set of parameters #16

Open
shervinea opened this issue Sep 10, 2021 · 0 comments

Context. Parsing PDB files in real time with the Biopython package, typically with a call such as:

self.structure = PDBParser().get_structure(pdb_id.upper(), fullfilename)
is expensive and bottlenecks the training process if done on the fly.

For this reason, we put in place a "precomputation stage"

def check_precomputed(self) -> None:
that processes all enzymes beforehand and stores the resulting target volumes in a dedicated folder.

Current limitation. This process is repeated for each set of parameters {weights considered, interpolation level between atoms p, volume size}. This is inefficient in terms of:

  • total computations performed: PDB parsing is identical for all of these configurations, yet it is repeated for each of them. The only operations that actually depend on the configuration are relatively cheap, e.g. 2D -> 3D mapping and point interpolation. With a proper implementation, these last steps can easily be done on the fly without becoming a bottleneck.
  • space: the number and total size of the files produced grow at the same pace as the number of configurations that the user tries out (!).
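
To give a sense of how cheap the configuration-dependent steps are, here is a minimal sketch of point interpolation, assuming that "interpolation level p" means inserting p evenly spaced points between each pair of consecutive atoms. The function name `interpolate_points` and the tuple-based data layout are hypothetical and not taken from the repository.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def interpolate_points(coords: List[Point], p: int) -> List[Point]:
    """Insert p evenly spaced points between each pair of consecutive coordinates.

    Assumption: atoms are given in chain order, so "consecutive" is meaningful.
    """
    if p < 0:
        raise ValueError("p must be non-negative")
    if not coords:
        return []
    out: List[Point] = []
    for a, b in zip(coords, coords[1:]):
        out.append(a)
        for k in range(1, p + 1):
            t = k / (p + 1)  # fraction of the way from a to b
            out.append(tuple(a[i] + t * (b[i] - a[i]) for i in range(3)))
    out.append(coords[-1])  # the last atom has no successor
    return out
```

The cost is linear in the number of atoms times p and needs no file access, which is why it is a plausible candidate for running at batch-generation time.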

Desired behavior. Precomputation of coordinates and weights from PDB files is done only once and produces a parsed version of the data that is:

  1. Light enough that it can be transformed into target volumes on the fly.
  2. Complete enough that the data for every configuration can be derived from it.
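
One possible shape for this, as a rough sketch rather than a proposed implementation: cache a single parameter-independent record per enzyme (coordinates plus every available weight type), and derive the volume for a given configuration at load time. Everything named here is an assumption made for illustration: `load_parsed`, `to_volume`, the JSON layout, the `parse_fn` callback standing in for the Biopython parsing step, and the sparse dict used as a volume.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Tuple


def load_parsed(pdb_id: str, cache_dir: str, parse_fn: Callable[[str], dict]) -> dict:
    """Return the parsed record for pdb_id, calling parse_fn only on a cache miss.

    The cache key is the PDB id alone, so the file is shared by all configurations.
    """
    path = Path(cache_dir) / f"{pdb_id.upper()}.json"
    if path.exists():
        return json.loads(path.read_text())
    record = parse_fn(pdb_id)  # expensive step, done once per enzyme
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record))
    return record


def to_volume(record: dict, weight_name: str, vol_size: int,
              resolution: float = 1.0) -> Dict[Tuple[int, int, int], float]:
    """Map atoms into a cubic grid of side vol_size, centred on the mean coordinate.

    Returns a sparse volume: voxel index -> accumulated weight.
    Atoms falling outside the grid are dropped.
    """
    coords = record["coords"]
    weights = record["weights"][weight_name]
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    half = vol_size // 2
    volume: Dict[Tuple[int, int, int], float] = {}
    for c, w in zip(coords, weights):
        idx = tuple(int(round((c[i] - center[i]) / resolution)) + half for i in range(3))
        if all(0 <= j < vol_size for j in idx):
            volume[idx] = volume.get(idx, 0.0) + w
    return volume
```

With this split, trying a new volume size or weight type changes only the arguments passed to `to_volume`; no new files are written and no PDB file is parsed again.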