Scoring scenario + refinement scenario + alascan page #140

Merged
merged 40 commits into from
Oct 2, 2024

sverhoeven
Member

@sverhoeven sverhoeven commented Aug 26, 2024

Fixes #138
Fixes #139
Fixes #143

To test, upload https://github.com/haddocking/haddock3/blob/main/examples/scoring/data/T161-rescoring-ens.pdb

TODO

  • latest haddock3 is used, with no 'The following error occurred 1' errors
  • alascan report: it generates a plot per cluster; there should be a page that shows all clusters
  • show alascan score in 3D?
  • show score in 3D
  • show scores of molecules without clustering? look at 2_caprieval module output
  • allow user to adjust clustering?
  • include caprieval score without clustering?
  • scoring page
  • refinement page
  • handle non-easy parameters in workflow.cfg, see comment below

Scoring Scenario page
[screenshot: localhost_3000_scenarios_scoring]

Scoring Scenario result
[screenshot: localhost_3000_jobs]

Refinement scenario page
[screenshot: 0 0 0 0_3000_jobs_9_browse (2)]

Refinement scenario result
[screenshot: 0 0 0 0_3000_jobs_9_browse]

Alascan page
[screenshot: 0 0 0 0_3000_jobs_5_analysis_alascan_8_1]

@sverhoeven sverhoeven changed the title from "Scoring scenarion" to "Scoring scenario" Aug 26, 2024
@sverhoeven
Member Author

@VGPReys could you have a look at the screenshots?

And have a look at the constructed workflow at

# ===================================================================================
# CAPRI Scoring example
# ===================================================================================
# The Critical Assessment of PRedicted Interactions (CAPRI) experiment
# aims to test methods that model macromolecular interactions in
# blind predictions based on the three-dimensional structures of proteins.
# For more information, please visit: https://www.ebi.ac.uk/pdbe/complex-pred/capri/
# ===================================================================================
run_dir = "${JOB_OUTPUT_DIR}"
# molecules to be scored (an ensemble PDB)
molecules = ${molecules}
# ===================================================================================
[topoaa]
[emscoring]
[clustfcc]
min_population = 2
[seletopclusts]
top_cluster = 1
top_models = 2
[mdscoring]
[clustfcc]
min_population = 2
[seletopclusts]
[caprieval]
[alascan]
# ===================================================================================
`;
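A minimal sketch of how the `${JOB_OUTPUT_DIR}` and `${molecules}` placeholders in the workflow above might be filled in. It uses Python's `string.Template`, which happens to share the `${name}` syntax; the webapp's actual rendering code is not shown in this thread. The function `render_workflow`, the shortened template, and the example values are hypothetical, and writing `molecules` as a double-quoted list is an assumption about what the workflow expects.

```python
import json
from string import Template

# Shortened, hypothetical copy of the workflow above; only the
# placeholders matter for this illustration.
WORKFLOW_TEMPLATE = Template(
    'run_dir = "${JOB_OUTPUT_DIR}"\n'
    "molecules = ${molecules}\n"
    "[topoaa]\n"
    "[emscoring]\n"
)


def render_workflow(job_output_dir: str, molecules: list[str]) -> str:
    """Fill both placeholders; molecules is written as a quoted list."""
    return WORKFLOW_TEMPLATE.substitute(
        JOB_OUTPUT_DIR=job_output_dir,
        # json.dumps produces a double-quoted list such as ["a.pdb", "b.pdb"].
        molecules=json.dumps(molecules),
    )


print(render_workflow("output", ["T161-rescoring-ens.pdb"]))
```

`Template.substitute` raises `KeyError` when a placeholder is left unfilled, so a forgotten value fails loudly instead of leaving `${...}` in the generated file.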

@sverhoeven sverhoeven marked this pull request as ready for review August 27, 2024 11:20
@sverhoeven
Member Author

@VGPReys

  1. How could I include scores of unclustered molecules? By adding caprieval before clustfcc?
  2. Should a user be allowed to set clustering parameters? If so, which ones and within what range?

@VGPReys

VGPReys commented Aug 27, 2024

@sverhoeven
For the workflow, I would rather go for something like:

# ===================================================================================
# CAPRI Scoring example
# ===================================================================================
# The Critical Assessment of PRedicted Interactions (CAPRI) experiment
# aims to test methods that model macromolecular interactions in
# blind predictions based on the three-dimensional structures of proteins.
# For more information, please visit: https://www.ebi.ac.uk/pdbe/complex-pred/capri/
# ===================================================================================
run_dir = "${JOB_OUTPUT_DIR}"

# molecules to be scored (an ensemble PDB)
molecules = ${molecules}

# ===================================================================================
[topoaa]
tolerance = 10

[emscoring]
tolerance = 10

[caprieval]

[clustfcc]
min_population = 1
plot_matrix = true

[seletopclusts]
top_models = 4

[alascan]

[caprieval]

# ===================================================================================

  • removing [mdscoring] as it is too costly.
  • adding a caprieval after emscoring (related to your point 1)
  • removing top_cluster = 1, so we retrieve all clusters.
  • adding tolerance to make sure the workflow will terminate successfully.
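The changes listed above can be checked mechanically. The following is only a sketch: `module_order` and the regular expression are hypothetical helpers, not part of haddock3 or the webapp. A regex is used instead of `configparser` because the workflow repeats the `[caprieval]` section name, which `configparser` rejects in its default strict mode.

```python
import re

# Matches a module header such as "[caprieval]" on its own line,
# allowing leading or trailing spaces or tabs.
MODULE_HEADER = re.compile(r"^[ \t]*\[([A-Za-z0-9_]+)\][ \t]*$", re.MULTILINE)

# Module sections of the proposed workflow, without the comment banner.
PROPOSED = """
[topoaa]
tolerance = 10
[emscoring]
tolerance = 10
[caprieval]
[clustfcc]
min_population = 1
plot_matrix = true
[seletopclusts]
top_models = 4
[alascan]
[caprieval]
"""


def module_order(cfg_text: str) -> list[str]:
    """Return module names in the order they appear, repeats included."""
    return MODULE_HEADER.findall(cfg_text)


order = module_order(PROPOSED)
# A caprieval step must come before clustering so unclustered models get scored.
assert order.index("caprieval") < order.index("clustfcc")
# mdscoring was dropped because it is too costly.
assert "mdscoring" not in order
print(order)
```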
2. Should a user be allowed to set clustering parameters? If so, which ones and how much?

Yes, a user should be able to tune clustering parameters.
I would expose min_population and clust_cutoff,
and maybe also top_models and top_cluster from [seletopclusts].
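A rough sketch of how user-supplied clustering values could be validated before being written into the `[clustfcc]` section. `ClusteringOptions` is a hypothetical class, and both the bounds it enforces and the example value 0.6 are assumptions for illustration, not limits or defaults taken from haddock3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteringOptions:
    """User-tunable values; the bounds below are assumptions, not haddock3 rules."""

    min_population: int
    clust_cutoff: float

    def validate(self) -> None:
        if self.min_population < 1:
            raise ValueError("min_population must be at least 1")
        # Assumed: clust_cutoff is a fraction, so keep it within (0, 1].
        if not 0.0 < self.clust_cutoff <= 1.0:
            raise ValueError("clust_cutoff must be in (0, 1]")

    def to_cfg(self) -> str:
        """Render a [clustfcc] section after validating the values."""
        self.validate()
        return (
            "[clustfcc]\n"
            f"min_population = {self.min_population}\n"
            f"clust_cutoff = {self.clust_cutoff}\n"
        )


# Example values only; 0.6 is illustrative.
print(ClusteringOptions(min_population=1, clust_cutoff=0.6).to_cfg())
```

The same pattern would extend to `top_models` and `top_cluster` from `[seletopclusts]` if those are exposed as well.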

@amjjbonvin

amjjbonvin commented Aug 27, 2024 via email

@VGPReys

VGPReys commented Aug 27, 2024

I was thinking that it could bring interesting insights into the key components of the interaction.
It is not required, though.
Also, [contactmap] could be added for an even more detailed analysis of the scoring set.
You have the last word!

@VGPReys

VGPReys commented Aug 27, 2024

@sverhoeven
Also, this scenario could handle multiple input files (up to 20).
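A small sketch of the upload check this suggestion implies. Only the limit of 20 comes from the comment above; `check_uploads` is a hypothetical helper, and the rules that at least one file is required and that only `.pdb` files are accepted are assumptions.

```python
from pathlib import PurePath

MAX_INPUT_FILES = 20  # limit suggested in the comment above


def check_uploads(filenames: list[str], max_files: int = MAX_INPUT_FILES) -> list[str]:
    """Return the filenames unchanged if they pass the hypothetical checks."""
    if not filenames:
        raise ValueError("at least one PDB file is required")
    if len(filenames) > max_files:
        raise ValueError(f"got {len(filenames)} files, at most {max_files} allowed")
    for name in filenames:
        # Assumed rule: only .pdb uploads are accepted.
        if PurePath(name).suffix.lower() != ".pdb":
            raise ValueError(f"{name!r} is not a .pdb file")
    return filenames


print(check_uploads(["T161-rescoring-ens.pdb"]))
```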

@VGPReys

VGPReys commented Aug 27, 2024

OK @sverhoeven, after some discussion with Alex, we decided to:

  • remove both [alascan] and [contactmap] from the scoring scenario, as they may generate too many plots and take too much time (we can expect users to provide thousands of models)

But those two modules will be part of the refinement scenario.
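To illustrate the decision above, here is a sketch that strips named modules, together with their parameters, from a workflow text. `drop_modules` is a hypothetical helper and `example_param` is a placeholder, not a real haddock3 parameter. One known limitation: a comment line placed directly before a module header is treated as part of the preceding section.

```python
import re

# A module header such as "[alascan]" on its own line.
HEADER = re.compile(r"^[ \t]*\[([A-Za-z0-9_]+)\][ \t]*$")


def drop_modules(cfg_text: str, unwanted: set[str]) -> str:
    """Remove every section whose name is in `unwanted`, parameters included."""
    kept = []
    skipping = False
    for line in cfg_text.splitlines():
        match = HEADER.match(line)
        if match:
            # A new section starts: decide whether to skip it entirely.
            skipping = match.group(1) in unwanted
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


workflow = (
    "[topoaa]\n"
    "tolerance = 10\n"
    "[alascan]\n"
    "example_param = 1\n"  # placeholder, not a real alascan parameter
    "[caprieval]\n"
)
print(drop_modules(workflow, {"alascan", "contactmap"}))
```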

@sverhoeven
Member Author

OK, so refinement is another scenario, which is similar to https://github.com/haddocking/haddock3/blob/main/examples/refine-complex/refine-complex-test.cfg . I will create an issue for it.

@sverhoeven
Member Author

Ah yes, without mdscoring it takes 18 seconds for https://github.com/haddocking/haddock3/blob/main/examples/scoring/data/T161-rescoring-ens.pdb

@sverhoeven sverhoeven mentioned this pull request Aug 27, 2024
@sverhoeven sverhoeven changed the title from "Scoring scenario" to "Scoring scenario + alascan page" Aug 28, 2024
… and type with same name, but different shape
@sverhoeven
Member Author

For now I picked 1 for what and 3 for where, but this gives users at the easy level worse results.

@amjjbonvin

I would go for 2 and 1 :-)

@sverhoeven sverhoeven changed the title from "Scoring scenario + alascan page" to "Scoring scenario + refinement scenario + alascan page" Sep 6, 2024
@sverhoeven
Member Author

Some other PRs were merged into this branch; please review by focusing on the functionality of this PR, not the other PRs.

@sverhoeven
Member Author

Sorry, the review is taking too long, so I am merging it; the review can be done on the merged PR if needed.

@sverhoeven sverhoeven merged commit d3dfc7b into main Oct 2, 2024
5 checks passed
@sverhoeven sverhoeven deleted the scoring branch October 2, 2024 08:27

github-actions bot commented Oct 2, 2024

Please delete the images belonging to this Pull Request as they are no longer useful.

Go to the versions page of each image, find the version called pr-140, filter on tagged versions by clicking the "N tagged" button in the table header, click on the ... button and select Delete version.

Successfully merging this pull request may close these issues.

Refine scenario link to alascan html files Scoring scenario
3 participants