Replicate runs #35

Open
avnikonenko opened this issue Dec 15, 2024 · 8 comments

Comments

@avnikonenko
Collaborator

          Hi, is there a way to make triplicate runs of the simulation via command line using streamd, or do we have to repeat the whole process thrice to run the triplicates of the same complex? Also, do you plan to include a feature for analysis of replicate trajectories? Thanks.

Originally posted by @Gsk468 in #32 (comment)

@avnikonenko
Collaborator Author

avnikonenko commented Dec 15, 2024

Hello!
@Gsk468, I have started a new issue to help people with the same question find it more easily.
Yes, each replicate has to be started with a separate command.
Use the -d argument so that each run is saved into its own directory. To save time, you can also reuse the already prepared files across the runs.
First, run run_md -l ligand.sdf -p protein.pdb --steps 1 -d mdrun1 to run only the preparation stage.
Then copy the mdrun1 directory to mdrun2 and mdrun3, so that each working directory contains all the prepared files. After that, start each run separately: run_md -l ligand.sdf -p protein.pdb -d mdrun1, run_md -l ligand.sdf -p protein.pdb -d mdrun2, run_md -l ligand.sdf -p protein.pdb -d mdrun3. You can also use the --seed argument to control the starting velocities: -1 gives a random seed (the StreaMD default), while specific values let you give the runs different or identical seeds.
Then, if you want the overall trajectory convergence analysis file for all runs (rmsd_mean_std_time-ranges.html), first reinstall StreaMD from the newest GitHub version (StreaMD version >= 0.2.9) :), then run run_md --wdir_to_continue mdrun*/md_files/md_run/* --steps 4. This redoes the analysis step and creates the rmsd_mean_std_time-ranges_some-time.html file, in which every point carries its input_directory information in the interactive table.

We haven't planned any additional analysis yet. But if you have ideas about what would be useful for replicate analysis, or would even like to contribute, your input would be appreciated!
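The prepare-once, copy, then run-each-replicate workflow described in this comment could be scripted roughly as follows. This is only a sketch: the function names replicate_commands and run_replicates, the mdrun directory prefix default, and the dry_run switch are illustrative and not part of StreaMD. Only the run_md arguments (-l, -p, -d, --steps, --seed) are taken from this thread.

```python
import shlex
import shutil
import subprocess


def replicate_commands(ligand, protein, n_replicates=3, prefix="mdrun", seeds=None):
    """Build the run_md command lines quoted in the thread.

    Returns the replicate directories, the preparation-only command for the
    first directory, and one full-run command per replicate directory.
    """
    dirs = [f"{prefix}{i}" for i in range(1, n_replicates + 1)]
    base = ["run_md", "-l", ligand, "-p", protein]
    prepare = base + ["--steps", "1", "-d", dirs[0]]
    runs = []
    for i, directory in enumerate(dirs):
        cmd = base + ["-d", directory]
        if seeds is not None:
            # Assumption: one explicit seed per replicate; omit to keep the default (-1).
            cmd += ["--seed", str(seeds[i])]
        runs.append(cmd)
    return dirs, prepare, runs


def run_replicates(ligand, protein, n_replicates=3, seeds=None, dry_run=True):
    """Prepare once, copy the prepared directory, then run every replicate."""
    dirs, prepare, runs = replicate_commands(ligand, protein, n_replicates, seeds=seeds)
    steps = [prepare] + runs
    if dry_run:
        # Only print what would be executed; nothing is run or copied.
        print(shlex.join(prepare))
        for directory in dirs[1:]:
            print(f"copy {dirs[0]} -> {directory}")
        for cmd in runs:
            print(shlex.join(cmd))
        return steps
    subprocess.run(prepare, check=True)
    for directory in dirs[1:]:
        # copytree raises an error if the target directory already exists.
        shutil.copytree(dirs[0], directory)
    for cmd in runs:
        subprocess.run(cmd, check=True)
    return steps


if __name__ == "__main__":
    run_replicates("ligand.sdf", "protein.pdb", dry_run=True)
```

With dry_run=True the script only prints the planned commands, which makes it easy to check them before starting long simulations. The final analysis command from the comment relies on the shell expanding mdrun*/md_files/md_run/*, so it is simplest to run that one directly in the shell.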

@Gsk468

Gsk468 commented Dec 24, 2024

Hi, thanks for including the request for triplicate analysis. I tried it, and it ran well. However, no html file is generated; only a csv file is generated for RMSD, in which the backbone, active site and ligand RMSD values of each replicate run are listed one below another, making it difficult to plot. Any recommendations? I was wondering whether plots of the average RMSD, RMSF, Rg and pocket volume with SD could be included for the triplicate runs.

@avnikonenko
Collaborator Author

Hello!
Did you run run_md --wdir_to_continue mdrun*/md_files/md_run/* --steps 4, and did you reinstall StreaMD beforehand? I am surprised that there is no html file in the working directory. This command should produce an html and a csv file covering all directories, with a "directory" column in the output csv that helps separate the different runs.
And yes, the html shows the average and SD values for RMSD only. I can add plots of the average and SD values for RMSF and Rg in the future. But I am not sure about the pocket volume: do you have any suggestions on how to perform such an analysis? I took a quick look and so far couldn't find any options in MDAnalysis, only external tools. I am also not sure whether I have time for it now, but it would be good to have as a future direction.
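Until the html report is available, the stacked csv can be summarised by hand by grouping rows on the "directory" column. The sketch below uses only the Python standard library and computes the mean and sample SD across runs at each time point. Only the "directory" column name comes from this thread; the "time" and "backbone" column names and the toy rows are assumptions made for illustration, so check the header of the real file first.

```python
import csv
import io
import statistics
from collections import defaultdict


def mean_sd_across_runs(handle, value_col, time_col="time", run_col="directory"):
    """Return {time: (mean, sd, n_runs)} for value_col across replicate runs.

    handle is an open text file (or any file-like object) holding the csv.
    time_col and value_col are hypothetical names; adjust them to the file.
    """
    by_time = defaultdict(list)
    for row in csv.DictReader(handle):
        if not row[run_col] or row[value_col] == "":
            continue  # skip rows without a run label or without a value
        by_time[float(row[time_col])].append(float(row[value_col]))
    summary = {}
    for t in sorted(by_time):
        values = by_time[t]
        # The sample SD needs at least two replicates; report 0.0 otherwise.
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[t] = (statistics.mean(values), sd, len(values))
    return summary


# Toy data, made up for this example (not real StreaMD output).
toy_csv = """time,directory,backbone
0.0,mdrun1,1.0
0.0,mdrun2,2.0
0.0,mdrun3,3.0
1.0,mdrun1,2.0
1.0,mdrun2,2.0
1.0,mdrun3,2.0
"""

for t, (mean, sd, n) in mean_sd_across_runs(io.StringIO(toy_csv), "backbone").items():
    print(f"t={t}: mean={mean:.2f} sd={sd:.2f} n={n}")
```

For a real file, pass an open file handle instead of io.StringIO, for example open("rmsd_all_systems_24-12-2024-11-38-26.csv", newline=""). The resulting per-time mean and SD can then be plotted with any plotting tool.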

@Gsk468

Gsk468 commented Dec 25, 2024

Yes, I used the same command, run_md --wdir_to_continue mdrun*/md_files/md_run/* --steps 4, and got only rmsd_all_systems_24-12-2024-11-38-26.csv in the running folder, plus csv and png files in the replicate folders, but no html file. As for the pocket volume, I recently prepared (using AI tools) a script that computes the tunnel/pore volume of the GABA receptor using MDAnalysis. The script can be accessed here: https://drive.google.com/file/d/1-L_iBSdInYPFx6Xg6y9a2-eX3u_Xkd80/view?usp=sharing

@avnikonenko
Collaborator Author

Can you please attach the log file (log____24-12-2024-11-38-26.log) here?

@Gsk468

Gsk468 commented Dec 25, 2024

log____24-12-2024-11-38-26.log
Attached is the file.

@avnikonenko
Collaborator Author

avnikonenko commented Dec 25, 2024

Thank you for providing the log!
For reasons that are unclear to me, your run did not finish completely. There are no errors, but the final lines stating that the run completed are also missing. These 2 lines indicate that the run has finished:
root - INFO: Analysis of MD simulations of 3 complexes were successfully finished
Successfully finished complexes have been saved in finished_complexes_24-12-2024-11-38-26.txt file
You also don't seem to have rmsd_mean_std_time-ranges_24-12-2024-11-38-26.csv (nor, as expected, the html), which are created after the RMSD average/SD analysis (the rmsd_all_systems_24-12-2024-11-38-26.csv file is created before that analysis).
Could you please rerun the same command and check again whether you get the html and these 2 final lines in the new log file? If not, please send me the new log file.
I have just checked a test run on my side, including the log and output, and everything worked as expected.
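The check for the two final log lines can be automated with a small script. This is a sketch only: the marker strings are copied from the lines quoted in this comment, and it is an assumption that their wording is the same in other StreaMD versions. The function name missing_completion_markers is illustrative.

```python
from pathlib import Path

# Substrings taken from the two completion lines quoted in the comment.
# Assumption: the wording is stable across StreaMD versions.
COMPLETION_MARKERS = (
    "were successfully finished",
    "Successfully finished complexes have been saved in",
)


def missing_completion_markers(log_text, markers=COMPLETION_MARKERS):
    """Return the markers that do not occur in log_text (empty list = complete)."""
    return [marker for marker in markers if marker not in log_text]


def check_log_file(path):
    """Read a log file and report whether both completion lines are present."""
    missing = missing_completion_markers(Path(path).read_text(errors="replace"))
    if missing:
        print(f"{path}: run looks incomplete, missing: {missing}")
    else:
        print(f"{path}: both completion lines found")
    return not missing
```

For example, check_log_file("log____24-12-2024-11-38-26.log") would report whether the attached log contains both lines.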

@avnikonenko
Collaborator Author

Special thanks for sharing the script! I will take a look later and see if it would be easy to add here.
