Chap job crashes due to memory issues #8

Open
eric-jm-lang opened this issue Jun 27, 2019 · 7 comments

@eric-jm-lang

Hi,
I am trying to run chap on a 50,000-frame trajectory. However, the job dies after about 1,000 frames, I believe because it fills up all the RAM available on my computer: I can see the amount of memory used increase until everything is consumed.

I have to run it with -dt 100 for the calculation to finish successfully, which means that I lose a significant portion of my MD data.

Is this due to a memory leak or another bug? Or does the program need to keep everything in memory?
Is there a workaround for this problem? For example, is it possible to force writing to disk instead of keeping everything in memory?

Many thanks in advance.

@Inniag Inniag self-assigned this Oct 24, 2019
@Inniag Inniag added the enhancement New feature or request label Oct 24, 2019
@Inniag
Collaborator

Inniag commented Oct 24, 2019

First of all, I should mention that analysing frames every 100 ps is likely not a big issue when determining time-averaged quantities over a long trajectory. In fact, you may even want to exclude frames that are only a short time apart in order to decorrelate the data. Still, the high memory demand for long trajectories is a problem I would like to solve, but it turns out to be complicated.

CHAP relies on libgromacs for parsing trajectories, which only reads one frame at a time (this is documented here). Thus, input trajectories are never kept in memory in their entirety. The problem therefore lies with the handling of output data. For this, CHAP uses the analysis data handling module provided by libgromacs. As far as I am aware, this module simply accumulates data over the entire trajectory and serialises it only at the end of the trajectory parsing process. I tried to work around this by manually serialising data after each frame (this is where the temporary output_stream.json file comes from), but I have not yet found a way to flush the AnalysisData container. Any help on this issue would be appreciated.

In terms of a workaround, you could run CHAP on individual trajectory chunks (first 10 ns, second 10 ns, etc.) and write a Python script to combine the output data. I would need to know which quantity you are after in order to judge how feasible this would be, but in principle, CHAP allows its users access to (nearly) all data produced internally.

@eric-jm-lang
Author

I appreciate your point regarding the decorrelation of the data; however, in my case I have a rarely hydrated region, so in order to get sufficient data about the hydration I wanted to try using more than one frame every 100 ps.

I understand the problem, and I am afraid I wouldn't be able to help... One thing that is not clear to me, however, is why the amount of memory used is far larger than the size of the trajectory itself: the memory consumption is over 60 GB after a few thousand frames analysed, for a trajectory totalling 7 GB. What could be so large as to use that much memory? Could there be a memory leak somewhere?

Indeed, I thought about doing this, but I don't know how to combine the JSON-formatted output files. I am performing what I would describe as a standard analysis (radius profile, solvent number density profiles, minimum solvent density, etc.), so I am interested in the pathwayProfile, pathwayScalarTimeSeries and pathwayProfileTimeSeries types of data. How easy would it be to combine the JSON files?

Many thanks!

@channotation
Owner

Combining the JSON files should be very straightforward if you have any prior experience in e.g. Python programming (similar scripting languages like R will work as well). A JSON file maps one-to-one onto a nested structure of Python lists and dictionaries. All you'd need to do is load all CHAP output files (each derived from its own trajectory chunk), extract the relevant data (see the documentation for details), and paste it together. Forming an average should then be straightforward with e.g. numpy.
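
For illustration, a minimal sketch of what that could look like (the file names and any JSON key not quoted in this thread are placeholders, so please check the documentation for the actual field names):

import json
import numpy as np

# One CHAP output file per trajectory chunk (file names are placeholders).
chunk_files = ["chunk_000.json", "chunk_001.json"]

chunks = []
for fname in chunk_files:
    with open(fname) as f:
        chunks.append(json.load(f))

# Each output file maps directly onto nested Python dicts/lists, e.g.:
s = np.array(chunks[0]["pathwayProfileTimeSeries"]["s"])

# A per-frame scalar ("someQuantity" is a placeholder key) can then be
# concatenated across chunks in time order and averaged with numpy:
# values = np.concatenate(
#     [np.array(c["pathwayScalarTimeSeries"]["someQuantity"]) for c in chunks]
# )
# print(values.mean())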

@eric-jm-lang
Author

Thanks. I use Python.
So I am trying to slice up my trajectory, run chap on trajectory chunks and then recombine them.
I am having a problem, however: the pathway is not always defined in the same way for each chunk (depending on the first frame, I suppose).
Here is an example:
I have two chunks analysed with CHAP, which returned two JSON files. After loading the data (as data1 and data2), here are the results for the "s" array:

np.array(data1["pathwayProfileTimeSeries"]["s"])

returns:
array([-4.34701633, -4.33732796, -4.32763958, ..., 5.31229877, 5.32198715, 5.33167553])

whereas
np.array(data2["pathwayProfileTimeSeries"]["s"])

returns
array([-4.38113976, -4.3716259 , -4.36211157, ..., 5.10431004, 5.11382389, 5.12333775])

How can this discrepancy in the pathway definition be solved? Is there a way to tell chap to use e.g. a pdb file to define the pathway? Or to specify a file containing the array of (unique) values of np.array(data1["pathwayProfileTimeSeries"]["s"]) as an input to CHAP?

Many thanks

@Inniag
Collaborator

Inniag commented Oct 30, 2019

TLDR: You need to use interpolation to ensure that the s-coordinate is the same for all trajectory chunks. NumPy already provides a library for this: https://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html

Here's why: CHAP always writes 1000 data points for each profile (reducing this number to something like 10 points with the -out-num-points flag may help you develop your script), spanning the entire range between the two openings of the pore. Since the pore may have slightly different lengths in different frames, this means that the spacing of these points along the centre line (Delta s) is not equal across frames. The alternative would be to always use the same Delta s, but that would mean a different number of points in each frame, which would make post-processing of the data even more complex (as it would lead to different array dimensions from a Python point of view).
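
As a sketch of what this could look like with the data1 and data2 objects from your example above (the profile key is a placeholder, and I'm assuming each profile is given as two parallel 1-D arrays of equal length; adapt this to the actual output structure):

import numpy as np

s1 = np.array(data1["pathwayProfileTimeSeries"]["s"])
s2 = np.array(data2["pathwayProfileTimeSeries"]["s"])
profile1 = np.array(data1["pathwayProfileTimeSeries"]["someProfile"])  # placeholder key
profile2 = np.array(data2["pathwayProfileTimeSeries"]["someProfile"])  # placeholder key

# Common grid restricted to the range covered by both chunks, so that no
# value has to be extrapolated beyond either profile's support.
s_common = np.linspace(max(s1.min(), s2.min()), min(s1.max(), s2.max()), 1000)

# np.interp expects increasing x-coordinates, which the s arrays already are.
profile1_common = np.interp(s_common, s1, profile1)
profile2_common = np.interp(s_common, s2, profile2)

# With both profiles on the same grid, averaging (or any other combination)
# becomes straightforward.
mean_profile = 0.5 * (profile1_common + profile2_common)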

One more comment: There is no need to create trajectory chunks. CHAP can selectively analyse only a specific time range using the flags -b and -e (both in picoseconds). That way you don't need to create trajectory chunk files that might be quite storage intensive.

@eric-jm-lang
Author

Thanks for the suggestion and for the tip on specifying a specific time range. I am afraid, however, that I fail to understand how to use numpy.interp in this case. Have you used this kind of script for this purpose before? Do you have any examples?
Alternatively, I was thinking of adding the same first frame at the beginning of each trajectory chunk, so that the profile would always be the same, and then removing those frames based on their indices in the numpy arrays... It sounds convoluted, but at this stage it looks more straightforward to me than using interpolation.

@eric-jm-lang
Author

Also, the amount of RAM currently required to process my trajectories every 100 ps seems to increase linearly with the number of frames. I expect that if I wanted to process my full trajectory of 19 GB (corresponding to a 1.5 us trajectory with frames saved every 10 ps), I would require more than 1 TB of RAM!
I am pretty sure that I have used programs that rely on libgromacs before, but I have never encountered one that requires such a large amount of RAM. Do you think that something is saved in memory during the analysis and not cleared after being used? Or that something is saved multiple times in memory?
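
As a rough back-of-the-envelope check of that extrapolation (the numbers below are only the approximate figures quoted earlier in this thread, not measurements):

frames_analysed = 3000           # "a few thousand" frames
memory_used_gb = 60.0            # RAM in use at that point
total_frames = 1_500_000 // 10   # 1.5 us, one frame every 10 ps -> 150,000 frames

gb_per_frame = memory_used_gb / frames_analysed
projected_gb = gb_per_frame * total_frames
print(f"~{gb_per_frame * 1000:.0f} MB per frame, ~{projected_gb / 1000:.1f} TB projected")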
