Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hbond update #3848

Merged
merged 22 commits into from
Dec 7, 2022
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ examples/output.txt
# ignore Vagrant virtual machines
.vagrant
# ignore coverage files
.coverage
.coverage*
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
.noseids
htmlcov
# ignore trajectory offset caches
Expand Down
3 changes: 3 additions & 0 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ The rules for this file:
* 2.4.0

Fixes
* Update hbond analysis doc string to use exclusive bound language (#3847)
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
* Added isolayer selection method (Issue #3845)
* Auxiliary; determination of representative frames: Removed undefined
behaviour for cutoff values < -1 (PR # 3749)
Expand All @@ -31,6 +32,8 @@ Fixes


Enhancements
* Added ability for hbond anaylsis to use types when resnames aren't there
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
* Added explanatory warnings when hbond analysis doesn't find any hbonds
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
* AuxReaders now have a memory_limit parameter to control when memory usage
warnings are issued. (PR # 3749)
* MDAnalysis.units now has a lookup dictionary called MDANALYSIS_BASE_UNITS
Expand Down
145 changes: 116 additions & 29 deletions package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py
Original file line number Diff line number Diff line change
Expand Up @@ -174,6 +174,17 @@
protein-water and protein-protein hydrogen bonds will be found, but
no water-water hydrogen bonds.

One can now also define hydrogen bonds with atom types::
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved

import MDAnalysis
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
from MDAnalysis.analysis.hydrogenbonds.hbond_analysis import HydrogenBondAnalysis as HBA
hbonds = HBA(
universe=u,
donors_sel='type 2',
hydrogens_sel='type 1',
acceptors_sel='type 2',
)

In order to compute the hydrogen bond lifetime, after finding hydrogen bonds
one can use the :attr:`lifetime` function::

Expand Down Expand Up @@ -236,6 +247,7 @@
from MDAnalysis.lib.correlations import autocorrelation, correct_intermittency
from MDAnalysis.exceptions import NoDataError
from MDAnalysis.core.groups import AtomGroup
from MDAnalysis.analysis.hydrogenbonds.hbond_autocorrel import find_hydrogen_donors

from ...due import due, Doi

Expand All @@ -260,6 +272,9 @@ def __init__(self, universe,
"""Set up atom selections and geometric criteria for finding hydrogen
bonds in a Universe.

Hydrogen bond selections may be achieved with either a resname, atom
name combination, or when those are absent, atom types.

jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
Parameters
----------
universe : Universe
Expand Down Expand Up @@ -310,6 +325,8 @@ def __init__(self, universe,
information is used, as this is the only way that guarantees the
correct identification of donor-hydrogen pairs.

.. versionchanged:: 2.4.0
Added use of atom types
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
.. versionadded:: 2.0.0
Added `between` keyword
"""
Expand Down Expand Up @@ -367,14 +384,19 @@ def guess_hydrogens(self,
):
"""Guesses which hydrogen atoms should be used in the analysis.

Hydrogen selections may be achieved with either a resname, atom
name combination, or when those are absent, atom types.

jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
Parameters
----------
select: str (optional)
Selection string for atom group from which hydrogens will be identified.
max_mass: float (optional)
Maximum allowed mass of a hydrogen atom.
The mass of a hydrogen atom must be less than this value.
min_mass: float (optional)
The mass of a hydrogen atom must be greater than this value.
min_charge: float (optional)
Minimum allowed charge of a hydrogen atom.
The charge of a hydrogen atom must be greater than this value.

Returns
-------
Expand Down Expand Up @@ -407,24 +429,35 @@ def guess_hydrogens(self,
))
]

hydrogens_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(hydrogens_ag.resnames, hydrogens_ag.names)
]
)
if hasattr(hydrogens_ag,"resnames") and hasattr(hydrogens_ag,"names"):
hydrogens_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(hydrogens_ag.resnames, hydrogens_ag.names)
]
)
else:
hydrogens_list = np.unique(
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
[
'type {}'.format(tp) for tp in hydrogens_ag.types
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm only quickly going through this so I might be missing something obvious - doesn't this assume that all types are the same? Wouldn't this be a problem if you end up with a type where some of the atoms are outside the bounds of the allowable charges but not others?

My recommendation here would be to default to a unique attribute, like the ix value, instead of using something that's broad and could pick up stuff outside the intended atomgroup selection.

Copy link
Contributor Author

@jaclark5 jaclark5 Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IAlibay In most cases an atom type in LAMMPS is all identical with the exception of forcefields such as ReaxFF and COMB. Although you are right that LAMMPS allows for atoms to have individual charges so this could be an issue as forcefields develop...

Here's the issue, these structures look for a list of atom selection strings that represent categories, so if this was on an index basis, the list would be very long, and the resulting output would be useless (representing a single interaction). One could have this structure instead search the atoms of each type and make a long string that specifies the specific indices that meet the charge criteria. However, this string will be unnecessary computational time in 99% of cases, and in the 1% I would argue it would skew the results as these many-body forcefields will continue to fluctuate in charge throughout the simulation, as they are supposed to. I would argue counting "low" or "high" charges in that case wouldn't disrupt the intention of the calculation, it would result in a broken hydrogen bond with a more "realistic" decay time due to local atom environments.

]
)


return " or ".join(hydrogens_list)

def guess_donors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered donors in the analysis. Only use if the universe topology does not
contain bonding information, otherwise donor-hydrogen pairs may be incorrectly assigned.

Donor selections may be achieved with either a resname, atom
name combination, or when those are absent, atom types.
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved

Parameters
----------
select: str (optional)
Selection string for atom group from which donors will be identified.
max_charge: float (optional)
Maximum allowed charge of a donor atom.
The charge of a donor atom must be less than this value.

Returns
-------
Expand Down Expand Up @@ -455,31 +488,48 @@ def guess_donors(self, select='all', max_charge=-0.5):
hydrogens_sel = self.hydrogens_sel
hydrogens_ag = self.u.select_atoms(hydrogens_sel)

ag = hydrogens_ag.residues.atoms.select_atoms(
"({donors_sel}) and around {d_h_cutoff} {hydrogens_sel}".format(
donors_sel=select,
d_h_cutoff=self.d_h_cutoff,
hydrogens_sel=hydrogens_sel
# We're using u._topology.bonds rather than u.bonds as it is a million times faster to access.
# This is because u.bonds also calculates properties of each bond (e.g bond length).
# See https://github.com/MDAnalysis/mdanalysis/issues/2396#issuecomment-596251787
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
if (hasattr(self.u._topology, 'bonds') and len(self.u._topology.bonds.values) != 0):
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
donors_ag = find_hydrogen_donors(hydrogens_ag)
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
else:
ag = hydrogens_ag.residues.atoms.select_atoms(
"({donors_sel}) and around {d_h_cutoff} {hydrogens_sel}".format(
donors_sel=select,
d_h_cutoff=self.d_h_cutoff,
hydrogens_sel=hydrogens_sel
)
)
donors_ag = ag[ag.charges < max_charge]

if hasattr(donors_ag,"resnames") and hasattr(donors_ag,"names"):
donors_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(donors_ag.resnames, donors_ag.names)
]
)
else:
donors_list = np.unique(
[
'type {}'.format(tp) for tp in donors_ag.types if tp not in hydrogens_ag.types
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
]
)
)
donors_ag = ag[ag.charges < max_charge]
donors_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(donors_ag.resnames, donors_ag.names)
]
)

return " or ".join(donors_list)

def guess_acceptors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered acceptors in the analysis.

Acceptor selections may be achieved with either a resname, atom
name combination, or when those are absent, atom types.

jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
Parameters
----------
select: str (optional)
Selection string for atom group from which acceptors will be identified.
max_charge: float (optional)
Maximum allowed charge of an acceptor atom.
The charge of an acceptor atom must be less than this value.

Returns
-------
Expand All @@ -503,11 +553,18 @@ def guess_acceptors(self, select='all', max_charge=-0.5):

ag = self.u.select_atoms(select)
acceptors_ag = ag[ag.charges < max_charge]
acceptors_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(acceptors_ag.resnames, acceptors_ag.names)
]
)
if hasattr(acceptors_ag,"resnames") and hasattr(acceptors_ag,"names"):
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
acceptors_list = np.unique(
[
'(resname {} and name {})'.format(r, p) for r, p in zip(acceptors_ag.resnames, acceptors_ag.names)
]
)
else:
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should have the option to choose whether or not to default to types here, however for the sake of getting this done I'm going to let it go as-is here, and raise a separate issue to deal with later.

acceptors_list = np.unique(
[
'type {}'.format(tp) for tp in acceptors_ag.types
]
)

return " or ".join(acceptors_list)

Expand Down Expand Up @@ -540,7 +597,12 @@ def _get_dh_pairs(self):
# Otherwise, use d_h_cutoff as a cutoff distance
else:

hydrogens = self.u.select_atoms(self.hydrogens_sel)
if self.hydrogens_sel is None:
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
hydrogens_sel = self.guess_hydrogens()
else:
hydrogens_sel = self.hydrogens_sel

hydrogens = self.u.select_atoms(hydrogens_sel)
donors = self.u.select_atoms(self.donors_sel)
donors_indices, hydrogen_indices = capped_distance(
donors.positions,
Expand Down Expand Up @@ -622,6 +684,14 @@ def _single_frame(self):
return_distances=True,
)

if np.size(d_a_indices) == 0:
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
warnings.warn(
"No hydrogen bonds were found given d-a cutoff of {} between "\
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
"Donor, {}, and Acceptor, {}.".format(self.d_a_cutoff,
self.donors_sel,
self.acceptors_sel)
)

# Remove D-A pairs more than d_a_cutoff away from one another
tmp_donors = self._donors[d_a_indices.T[0]]
tmp_hydrogens = self._hydrogens[d_a_indices.T[0]]
Expand All @@ -644,6 +714,14 @@ def _single_frame(self):
)
hbond_indices = np.where(d_h_a_angles > self.d_h_a_angle)[0]

if np.size(hbond_indices) == 0:
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
warnings.warn(
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
"No hydrogen bonds were found given angle of {} between "\
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
"Donor, {}, and Acceptor, {}.".format(self.d_h_a_angle,
self.donors_sel,
self.acceptors_sel)
)

# Retrieve atoms, distances and angles of hydrogen bonds
hbond_donors = tmp_donors[hbond_indices]
hbond_hydrogens = tmp_hydrogens[hbond_indices]
Expand Down Expand Up @@ -800,8 +878,17 @@ def count_by_type(self):
d = self.u.atoms[self.hbonds[:, 1].astype(np.intp)]
a = self.u.atoms[self.hbonds[:, 3].astype(np.intp)]

tmp_hbonds = np.array([d.resnames, d.types, a.resnames, a.types],
dtype=str).T
if hasattr(d,"resnames"):
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
d_res = d.resnames
else:
d_res = [None for x in range(len(d.types))]
orbeckst marked this conversation as resolved.
Show resolved Hide resolved

if hasattr(a,"resnames"):
a_res = a.resnames
else:
a_res = [None for x in range(len(a.types))]

tmp_hbonds = np.array([d_res, d.types, a_res, a.types], dtype=str).T
hbond_type, type_counts = np.unique(
tmp_hbonds, axis=0, return_counts=True)
hbond_type_list = []
Expand Down
Loading