Space group missing in _symmetry.space_group_name_H-M #58

Open
drlemmus opened this issue May 15, 2024 · 5 comments

@drlemmus

The controlled vocabulary for _symmetry.space_group_name_H-M is missing the value 'C 1 2/c 1' which is the space group for PDB entry 5jzq. Could you add it?
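A minimal sketch of the kind of enumeration check that rejects this value. The set below is a small hypothetical excerpt, not the actual dictionary list, and the function names are illustrative only:

```python
# Hypothetical excerpt of the enumeration for _symmetry.space_group_name_H-M;
# the real controlled vocabulary is much longer.
ALLOWED_SPACE_GROUPS = {"P 1", "P 1 21 1", "C 1 2 1", "P 21 21 21"}


def normalize(name: str) -> str:
    """Strip the name and collapse runs of whitespace to single spaces."""
    return " ".join(name.split())


def is_allowed(name: str, allowed=ALLOWED_SPACE_GROUPS) -> bool:
    """Return True if the normalized name is one of the enumerated values."""
    return normalize(name) in allowed


if __name__ == "__main__":
    # 'C 1 2/c 1' fails until it is added to the enumeration.
    print(is_allowed("C 1 2/c 1"))
    print(is_allowed("C 1 2/c 1", ALLOWED_SPACE_GROUPS | {"C 1 2/c 1"}))
```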

@CV-GPhL
Contributor

CV-GPhL commented May 15, 2024

Could we maybe take that opportunity for some further cleanup of that enumerated list? I was thinking of the following ...

Remove some odd (?) SGs that are not even used within the PDB archive (looking at derived_data/index/crystal.idx):

A 1
B 2 21 2
C 2
C 2(A 112)
C 21
I 21
P 2
P 21
P 21(C)

(Other slightly odd ones, but with actual PDB entries, are C 4 21 2, F 4 2 2 and P 21 21 2 A.)
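The comparison behind the list above can be sketched roughly as follows. This assumes the space-group names have already been extracted from the index file into a plain list (the parsing is not shown because the file layout is not described here), and the sample data is toy data apart from the two F 4 2 2 entries mentioned in this thread:

```python
from collections import Counter


def usage_report(enumeration, used_names):
    """Split enumerated space groups into unused ones and per-name usage counts.

    enumeration: iterable of allowed H-M names
    used_names:  iterable with one space-group name per archive entry
    """
    counts = Counter(used_names)
    unused = sorted(sg for sg in enumeration if counts[sg] == 0)
    used = {sg: counts[sg] for sg in enumeration if counts[sg] > 0}
    return unused, used


if __name__ == "__main__":
    # Toy data: a three-item enumeration and four hypothetical archive entries.
    enumeration = ["A 1", "F 4 2 2", "P 21 21 21"]
    used_names = ["P 21 21 21", "P 21 21 21", "F 4 2 2", "F 4 2 2"]
    unused, used = usage_report(enumeration, used_names)
    print("unused:", unused)
    print("used:", used)
```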

Should this maybe be a complete list of H-M symbols for all SGs, instead of just non-chiral/protein ones plus a selected subset of chiral ones added as needed?

@wojdyr
Contributor

wojdyr commented May 15, 2024

The only one that bothers me is C 4 21 2. It's used in two files with such a remark:

REMARK   3  SPACE GROUP C 4 21 2 (WHICH, MORE PROPERLY, SHOULD BE NAMED         
REMARK   3   C 4 2 21) IS A NON-STANDARD REPRESENTATION OF SPACE GROUP          
REMARK   3   P 4 21 2.  IN THIS CASE THE AXES OF THE UNIT CELL ARE              
REMARK   3   CONSIDERED TO BE LEFT-HANDED. 

But I'm not sure if C 4 2 21 would be a matching name for the symops.

Other names used in the PDB entries are

  • either clearly custom names (such as P 21 21 2 A) listed in symop.lib – the added letter marks the shift of origin,
  • or, like F 4 2 2 and A 1, are reasonable names listed in various places (here, here and here).

@githubgphl
Copy link

Will anyone ever nowadays start using "A 1" or "F 4 2 2" as a SG name? Or: should they? A lot of those were originally triggered by restrictions in old, non-general software (I remember a specific phasing program in my old lab that couldn't handle non-orthogonal SGs, resulting in the search for new crystallisation conditions that avoided those SGs).

Even if something like "F 4 2 2" is possible and used in 2 (!) PDB entries from one (1) publication in 1987 (maybe to show a specific relation to other structures ...), the standard setting is "I 4 2 2". I think the "_symmetry.space_group_name_H-M" enumeration should stick with standard settings plus truly common alternate settings that are used in more than a handful of old PDB entries. E.g. "A 1" is neither standard nor used in any PDB entry and should not be in that list I think.

@drlemmus
Author

We can only drop cases not in the PDB at all. Anything in the PDB, however rare, should be kept. That said, perhaps some weird settings can be reset if that gets rid of some weird cases.

@githubgphl

> We can only drop cases not in the PDB at all. Anything in the PDB, however rare, should be kept.

Agree.

My suggestions would be

  • not just add that one new occurrence ('C 1 2/c 1') but also the other "standard" SGs that might come up in the future, so that we don't have to extend the list piecemeal whenever a new entry appears in the PDB (out of interest: how can such entries get through validation in the first place if the SG is not yet in that enumeration?)
  • remove those "non-standard" ones that are not used in the current PDB archive, i.e. the list above from 'A 1' to 'P 21(C)'
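
The two suggestions above, combined with the rule that anything used in the PDB is kept, amount to a few set operations. A rough sketch, with a hypothetical function name and toy inputs rather than the real lists:

```python
def revise_enumeration(current, standard, used):
    """Return (revised list, removed names) for the proposed cleanup.

    current:  names in the enumeration today
    standard: standard-setting names to add up front
    used:     names that occur in at least one PDB entry
    """
    current, standard, used = set(current), set(standard), set(used)
    # Drop only names that are neither standard nor used in the archive.
    removed = current - standard - used
    revised = (current | standard) - removed
    return sorted(revised), sorted(removed)


if __name__ == "__main__":
    # Toy inputs; the real lists would come from the dictionary and the archive.
    current = ["A 1", "F 4 2 2", "I 4 2 2", "P 21"]
    standard = ["I 4 2 2", "C 1 2/c 1"]
    used = ["F 4 2 2", "I 4 2 2", "C 1 2/c 1"]
    revised, removed = revise_enumeration(current, standard, used)
    print("revised:", revised)
    print("removed:", removed)
```

With these toy inputs, F 4 2 2 survives because it is used in the archive, while A 1 and P 21 are dropped as both non-standard and unused.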
