Chain ID in parseMMCIF is corresponding to asym_id or auth_asym_id #1985

hnguyentt · 2024-11-04T17:34:14Z

Hello,

Which chain ID does the function parseMMCIF use: asym_id (a unique, sequential identifier for each molecule in the model, starting from the letter A) or auth_asym_id (the author-provided or PDB-assigned chain ID)?

When I looked at 1mj1, the function appeared to use asym_id rather than auth_asym_id:

Sequence in fasta file:

>1MJ1_7|Chain H[auth L]|L11 ribosomal protein|Escherichia coli (562)
MAKKVAAQIKLQLPAGKATPAPPVGPALGQHGVNIMEFCKRFNAETADKAGMILPVVITVYEDKSFTFIIKTPPASFLLKKAAGIEKGSSEPKRKIVGKVTRKQIEEIAKTKMPDLNANSLEAAMKIIEGTAKSMGIEVVD

from prody import parseMMCIF

struct = parseMMCIF("1mj1")

struct.select("chain H")
> <Selection: 'chain H' from 1mj1 (133 atoms)>

struct.select("chain L")
> None

However, I'm not sure about it. Could you please confirm?


jamesmkrieger · 2024-11-04T18:18:54Z

Yes, currently the chain ID comes from label_asym_id, and the auth_asym_id is stored as the segment ID/name:

    chID = line.split()[fields['label_asym_id']]
    segID = line.split()[fields['auth_asym_id']]

There is also a keyword argument, unite_chains, to change this behaviour:

:arg unite_chains: unite chains with the same segment name (auth_asym_id), making chain ids be 
    auth_asym_id instead of label_asym_id. This can be helpful in some cases e.g. alignments, but can 
    cause some problems too. For example, using :meth:`.buildBiomolecules` afterwards requires original 
    chain id (label_asym_id). Using biomol=True, inside parseMMCIF is fine.
    Default is *False*
:type unite_chains: bool

For some reason, this help was only added to parseMMCIFStream and not parseMMCIF so I'll change that.
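
To make the mapping concrete, here is a minimal standalone sketch of the idea, not ProDy's actual code. It assumes a toy atom_site loop whose rows are made up for illustration (only the H/L labelling mirrors the 1mj1 example above), and the helper name read_chain_ids is hypothetical. The unite_chains branch only imitates the effect described in the docstring; the real implementation may differ.

```python
# Toy mmCIF atom_site excerpt. The rows are invented for illustration and
# are not copied from 1mj1; only the H/L labelling follows the example above.
TOY_CIF = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.auth_asym_id
ATOM 1 CA MET H 1 L
ATOM 2 CA ALA H 2 L
ATOM 3 CA GLY I 1 M
"""


def read_chain_ids(text, unite_chains=False):
    """Return one (chain ID, segment ID) pair per atom row.

    Hypothetical helper: it mimics the two lines quoted above and the
    documented effect of unite_chains, nothing more.
    """
    fields = {}  # column name -> column index, in order of appearance
    pairs = []
    for line in text.splitlines():
        if line.startswith("_atom_site."):
            name = line.split(".", 1)[1].strip()
            fields[name] = len(fields)
        elif line.startswith(("ATOM", "HETATM")):
            columns = line.split()
            chID = columns[fields["label_asym_id"]]
            segID = columns[fields["auth_asym_id"]]
            if unite_chains:
                # Assumed behaviour: the chain ID takes the author value.
                chID = segID
            pairs.append((chID, segID))
    return pairs


default_pairs = read_chain_ids(TOY_CIF)
united_pairs = read_chain_ids(TOY_CIF, unite_chains=True)
print(default_pairs)  # chain IDs follow label_asym_id
print(united_pairs)   # chain IDs follow auth_asym_id
```

In the default mode the first two rows come out as chain H with segment L, which matches the selection results in the question. With ProDy itself, the equivalent switch would be passing unite_chains=True to parseMMCIF, as the docstring describes. Selecting on the segment name instead of the chain should also reach the author chain in the default mode, since auth_asym_id is stored there.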

jamesmkrieger mentioned this issue Nov 4, 2024

add _parseMMCIFdoc to parseMMCIF.__doc__ #1986

Open
