Add native score-only confidence CLI for existing structures by taivu1998 · Pull Request #311 · bytedance/Protenix

taivu1998 · 2026-05-10T11:24:07Z

Summary

Adds a native score-only workflow for Protenix so users can evaluate existing PDB/mmCIF coordinates with the model trunk and confidence heads without running diffusion sampling.

This addresses #227.

Motivation

Issue #227 asks whether Protenix can be used in a scoring-only mode, especially for complexes. The existing inference path always samples new coordinates through diffusion, while the confidence head already accepts predicted coordinates. This PR exposes that capability through a narrow, native workflow instead of requiring users to rely on a pinned external fork.

Changes

- Adds fixed-coordinate inference support in Protenix.forward() and _main_inference_loop() via score_coordinates.
- Adds InferenceRunner.score() to run the trunk, distogram, confidence head, and existing confidence summary code on supplied atom coordinates.
- Adds protenix score CLI support for PDB/mmCIF inputs and directories.
- Adds structure parsing and coordinate remapping utilities that:
  - preserve source chain IDs when possible through generated JSON id fields,
  - map source coordinates into Protenix featurized atom order,
  - detect duplicate atom keys and missing atoms,
  - support configurable missing-atom fallback policies.
- Writes score outputs:
  - summary.csv,
  - <sample>/summary_confidence.json,
  - <sample>/chain_id_map.json,
  - optional <sample>/full_confidence.json,
  - optional <sample>/scored.cif,
  - failure records when applicable.
- Documents protenix score in the README and training/inference instructions.
- Adds focused unit tests for structure file collection, chain ID preservation, chain mapping, and coordinate remapping.
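
The coordinate remapping behavior listed above (reordering source atoms into the featurized atom order, rejecting duplicate atom keys, and applying a missing-atom policy) can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name remap_coordinates, the (chain ID, residue index, atom name) key, and the "error" and "zero" policy names are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical atom key: (chain ID, residue index, atom name).
AtomKey = Tuple[str, int, str]
Coord = Tuple[float, float, float]


class RemapError(ValueError):
    """Raised when source atoms cannot be mapped onto the target order."""


def remap_coordinates(
    source_atoms: Iterable[Tuple[AtomKey, Coord]],
    target_order: Sequence[AtomKey],
    missing_policy: str = "error",  # hypothetical policy names: "error" or "zero"
) -> Tuple[List[Coord], List[AtomKey]]:
    """Return coordinates in target order plus the list of missing atom keys."""
    if missing_policy not in ("error", "zero"):
        raise ValueError(f"unknown missing_policy: {missing_policy!r}")

    # Index source atoms by key, refusing ambiguous duplicates.
    by_key: Dict[AtomKey, Coord] = {}
    for key, xyz in source_atoms:
        if key in by_key:
            raise RemapError(f"duplicate atom key: {key}")
        by_key[key] = xyz

    # Walk the target order; placeholder zeros stand in for missing atoms.
    coords: List[Coord] = []
    missing: List[AtomKey] = []
    for key in target_order:
        if key in by_key:
            coords.append(by_key[key])
        else:
            missing.append(key)
            coords.append((0.0, 0.0, 0.0))

    if missing and missing_policy == "error":
        raise RemapError(f"{len(missing)} atom(s) missing, first: {missing[0]}")
    return coords, missing
```

With the "zero" policy the caller still receives the list of missing keys, so a failure record or warning can be written alongside the scores.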

Design Notes

The implementation deliberately keeps the first version small:

- It reuses the existing inference preprocessing path for MSA, template, and RNA MSA handling.
- It reuses the existing confidence summary implementation instead of adding separate scoring metrics.
- It skips diffusion only when fixed coordinates are explicitly provided.
- It does not add role-aware MSA caches, target/binder-specific outputs, ipSAE-style metrics, or multi-model scoring in this PR.
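
The "skip diffusion only when fixed coordinates are explicitly provided" rule can be pictured with a small control-flow sketch. Only the score_coordinates name comes from this PR; run_inference and the three stand-in callables are hypothetical placeholders, not the actual Protenix.forward() signature.

```python
from typing import Any, Callable, Optional


def run_inference(
    trunk_fn: Callable[[Any], Any],            # stand-in for the model trunk
    diffusion_fn: Callable[[Any], Any],        # stand-in for diffusion sampling
    confidence_fn: Callable[[Any, Any], Any],  # stand-in for the confidence head
    features: Any,
    score_coordinates: Optional[Any] = None,
) -> Any:
    """Run the trunk, choose coordinates, then run the confidence head."""
    trunk_state = trunk_fn(features)

    # Compare against None rather than truthiness so that array-like
    # coordinates are never mistaken for "not provided".
    if score_coordinates is not None:
        coords = score_coordinates          # score-only path: no sampling
    else:
        coords = diffusion_fn(trunk_state)  # default path: sample coordinates

    return confidence_fn(trunk_state, coords)
```

Because the default is None, existing callers that never pass score_coordinates keep the sampling behavior unchanged in this sketch.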

Validation

Ran:

git diff --cached --check
uvx ruff check protenix/model/protenix.py runner/inference.py protenix/data/inference/structure_scoring.py runner/scoring.py runner/batch_inference.py tests/test_structure_scoring.py
python3 -m py_compile protenix/model/protenix.py runner/inference.py protenix/data/inference/structure_scoring.py runner/scoring.py runner/batch_inference.py tests/test_structure_scoring.py
uv run --python 3.11 ... python -m pytest tests/test_structure_scoring.py

Results:

Ruff passed.
Python compilation passed.
tests/test_structure_scoring.py passed: 7 tests.
Verified the Click group exposes score --help with LAYERNORM_TYPE=torch.

Local Environment Notes

I did not run a full model-backed protenix score on an example structure locally because this machine is missing the Protenix CCD runtime data file at /Users/vuductai/common/components.cif, which blocks the repo's normal parser path. On this macOS host, the CLI import smoke test also requires LAYERNORM_TYPE=torch to avoid the existing CUDA fused-layernorm compile path.

Add score-only structure confidence CLI

4db3727

taivu1998 marked this pull request as ready for review May 11, 2026 03:45

taivu1998 wants to merge 1 commit into bytedance:main from taivu1998:tdv/issue-227-scoring-only