support NvNMD-train and NvNMD-explore #298
base: master
Conversation
support nvnmd train and inference
📝 Walkthrough

This update introduces comprehensive support for a new "nvnmd" (non-von Neumann molecular dynamics) training and exploration workflow in the DP-GEN2 framework. It adds new operator classes, argument schemas, task group types, LAMMPS input generation logic, and workflow integration for "nvnmd" variants. Extensive unit and integration tests for these new features are also included.
Sequence Diagram(s)

NVNMD Training Workflow (High-Level)

```mermaid
sequenceDiagram
    participant User
    participant Workflow
    participant PrepRunDPTrain
    participant RunNvNMDTrain
    User->>Workflow: Submit concurrent learning workflow (train_style="dp-nvnmd")
    Workflow->>PrepRunDPTrain: Prepare training tasks
    PrepRunDPTrain->>RunNvNMDTrain: Execute NVNMD training (CNN/QNN)
    RunNvNMDTrain->>PrepRunDPTrain: Return trained model artifacts
    PrepRunDPTrain->>Workflow: Return training outputs
    Workflow->>User: Training results and artifacts
```
NVNMD LAMMPS Exploration Task

```mermaid
sequenceDiagram
    participant Workflow
    participant PrepRunLmp
    participant RunNvNMD
    Workflow->>PrepRunLmp: Prepare LAMMPS NVNMD exploration task
    PrepRunLmp->>RunNvNMD: Execute LAMMPS with NVNMD model(s)
    RunNvNMD->>PrepRunLmp: Return trajectory, model deviation, and logs
    PrepRunLmp->>Workflow: Return exploration outputs
```
Actionable comments posted: 18
🔭 Outside diff range comments (3)
dpgen2/flow/dpgen_loop.py (1)
500-503: ⚠️ Potential issue — Fix equality comparisons to `True`.

Avoid explicit equality comparisons to `True`, as flagged by static analysis. Apply this diff to fix the style issues:

```diff
 steps.outputs.parameters[
     "exploration_scheduler"
 ].value_from_expression = if_expression(
-    _if=(scheduler_step.outputs.parameters["converged"] == True),
+    _if=scheduler_step.outputs.parameters["converged"],
     _then=scheduler_step.outputs.parameters["exploration_scheduler"],
     _else=next_step.outputs.parameters["exploration_scheduler"],
 )
```

Apply similar changes to lines 513, 521, and 529.
Also applies to: 513-516, 521-524, 529-532
🧰 Tools
🪛 Ruff (0.11.9)
500-500: Avoid equality comparisons to `True`; use `if scheduler_step.outputs.parameters["converged"]:` for truth checks. Replace with `scheduler_step.outputs.parameters["converged"]` (E712)
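As a general illustration of the E712 rule flagged above (the variable names below are hypothetical, not taken from dpgen2), comparing to `True` and testing truthiness are not equivalent in Python:

```python
# E712: `x == True` only matches the exact value True (or 1),
# while `if x:` accepts any truthy value.
flag = "yes"          # a truthy string, hypothetical example value

print(flag == True)   # False: a non-empty string is not equal to True
print(bool(flag))     # True: but it is truthy

# Preferred style: rely on truthiness instead of comparing to True
if flag:
    result = "converged"
else:
    result = "not converged"
print(result)         # converged
```

This is why dropping the `== True` is not merely cosmetic when the expression may hold non-boolean truthy values.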
dpgen2/exploration/task/lmp/lmp_input.py (2)
53-60: 🛠️ Refactor suggestion — Parameter needs typing + logic guard around electron-temperature check

- `nvnmd_version` is introduced but it is un-typed (`Any`). Add `Optional[str]` to make the API explicit.
- The electron-temperature runtime guard just above (lines 57-63) still fires when `nvnmd_version` is set, although the ELE_TEMP path is later disabled. That prevents users from mixing nvnmd with an old DeepMD build even though DeepMD is not in use.

```diff
-    deepmd_version="2.0",
-    nvnmd_version=None,
+    deepmd_version: str = "2.0",
+    nvnmd_version: Optional[str] = None,
 )
@@
-    if (ele_temp_f is not None or ele_temp_a is not None) and Version(
-        deepmd_version
-    ) < Version("1"):
+    # DeepMD-specific restriction – skip when running NVNMD
+    if (
+        nvnmd_version is None
+        and (ele_temp_f is not None or ele_temp_a is not None)
+        and Version(deepmd_version) < Version("1")
+    ):
```
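To make the suggested guard concrete, here is a minimal self-contained sketch. The function name `check_ele_temp_support` and the tuple-based `_parse_version` helper are hypothetical stand-ins (the real code uses `packaging.version.Version`); the point is only the `nvnmd_version is None` short-circuit:

```python
from typing import Optional


def _parse_version(v: str) -> tuple:
    # minimal stand-in for packaging.Version, sufficient for "X.Y" strings
    return tuple(int(x) for x in v.split("."))


def check_ele_temp_support(
    ele_temp_f: Optional[float],
    ele_temp_a: Optional[float],
    deepmd_version: str = "2.0",
    nvnmd_version: Optional[str] = None,
) -> None:
    # DeepMD-specific restriction - skipped entirely when running NVNMD,
    # since DeepMD is not in use on that path
    if (
        nvnmd_version is None
        and (ele_temp_f is not None or ele_temp_a is not None)
        and _parse_version(deepmd_version) < _parse_version("1")
    ):
        raise RuntimeError("electron temperature is not supported by DeepMD < 1")


# With nvnmd_version set, the DeepMD version check is bypassed: no exception
check_ele_temp_support(ele_temp_f=300.0, ele_temp_a=None,
                       deepmd_version="0.12", nvnmd_version="1")
```

Without `nvnmd_version`, the same call would raise, which is exactly the behaviour the review asks to avoid for NVNMD runs.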
140-149: ⚠️ Potential issue — Dump-file naming is inconsistent with later `rerun`

When `trj_seperate_files` is `False` (default) the dump is written to `dpgen.lammpstrj`, but the rerun section looks for `dpgen.lammpstrj.0`. This will raise `ERROR: Cannot open dump file …` in LAMMPS.

```diff
-    ret += "dump 1 all custom ${DUMP_FREQ} %s id type x y z fx fy fz\n" \
-        % lmp_traj_file_name
+    # keep the filename consistent with the subsequent `rerun`
+    dump_target = f"{lmp_traj_file_name}.0" if nvnmd_version else lmp_traj_file_name
+    ret += (
+        "dump 1 all custom ${DUMP_FREQ} %s id type x y z fx fy fz\n"
+        % dump_target
+    )
```
♻️ Duplicate comments (2)
dpgen2/exploration/task/make_task_group_from_config.py (2)
313-318: Same documentation concern as above.
661-670: Same hardcoded version issue.
🧹 Nitpick comments (34)
dpgen2/superop/__init__.py (1)
19-21: Suppress or export the newly-imported `PrepRunNvNMDTrain` to avoid Ruff F401

Ruff flags this symbol as "imported but unused". Two quick fixes:

```diff
-from .prep_run_nvnmd_train import (
-    PrepRunNvNMDTrain,
-)
+# noqa silences Ruff while keeping the import side-effect
+from .prep_run_nvnmd_train import (
+    PrepRunNvNMDTrain,  # noqa: F401
+)
```

or explicitly export it:

```diff
 __all__ = [
     ...,
+    "PrepRunNvNMDTrain",
 ]
```

Pick one to keep CI green and signal the public API.
🧰 Tools
🪛 Ruff (0.11.9)
20-20: `.prep_run_nvnmd_train.PrepRunNvNMDTrain` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
dpgen2/op/__init__.py (1)
25-27: Mark new NVNMD operators as public or silence Ruff F401

`PrepNvNMDTrain`, `RunNvNMD`, and `RunNvNMDTrain` are imported only for side-effects / re-export. Add them to `__all__` or append `# noqa: F401` to each line to prevent the unused-import warning.

```diff
-from .prep_nvnmd_train import (
-    PrepNvNMDTrain,
-)
+from .prep_nvnmd_train import (
+    PrepNvNMDTrain,  # noqa: F401
+)
 ...
-from .run_nvnmd import (
-    RunNvNMD,
-)
-from .run_nvnmd_train import (
-    RunNvNMDTrain,
-)
+from .run_nvnmd import (
+    RunNvNMD,  # noqa: F401
+)
+from .run_nvnmd_train import (
+    RunNvNMDTrain,  # noqa: F401
+)
```

Alternatively extend `__all__`:

```python
__all__.extend(
    ["PrepNvNMDTrain", "RunNvNMD", "RunNvNMDTrain"]
)
```

Keeps static analysis quiet and clarifies the public surface.

Also applies to: 44-49
🧰 Tools
🪛 Ruff (0.11.9)
26-26: `.prep_nvnmd_train.PrepNvNMDTrain` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
dpgen2/utils/download_dpgen2_artifacts.py (1)
57-59: Checkpoint files are now downloaded flat – consider optional sub-foldering

The three new ckpt artifacts are added with no `suffix`, so they will be placed directly under `.../inputs` and `.../outputs`. If the number of artifacts keeps growing, grouping them (e.g. `suffix="ckpts"`) could keep the download directory tidy and backward-compatible:

```diff
-.add_input("init_models_ckpt_meta")
-.add_input("init_models_ckpt_data")
-.add_input("init_models_ckpt_index")
+.add_input("init_models_ckpt_meta", "ckpts")
+.add_input("init_models_ckpt_data", "ckpts")
+.add_input("init_models_ckpt_index", "ckpts")
```

Same for outputs. Not blocking, just a maintainability thought.

Also applies to: 64-66
dpgen2/exploration/task/npt_task_group.py (1)
52-54: Parameter `nvnmd_version` added – update docstring & default propagation

- The docstring of `set_md` still omits the new `nvnmd_version` argument; please add it for completeness.
- `self.nvnmd_version` is forwarded to `make_lmp_input`, which expects the same kw-name (`nvnmd_version=...`). Double-check that `make_lmp_input` indeed accepts this parameter; mismatched names will raise `TypeError` at runtime.
- No behavioural issues spotted otherwise.

Also applies to: 77-78, 137-138
dpgen2/entrypoint/args.py (3)
133-140: `doc_config` is unused – delete to silence Ruff F841

```diff
-    doc_numb_models = "Number of models trained for evaluating the model deviation"
-    doc_config = "Configuration of training"
+    doc_numb_models = "Number of models trained for evaluating the model deviation"
```

Removing the dead variable avoids the warning and keeps the doc-block tidy.
🧰 Tools
🪛 Ruff (0.11.9)
135-135: Local variable `doc_config` is assigned to but never used. Remove assignment to unused variable `doc_config` (F841)
141-176: `nvnmd_train_args()` duplicates `dp_train_args()` – consider factoring

`nvnmd_train_args()` is almost a verbatim copy of `dp_train_args()` with only the `RunNvNMDTrain` class swapped in. Duplicating ~40 lines now means two places to patch for every future schema tweak. A minimal refactor could delegate to a helper that injects the operator class:

```python
def _generic_train_args(op_cls):
    base = dp_train_args()
    base[0] = Argument(
        "config",
        dict,
        op_cls.training_args(),
        optional=True,
        default=op_cls.normalize_config({}),
        doc="Configuration of training",
    )
    return base


def nvnmd_train_args():
    return _generic_train_args(RunNvNMDTrain)
```

Keeps behaviour identical while eliminating code drift.
504-509: Docstring for "nvnmd" exploration still says "LAMMPS"

`doc_lmp` is re-used for the new "nvnmd" variant, which may mislead users reading the generated docs/help. Provide a dedicated `doc_nvnmd` or amend `doc_lmp` to cover both cases.

dpgen2/exploration/render/traj_render_lammps.py (1)
121-122: `traj_fmt` is now unused after the ASE rewrite

`traj_fmt` became dead code once the direct `dpdata.System` call was removed. Drop it to avoid confusion.

```diff
-    traj_fmt = "lammps/dump"
```

🧰 Tools
🪛 Ruff (0.11.9)
121-121: Local variable `traj_fmt` is assigned to but never used. Remove assignment to unused variable `traj_fmt` (F841)
dpgen2/op/prep_nvnmd_train.py (2)
7-11: Remove unused import `Tuple`

`Tuple` isn't referenced in the module.

```diff
-from typing import (
-    List,
-    Tuple,
-    Union,
-)
+from typing import (
+    List,
+    Union,
+)
```

🧰 Tools
🪛 Ruff (0.11.9)
9-9: `typing.Tuple` imported but unused. Remove unused import: `typing.Tuple` (F401)
82-88: Prefer `isinstance` over `type(...) != list`

The current check trips static analysers (E721) and fails on subclasses:

```diff
-    if type(template) != list:
+    if not isinstance(template, list):
```

🧰 Tools
🪛 Ruff (0.11.9)
82-82: Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks (E721)
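The subclass pitfall behind this E721 finding is easy to demonstrate (the `TemplateList` subclass below is hypothetical, invented only for illustration):

```python
class TemplateList(list):
    """A hypothetical list subclass, e.g. a list carrying extra metadata."""


tmpl = TemplateList(["in.lammps"])

# type() comparison rejects subclasses (and trips Ruff E721):
# a TemplateList would wrongly be treated as "not a list"
print(type(tmpl) != list)          # True

# isinstance() accepts subclasses, which is almost always what is wanted
print(isinstance(tmpl, list))      # True
```

So `not isinstance(template, list)` keeps working if a caller ever passes a list subclass, while `type(template) != list` silently breaks.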
tests/op/test_prep_nvnmd_train.py (1)
8-15: Prune unused imports to keep the test lean

`numpy`, several `dflow.python` symbols, and `.context.dpgen2` are never used inside this test module.

```diff
-import numpy as np
-from dflow.python import (
-    OP,
-    OPIO,
-    Artifact,
-    OPIOSign,
-)
-from .context import (
-    dpgen2,
-)
+from dflow.python import OPIO
```

Removing them silences Ruff F401 and speeds import time.
🧰 Tools
🪛 Ruff (0.11.9)
8-8: `numpy` imported but unused. Remove unused import: `numpy` (F401)
10-10: `dflow.python.OP` imported but unused. Remove unused import (F401)
12-12: `dflow.python.Artifact` imported but unused. Remove unused import (F401)
13-13: `dflow.python.OPIOSign` imported but unused. Remove unused import (F401)
dpgen2/exploration/task/make_task_group_from_config.py (1)
300-306: Consider adding NVNMD-specific documentation.

The new variant uses `doc_lmp_md` documentation, which doesn't mention NVNMD functionality. Consider creating a separate documentation string that explains the NVNMD-specific behavior.
224-273: Consider refactoring to reduce code duplication.While the implementation is correct, there's significant duplication between the two branches. The main differences are the step name and the additional checkpoint artifacts for PrepRunNvNMDTrain.
224-273: Consider refactoring to reduce code duplication.

While the implementation is correct, there's significant duplication between the two branches. The main differences are the step name and the additional checkpoint artifacts for PrepRunNvNMDTrain. Consider extracting common parameters and artifacts:

```diff
+    # Common parameters
+    common_parameters = {
+        "block_id": block_steps.inputs.parameters["block_id"],
+        "train_config": block_steps.inputs.parameters["train_config"],
+        "numb_models": block_steps.inputs.parameters["numb_models"],
+        "template_script": block_steps.inputs.parameters["template_script"],
+        "run_optional_parameter": run_dp_train_optional_parameter,
+    }
+
+    # Common artifacts
+    common_artifacts = {
+        "init_models": block_steps.inputs.artifacts["init_models"],
+        "init_data": block_steps.inputs.artifacts["init_data"],
+        "iter_data": block_steps.inputs.artifacts["iter_data"],
+    }
+
     if isinstance(prep_run_dp_train_op, PrepRunNvNMDTrain):
+        # Add checkpoint artifacts for NVNMD
+        nvnmd_artifacts = {
+            "init_models_ckpt_index": block_steps.inputs.artifacts["init_models_ckpt_index"],
+            "init_models_ckpt_data": block_steps.inputs.artifacts["init_models_ckpt_data"],
+            "init_models_ckpt_meta": block_steps.inputs.artifacts["init_models_ckpt_meta"],
+        }
         prep_run_dp_train = Step(
             name + "-prep-run-nvnmd-train",
             template=prep_run_dp_train_op,
-            parameters={
-                "block_id": block_steps.inputs.parameters["block_id"],
-                "train_config": block_steps.inputs.parameters["train_config"],
-                "numb_models": block_steps.inputs.parameters["numb_models"],
-                "template_script": block_steps.inputs.parameters["template_script"],
-                "run_optional_parameter": run_dp_train_optional_parameter,
-            },
-            artifacts={
-                "init_models": block_steps.inputs.artifacts["init_models"],
-                "init_models_ckpt_index": block_steps.inputs.artifacts[
-                    "init_models_ckpt_index"
-                ],
-                "init_models_ckpt_data": block_steps.inputs.artifacts[
-                    "init_models_ckpt_data"
-                ],
-                "init_models_ckpt_meta": block_steps.inputs.artifacts[
-                    "init_models_ckpt_meta"
-                ],
-                "init_data": block_steps.inputs.artifacts["init_data"],
-                "iter_data": block_steps.inputs.artifacts["iter_data"],
-            },
+            parameters=common_parameters,
+            artifacts={**common_artifacts, **nvnmd_artifacts},
             key="--".join(
                 ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"]
             ),
         )
     else:
         prep_run_dp_train = Step(
             name + "-prep-run-dp-train",
             template=prep_run_dp_train_op,
-            parameters={
-                "block_id": block_steps.inputs.parameters["block_id"],
-                "train_config": block_steps.inputs.parameters["train_config"],
-                "numb_models": block_steps.inputs.parameters["numb_models"],
-                "template_script": block_steps.inputs.parameters["template_script"],
-                "run_optional_parameter": run_dp_train_optional_parameter,
-            },
-            artifacts={
-                "init_models": block_steps.inputs.artifacts["init_models"],
-                "init_data": block_steps.inputs.artifacts["init_data"],
-                "iter_data": block_steps.inputs.artifacts["iter_data"],
-            },
+            parameters=common_parameters,
+            artifacts=common_artifacts,
             key="--".join(
                 ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"]
             ),
         )
```

tests/test_prep_run_nvnmd_train.py (1)
101-101: Convert lambda to function definition.

Replace the lambda expression with a proper function definition for better readability.

```diff
-    revised_fname = lambda ff: Path(ff).name if only_check_name else Path(ff)
+    def revised_fname(ff):
+        return Path(ff).name if only_check_name else Path(ff)
```

🧰 Tools
🪛 Ruff (0.11.9)
101-101: Do not assign a `lambda` expression, use a `def`. Rewrite `revised_fname` as a `def` (E731)
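Beyond style, a named `def` is more debuggable than an assigned lambda: it carries its own `__name__`, which shows up in tracebacks and profiler output. A quick self-contained comparison (mirroring the `revised_fname` example, with `only_check_name` fixed to `True` here):

```python
from pathlib import Path

only_check_name = True

# lambda version: anonymous, appears as "<lambda>" in tracebacks/profiles
revised_fname_lambda = lambda ff: Path(ff).name if only_check_name else Path(ff)


# def version: same behaviour, but named and debuggable
def revised_fname(ff):
    return Path(ff).name if only_check_name else Path(ff)


print(revised_fname_lambda.__name__)       # <lambda>
print(revised_fname.__name__)              # revised_fname
print(revised_fname("some/dir/model.pb"))  # model.pb
```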
tests/op/test_run_nvnmd_train.py (1)
609-625: Remove unused variable assignments in exception handling.

The variables `ee` and `out` are assigned but never used.

```diff
     ptrain = RunNvNMDTrain()
-    with self.assertRaises(FatalError) as ee:
-        out = ptrain.execute(
+    with self.assertRaises(FatalError):
+        ptrain.execute(
             OPIO(
                 {
                     "config": config,
                     "task_name": task_name,
                     "task_path": Path(task_path),
                     "init_model": Path(self.init_model),
                     "init_model_ckpt_meta": Path(self.init_model_ckpt_meta),
                     "init_model_ckpt_data": Path(self.init_model_ckpt_data),
                     "init_model_ckpt_index": Path(self.init_model_ckpt_index),
                     "init_data": [Path(ii) for ii in self.init_data],
                     "iter_data": [Path(ii) for ii in self.iter_data],
                 }
             )
         )
```

🧰 Tools
🪛 Ruff (0.11.9)
610-610: Local variable `ee` is assigned to but never used. Remove assignment to unused variable `ee` (F841)
611-611: Local variable `out` is assigned to but never used. Remove assignment to unused variable `out` (F841)
dpgen2/flow/dpgen_loop.py (1)
458-495: Consider extracting checkpoint artifact checking logic.

The conditional logic for checking checkpoint artifacts is correct but verbose. Consider extracting this pattern into a helper function for better readability. Add a helper function at the module level:

```python
def _has_checkpoint_artifacts(artifacts):
    """Check if all checkpoint artifacts have path attributes."""
    return (
        hasattr(artifacts["models_ckpt_meta"]._from, "path")
        and hasattr(artifacts["models_ckpt_data"]._from, "path")
        and hasattr(artifacts["models_ckpt_index"]._from, "path")
    )
```

Then simplify the conditional:

```diff
-    if (
-        hasattr(block_step.outputs.artifacts["models_ckpt_meta"]._from, "path")
-        and hasattr(block_step.outputs.artifacts["models_ckpt_data"]._from, "path")
-        and hasattr(block_step.outputs.artifacts["models_ckpt_index"]._from, "path")
-    ):
+    if _has_checkpoint_artifacts(block_step.outputs.artifacts):
```

dpgen2/op/run_nvnmd.py (2)
14-15: Remove unused imports.

Several imports are not used in the code and should be removed to keep the module clean. Remove these unused imports:

```diff
 from typing import (
     List,
     Optional,
-    Set,
-    Tuple,
 )
 from dargs import (
     Argument,
-    ArgumentEncoder,
-    Variant,
-    dargs,
 )
 from dflow.python import (
     OP,
     OPIO,
     Artifact,
     BigParameter,
-    FatalError,
-    HDF5Datasets,
     OPIOSign,
     TransientError,
 )
 from dpgen2.constants import (
-    lmp_conf_name,
     lmp_input_name,
     lmp_log_name,
     lmp_model_devi_name,
```

Also applies to: 24-26, 33-34, 40-40
🧰 Tools
🪛 Ruff (0.11.9)
14-14: `typing.Set` imported but unused. Remove unused import (F401)
15-15: `typing.Tuple` imported but unused. Remove unused import (F401)
345-345: Remove unnecessary f-string prefix.

The string doesn't contain any placeholders.

```diff
-    raise RuntimeError(f"last matching index should not be -1, terribly wrong ")
+    raise RuntimeError("last matching index should not be -1, terribly wrong ")
```

🧰 Tools
🪛 Ruff (0.11.9)
345-345: f-string without any placeholders. Remove extraneous `f` prefix (F541)
dpgen2/superop/prep_run_nvnmd_train.py (1)
1-1: Remove unused imports.

Multiple imports are not used and should be removed. Remove these unused imports:

```diff
-import json
 import os
 from copy import (
     deepcopy,
 )
-from pathlib import (
-    Path,
-)
 from typing import (
     List,
     Optional,
-    Set,
     Type,
 )
 from dflow import (
     InputArtifact,
     InputParameter,
     Inputs,
     OutputArtifact,
     OutputParameter,
     Outputs,
     S3Artifact,
     Step,
     Steps,
-    Workflow,
     argo_len,
-    argo_range,
     argo_sequence,
-    download_artifact,
     upload_artifact,
 )
 from dflow.python import (
     OP,
-    OPIO,
-    Artifact,
-    BigParameter,
-    OPIOSign,
     PythonOPTemplate,
     Slices,
 )
 from dpgen2.constants import (
     train_index_pattern,
-    train_script_name,
-    train_task_pattern,
 )
```

Also applies to: 7-7, 12-12, 26-26, 28-28, 30-30, 35-38, 45-46
-import json import os from copy import ( deepcopy, ) -from pathlib import ( - Path, -) from typing import ( List, Optional, - Set, Type, ) from dflow import ( InputArtifact, InputParameter, Inputs, OutputArtifact, OutputParameter, Outputs, S3Artifact, Step, Steps, - Workflow, argo_len, - argo_range, argo_sequence, - download_artifact, upload_artifact, ) from dflow.python import ( OP, - OPIO, - Artifact, - BigParameter, - OPIOSign, PythonOPTemplate, Slices, ) from dpgen2.constants import ( train_index_pattern, - train_script_name, - train_task_pattern, )Also applies to: 7-7, 12-12, 26-26, 28-28, 30-30, 35-38, 45-46
🧰 Tools
🪛 Ruff (0.11.9)
1-1:
jsonimported but unusedRemove unused import:
json(F401)
dpgen2/entrypoint/submit.py (1)
313-318: Simplify conditional by removing unnecessary elif.

The `elif` is unnecessary after a return statement. Apply this diff:

```diff
     if explore_style in ("lmp", "nvnmd"):
         return make_lmp_naive_exploration_scheduler(config)
-    elif "calypso" in explore_style or explore_style == "diffcsp":
+    if "calypso" in explore_style or explore_style == "diffcsp":
         return make_naive_exploration_scheduler_without_conf(config, explore_style)
     else:
         raise KeyError(f"Unknown explore_style `{explore_style}`")
```

🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 313-318: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
tests/mocked_ops.py (2)
413-546: Good implementation with minor cleanup opportunities.

The mocked NvNMD training operator correctly handles the specific directory structure and checkpoint files. However, there are some unused variables that could be cleaned up. Remove or use the unused variables flagged by static analysis:

```diff
-    task_id = int(ip["task_name"].split(".")[1])
     assert ip["task_name"] in str(ip["task_path"])
     assert "model" in str(ip["init_model"])
     assert ".pb" in str(ip["init_model"])
```

And at line 448:

```diff
     init_model = init_model.resolve()
-    init_model_str = str(init_model)
```

🧰 Tools
🪛 Ruff (0.11.9)
436-436: Local variable `task_id` is assigned to but never used. Remove assignment to unused variable `task_id` (F841)
448-448: Local variable `init_model_str` is assigned to but never used. Remove assignment to unused variable `init_model_str` (F841)

🪛 Pylint (3.3.7)

[refactor] 415-415: Too many local variables (32/15) (R0914)
[refactor] 415-415: Too many statements (100/50) (R0915)
753-828: Good implementation with minor cleanup opportunities.

The mocked NvNMD operator correctly simulates LAMMPS execution with neural network models. Remove unused variables:

```diff
-    task_id = int(ip["task_name"].split(".")[1])
     assert task_path.is_dir()
     assert ip["task_name"] in str(ip["task_path"])
```

And at line 780:

```diff
     models = [ii.resolve() for ii in models]
-    models_str = [str(ii) for ii in models]
```

🧰 Tools
🪛 Ruff (0.11.9)
765-765: Local variable `task_id` is assigned to but never used. Remove assignment to unused variable `task_id` (F841)
780-780: Local variable `models_str` is assigned to but never used. Remove assignment to unused variable `models_str` (F841)

🪛 Pylint (3.3.7)

[refactor] 755-755: Too many local variables (17/15) (R0914)
tests/test_prep_run_nvnmd.py (2)
1-69: Clean up unused imports.

This file has many unused imports that should be removed for cleaner code. Remove the following unused imports:

```diff
-import json
 import os
-import pickle
 import shutil
 import time
 import unittest
 from pathlib import (
     Path,
 )
 from typing import (
     List,
-    Set,
 )
-import jsonpickle
-import numpy as np
 from dflow import (
-    InputArtifact,
-    InputParameter,
-    Inputs,
-    OutputArtifact,
-    OutputParameter,
-    Outputs,
-    S3Artifact,
     Step,
-    Steps,
     Workflow,
-    argo_range,
     download_artifact,
     upload_artifact,
 )
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
-    OPIOSign,
-    PythonOPTemplate,
 )
```

Also remove unused constants imports:

```diff
 from dpgen2.constants import (
     lmp_conf_name,
     lmp_input_name,
     lmp_log_name,
     lmp_model_devi_name,
     lmp_task_pattern,
     lmp_traj_name,
-    model_name_pattern,
-    train_log_name,
-    train_script_name,
-    train_task_pattern,
 )
```

🧰 Tools
🪛 Ruff (0.11.9)
1-1: `json` imported but unused. Remove unused import: `json` (F401)
3-3: `pickle` imported but unused. Remove unused import: `pickle` (F401)
12-12: `typing.Set` imported but unused. Remove unused import: `typing.Set` (F401)
15-15: `jsonpickle` imported but unused. Remove unused import: `jsonpickle` (F401)
16-16: `numpy` imported but unused. Remove unused import: `numpy` (F401)
18-18: `dflow.InputArtifact` imported but unused. Remove unused import (F401)
19-19: `dflow.InputParameter` imported but unused. Remove unused import (F401)
20-20: `dflow.Inputs` imported but unused. Remove unused import (F401)
21-21: `dflow.OutputArtifact` imported but unused. Remove unused import (F401)
22-22: `dflow.OutputParameter` imported but unused. Remove unused import (F401)
23-23: `dflow.Outputs` imported but unused. Remove unused import (F401)
24-24: `dflow.S3Artifact` imported but unused. Remove unused import (F401)
26-26: `dflow.Steps` imported but unused. Remove unused import (F401)
28-28: `dflow.argo_range` imported but unused. Remove unused import (F401)
33-33: `dflow.python.OP` imported but unused. Remove unused import (F401)
35-35: `dflow.python.Artifact` imported but unused. Remove unused import (F401)
36-36: `dflow.python.OPIOSign` imported but unused. Remove unused import (F401)
37-37: `dflow.python.PythonOPTemplate` imported but unused. Remove unused import (F401)
40-46: Use `contextlib.suppress(ModuleNotFoundError)` instead of try-except-pass (SIM105)
42-42: `context.dpgen2` imported but unused; consider using `importlib.util.find_spec` to test for availability (F401)
66-66: `dpgen2.constants.model_name_pattern` imported but unused. Remove unused import (F401)
67-67: `dpgen2.constants.train_log_name` imported but unused. Remove unused import (F401)
68-68: `dpgen2.constants.train_script_name` imported but unused. Remove unused import (F401)
69-69: `dpgen2.constants.train_task_pattern` imported but unused. Remove unused import (F401)
40-46: Use contextlib.suppress for cleaner exception handling.

Replace the try-except-pass pattern:

```diff
-try:
-    from context import (
-        dpgen2,
-    )
-except ModuleNotFoundError:
-    # case of upload everything to argo, no context needed
-    pass
+from contextlib import suppress
+
+with suppress(ModuleNotFoundError):
+    from context import dpgen2
```

🧰 Tools
🪛 Ruff (0.11.9)
40-46: Use `contextlib.suppress(ModuleNotFoundError)` instead of try-except-pass (SIM105)
42-42: `context.dpgen2` imported but unused; consider using `importlib.util.find_spec` to test for availability (F401)
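A small runnable demonstration of the `contextlib.suppress` pattern suggested above (the module name below is hypothetical and deliberately nonexistent, standing in for the optional `context` test helper):

```python
from contextlib import suppress

imported = False
# equivalent to try/except ModuleNotFoundError/pass, but flagged-clean (SIM105)
with suppress(ModuleNotFoundError):
    import context_module_that_does_not_exist  # hypothetical optional module
    imported = True  # only reached if the import succeeds

print(imported)  # False here, since the module does not exist
```

The `with suppress(...)` body simply stops at the failing import, so any follow-up statements inside the block are skipped, exactly like the `try`/`except`/`pass` version.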
tests/op/test_run_nvnmd.py (4)
1-26: Clean up unused imports.

Remove unused imports:

```diff
-import json
 import os
 import shutil
 import unittest
 from pathlib import (
     Path,
 )
 import dpdata
 import numpy as np
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
-    OPIOSign,
     TransientError,
 )
 from mock import (
     call,
-    mock,
     patch,
 )
 # isort: off
 from .context import (
-    dpgen2,
 )
```

🧰 Tools
🪛 Ruff (0.11.9)
1-1: `json` imported but unused. Remove unused import: `json` (F401)
12-12: `dflow.python.OP` imported but unused. Remove unused import (F401)
14-14: `dflow.python.Artifact` imported but unused. Remove unused import (F401)
15-15: `dflow.python.OPIOSign` imported but unused. Remove unused import (F401)
20-20: `mock.mock` imported but unused. Remove unused import: `mock.mock` (F401)
26-26: `.context.dpgen2` imported but unused. Remove unused import: `.context.dpgen2` (F401)
130-145: Remove unused variables in exception handling.

```diff
     def test_error(self, mocked_run):
         mocked_run.side_effect = [(1, "foo\n", "")]
         op = RunNvNMD()
-        with self.assertRaises(TransientError) as ee:
-            out = op.execute(
+        with self.assertRaises(TransientError):
+            op.execute(
                 OPIO(
                     {
                         "config": {"command": "mylmp"},
                         "task_name": self.task_name,
                         "task_path": self.task_path,
                         "models": self.models,
                     }
                 )
             )
```

🧰 Tools
🪛 Ruff (0.11.9)
134-134: Local variable `ee` is assigned to but never used. Remove assignment to unused variable `ee` (F841)
135-135: Local variable `out` is assigned to but never used. Remove assignment to unused variable `out` (F841)
270-270: Remove unnecessary f-string.

```diff
-    (work_dir / (model_name_pattern % ii)).read_text(), f"teacher model"
+    (work_dir / (model_name_pattern % ii)).read_text(), "teacher model"
```

🧰 Tools
🪛 Ruff (0.11.9)
270-270: f-string without any placeholders. Remove extraneous `f` prefix (F541)
304-319: Remove unused exception variables.

In both test methods:

```diff
-    with self.assertRaises(RuntimeError) as re:
+    with self.assertRaises(RuntimeError):
         set_models(input_name, self.model_names)
```

🧰 Tools
🪛 Ruff (0.11.9)
309-309: Local variable `re` is assigned to but never used. Remove assignment to unused variable `re` (F841)
317-317: Local variable `re` is assigned to but never used. Remove assignment to unused variable `re` (F841)
dpgen2/op/run_nvnmd_train.py (4)
1-49: Clean up unused imports.

Remove unused imports:

```diff
 from typing import (
     Dict,
     List,
     Optional,
-    Tuple,
     Union,
 )
 import dpdata
 from dargs import (
     Argument,
-    ArgumentEncoder,
-    Variant,
-    dargs,
 )
 from dflow.python import (
     OP,
     OPIO,
     Artifact,
     BigParameter,
     FatalError,
     NestedDict,
     OPIOSign,
     Parameter,
-    TransientError,
 )
 from dpgen2.constants import (
     train_cnn_script_name,
     train_qnn_script_name,
     train_script_name,
-    train_task_pattern,
 )
```

🧰 Tools
🪛 Ruff (0.11.9)
14-14: `typing.Tuple` imported but unused. Remove unused import: `typing.Tuple` (F401)
21-21: `dargs.ArgumentEncoder` imported but unused. Remove unused import (F401)
22-22: `dargs.Variant` imported but unused. Remove unused import (F401)
23-23: `dargs.dargs` imported but unused. Remove unused import (F401)
34-34: `dflow.python.TransientError` imported but unused. Remove unused import: `dflow.python.TransientError` (F401)
41-41: `dpgen2.constants.train_task_pattern` imported but unused. Remove unused import: `dpgen2.constants.train_task_pattern` (F401)
196-199: Remove unused variable.

```diff
     config = ip["config"] if ip["config"] is not None else {}
     dp_command = ip["config"].get("command", "dp").split()
-    train_args = config.get("train_args", "")
     config = RunNvNMDTrain.normalize_config(config)
```

🧰 Tools
🪛 Ruff (0.11.9)
199-199: Local variable `train_args` is assigned to but never used. Remove assignment to unused variable `train_args` (F841)
219-223: Use ternary operator for cleaner code.

```diff
-    if "systems" in train_dict["training"]:
-        major_version = "1"
-    else:
-        major_version = "2"
+    major_version = "1" if "systems" in train_dict["training"] else "2"
```

🧰 Tools
🪛 Ruff (0.11.9)
219-222: Use ternary operator `major_version = "1" if "systems" in train_dict["training"] else "2"` instead of if-else-block (SIM108)
155-388: Consider refactoring to reduce complexity.

The execute method has high complexity (45 local variables, 83 statements). While this might be necessary for the two-stage training process, consider extracting some logic into helper methods. Would you like me to help refactor this method by extracting the CNN and QNN training logic into separate methods? This could improve readability and maintainability.
🧰 Tools
🪛 Ruff (0.11.9)
199-199: Local variable `train_args` is assigned to but never used. Remove assignment to unused variable `train_args` (F841)
219-222: Use ternary operator `major_version = "1" if "systems" in train_dict["training"] else "2"` instead of if-else-block (SIM108)
272-272: Use a context manager for opening files (SIM115)

🪛 Pylint (3.3.7)

[refactor] 155-155: Too many local variables (45/15) (R0914)
[refactor] 272-272: Consider using 'with' for resource-allocating operations (R1732)
[refactor] 155-155: Too many statements (83/50) (R0915)
dpgen2/exploration/task/lmp/lmp_input.py (2)
110-118: `nvnmd_version` value is ignored – either use it or demote it to a boolean

The string passed via `nvnmd_version` is never interpolated; the code only checks for non-`None`. If different NVNMD formats (e.g. `pair_style nvnmd${nvnmd_version}`) are ever needed, the value will be lost. If versioning is not required, rename the parameter to `use_nvnmd: bool` to avoid confusion.

```diff
-elif nvnmd_version is not None:
-    ret += "pair_style nvnmd %s\n" % ("model.pb")
+elif nvnmd_version is not None:
+    # honour explicit version if supplied, fallback to latest
+    model_file = f"model_{nvnmd_version}.pb" if nvnmd_version else "model.pb"
+    ret += f"pair_style nvnmd {model_file}\n"
```
201-205: Execution flow: unreachable `rerun` block when `rerun==0`

The unconditional `jump SELF end` right after `run` guarantees the `rerun` label is skipped when `rerun==0`, which is correct, but it also prevents fall-through when `jump SELF rerun` is taken earlier (the script re-reads from the top). Make this explicit to aid maintenance, e.g. comment or:

```diff
-    if nvnmd_version is not None:
-        ret += "jump SELF end\n"
+    if nvnmd_version is not None:
+        # ensure we land in the correct section after the MAIN run
+        ret += "jump SELF end\n"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (24)
- dpgen2/constants.py (1 hunks)
- dpgen2/entrypoint/args.py (3 hunks)
- dpgen2/entrypoint/submit.py (9 hunks)
- dpgen2/exploration/render/traj_render_lammps.py (2 hunks)
- dpgen2/exploration/task/lmp/lmp_input.py (6 hunks)
- dpgen2/exploration/task/lmp_template_task_group.py (5 hunks)
- dpgen2/exploration/task/make_task_group_from_config.py (4 hunks)
- dpgen2/exploration/task/npt_task_group.py (3 hunks)
- dpgen2/flow/dpgen_loop.py (8 hunks)
- dpgen2/op/__init__.py (2 hunks)
- dpgen2/op/prep_nvnmd_train.py (1 hunks)
- dpgen2/op/run_nvnmd.py (1 hunks)
- dpgen2/op/run_nvnmd_train.py (1 hunks)
- dpgen2/superop/__init__.py (1 hunks)
- dpgen2/superop/block.py (7 hunks)
- dpgen2/superop/prep_run_nvnmd_train.py (1 hunks)
- dpgen2/utils/download_dpgen2_artifacts.py (1 hunks)
- tests/mocked_ops.py (5 hunks)
- tests/op/test_prep_nvnmd_train.py (1 hunks)
- tests/op/test_run_nvnmd.py (1 hunks)
- tests/op/test_run_nvnmd_train.py (1 hunks)
- tests/test_prep_run_nvnmd.py (1 hunks)
- tests/test_prep_run_nvnmd_train.py (1 hunks)
- tests/utils/test_dl_dpgen2_arti.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
dpgen2/superop/__init__.py (1)
dpgen2/superop/prep_run_nvnmd_train.py (1)
PrepRunNvNMDTrain(57-153)
dpgen2/op/__init__.py (3)
dpgen2/op/prep_nvnmd_train.py (1)
PrepNvNMDTrain(27-119)

dpgen2/op/run_nvnmd.py (1)
RunNvNMD(62-313)dpgen2/op/run_nvnmd_train.py (1)
RunNvNMDTrain(103-626)
dpgen2/exploration/render/traj_render_lammps.py (1)
dpgen2/fp/vasp.py (2)
set_ele_temp(103-125)
set_ele_temp(176-188)
dpgen2/superop/block.py (2)
dpgen2/superop/prep_run_nvnmd_train.py (1)
PrepRunNvNMDTrain(57-153)

dpgen2/superop/prep_run_dp_train.py (1)
PrepRunDPTrain(57-146)
🪛 Ruff (0.11.9)
dpgen2/superop/__init__.py
20-20: .prep_run_nvnmd_train.PrepRunNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
dpgen2/op/__init__.py
26-26: .prep_nvnmd_train.PrepNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
45-45: .run_nvnmd.RunNvNMD imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
48-48: .run_nvnmd_train.RunNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
dpgen2/entrypoint/args.py
135-135: Local variable doc_config is assigned to but never used
Remove assignment to unused variable doc_config
(F841)
dpgen2/superop/prep_run_nvnmd_train.py
1-1: json imported but unused
Remove unused import: json
(F401)
7-7: pathlib.Path imported but unused
Remove unused import: pathlib.Path
(F401)
12-12: typing.Set imported but unused
Remove unused import: typing.Set
(F401)
26-26: dflow.Workflow imported but unused
Remove unused import
(F401)
28-28: dflow.argo_range imported but unused
Remove unused import
(F401)
30-30: dflow.download_artifact imported but unused
Remove unused import
(F401)
35-35: dflow.python.OPIO imported but unused
Remove unused import
(F401)
36-36: dflow.python.Artifact imported but unused
Remove unused import
(F401)
37-37: dflow.python.BigParameter imported but unused
Remove unused import
(F401)
38-38: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
45-45: dpgen2.constants.train_script_name imported but unused
Remove unused import
(F401)
46-46: dpgen2.constants.train_task_pattern imported but unused
Remove unused import
(F401)
161-161: Do not perform function call normalize_step_dict in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
162-162: Do not perform function call normalize_step_dict in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
dpgen2/op/prep_nvnmd_train.py
9-9: typing.Tuple imported but unused
Remove unused import: typing.Tuple
(F401)
82-82: Use is and is not for type comparisons, or isinstance() for isinstance checks
(E721)
tests/op/test_run_nvnmd.py
1-1: json imported but unused
Remove unused import: json
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
15-15: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
20-20: mock.mock imported but unused
Remove unused import: mock.mock
(F401)
26-26: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
134-134: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
135-135: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
243-243: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
270-270: f-string without any placeholders
Remove extraneous f prefix
(F541)
309-309: Local variable re is assigned to but never used
Remove assignment to unused variable re
(F841)
317-317: Local variable re is assigned to but never used
Remove assignment to unused variable re
(F841)
tests/op/test_prep_nvnmd_train.py
8-8: numpy imported but unused
Remove unused import: numpy
(F401)
10-10: dflow.python.OP imported but unused
Remove unused import
(F401)
12-12: dflow.python.Artifact imported but unused
Remove unused import
(F401)
13-13: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
21-21: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
167-167: Local variable op is assigned to but never used
Remove assignment to unused variable op
(F841)
dpgen2/flow/dpgen_loop.py
513-513: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
521-521: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
529-529: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
tests/op/test_run_nvnmd_train.py
1-1: itertools imported but unused
Remove unused import: itertools
(F401)
3-3: os imported but unused
Remove unused import: os
(F401)
10-10: numpy imported but unused
Remove unused import: numpy
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
16-16: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
17-17: dflow.python.TransientError imported but unused
Remove unused import
(F401)
30-30: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
36-36: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
41-41: dpgen2.op.run_nvnmd_train._make_train_command imported but unused
Remove unused import: dpgen2.op.run_nvnmd_train._make_train_command
(F401)
610-610: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
611-611: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
tests/test_prep_run_nvnmd.py
1-1: json imported but unused
Remove unused import: json
(F401)
3-3: pickle imported but unused
Remove unused import: pickle
(F401)
12-12: typing.Set imported but unused
Remove unused import: typing.Set
(F401)
15-15: jsonpickle imported but unused
Remove unused import: jsonpickle
(F401)
16-16: numpy imported but unused
Remove unused import: numpy
(F401)
18-18: dflow.InputArtifact imported but unused
Remove unused import
(F401)
19-19: dflow.InputParameter imported but unused
Remove unused import
(F401)
20-20: dflow.Inputs imported but unused
Remove unused import
(F401)
21-21: dflow.OutputArtifact imported but unused
Remove unused import
(F401)
22-22: dflow.OutputParameter imported but unused
Remove unused import
(F401)
23-23: dflow.Outputs imported but unused
Remove unused import
(F401)
24-24: dflow.S3Artifact imported but unused
Remove unused import
(F401)
26-26: dflow.Steps imported but unused
Remove unused import
(F401)
28-28: dflow.argo_range imported but unused
Remove unused import
(F401)
33-33: dflow.python.OP imported but unused
Remove unused import
(F401)
35-35: dflow.python.Artifact imported but unused
Remove unused import
(F401)
36-36: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
37-37: dflow.python.PythonOPTemplate imported but unused
Remove unused import
(F401)
40-46: Use contextlib.suppress(ModuleNotFoundError) instead of try-except-pass
(SIM105)
42-42: context.dpgen2 imported but unused; consider using importlib.util.find_spec to test for availability
(F401)
66-66: dpgen2.constants.model_name_pattern imported but unused
Remove unused import
(F401)
67-67: dpgen2.constants.train_log_name imported but unused
Remove unused import
(F401)
68-68: dpgen2.constants.train_script_name imported but unused
Remove unused import
(F401)
69-69: dpgen2.constants.train_task_pattern imported but unused
Remove unused import
(F401)
tests/test_prep_run_nvnmd_train.py
1-1: json imported but unused
Remove unused import: json
(F401)
10-10: typing.List imported but unused
Remove unused import
(F401)
11-11: typing.Set imported but unused
Remove unused import
(F401)
14-14: numpy imported but unused
Remove unused import: numpy
(F401)
16-16: dflow.InputArtifact imported but unused
Remove unused import
(F401)
17-17: dflow.InputParameter imported but unused
Remove unused import
(F401)
18-18: dflow.Inputs imported but unused
Remove unused import
(F401)
19-19: dflow.OutputArtifact imported but unused
Remove unused import
(F401)
20-20: dflow.OutputParameter imported but unused
Remove unused import
(F401)
21-21: dflow.Outputs imported but unused
Remove unused import
(F401)
22-22: dflow.S3Artifact imported but unused
Remove unused import
(F401)
24-24: dflow.Steps imported but unused
Remove unused import
(F401)
26-26: dflow.argo_range imported but unused
Remove unused import
(F401)
31-31: dflow.python.OP imported but unused
Remove unused import
(F401)
33-33: dflow.python.Artifact imported but unused
Remove unused import
(F401)
34-34: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
35-35: dflow.python.PythonOPTemplate imported but unused
Remove unused import
(F401)
38-44: Use contextlib.suppress(ModuleNotFoundError) instead of try-except-pass
(SIM105)
40-40: context.dpgen2 imported but unused; consider using importlib.util.find_spec to test for availability
(F401)
101-101: Do not assign a lambda expression, use a def
Rewrite revised_fname as a def
(E731)
dpgen2/op/run_nvnmd.py
14-14: typing.Set imported but unused
Remove unused import
(F401)
15-15: typing.Tuple imported but unused
Remove unused import
(F401)
24-24: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
25-25: dargs.Variant imported but unused
Remove unused import
(F401)
26-26: dargs.dargs imported but unused
Remove unused import
(F401)
33-33: dflow.python.FatalError imported but unused
Remove unused import
(F401)
34-34: dflow.python.HDF5Datasets imported but unused
Remove unused import
(F401)
40-40: dpgen2.constants.lmp_conf_name imported but unused
Remove unused import: dpgen2.constants.lmp_conf_name
(F401)
155-155: Do not use bare except
(E722)
166-166: Do not use bare except
(E722)
308-308: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
345-345: f-string without any placeholders
Remove extraneous f prefix
(F541)
dpgen2/op/run_nvnmd_train.py
14-14: typing.Tuple imported but unused
Remove unused import: typing.Tuple
(F401)
21-21: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
22-22: dargs.Variant imported but unused
Remove unused import
(F401)
23-23: dargs.dargs imported but unused
Remove unused import
(F401)
34-34: dflow.python.TransientError imported but unused
Remove unused import: dflow.python.TransientError
(F401)
41-41: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
199-199: Local variable train_args is assigned to but never used
Remove assignment to unused variable train_args
(F841)
219-222: Use ternary operator major_version = "1" if "systems" in train_dict["training"] else "2" instead of if-else-block
Replace if-else-block with major_version = "1" if "systems" in train_dict["training"] else "2"
(SIM108)
272-272: Use a context manager for opening files
(SIM115)
619-619: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
tests/mocked_ops.py
436-436: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
448-448: Local variable init_model_str is assigned to but never used
Remove assignment to unused variable init_model_str
(F841)
588-588: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
765-765: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
780-780: Local variable models_str is assigned to but never used
Remove assignment to unused variable models_str
(F841)
🪛 Pylint (3.3.7)
dpgen2/superop/prep_run_nvnmd_train.py
[refactor] 58-58: Too many arguments (9/5)
(R0913)
[refactor] 58-58: Too many positional arguments (9/5)
(R0917)
[refactor] 156-156: Too many arguments (9/5)
(R0913)
[refactor] 156-156: Too many positional arguments (9/5)
(R0917)
[refactor] 156-156: Too many local variables (16/15)
(R0914)
tests/op/test_prep_nvnmd_train.py
[refactor] 55-55: Too few public methods (1/2)
(R0903)
tests/op/test_run_nvnmd_train.py
[refactor] 47-47: Too many instance attributes (27/7)
(R0902)
[refactor] 638-638: Too many instance attributes (15/7)
(R0902)
tests/test_prep_run_nvnmd_train.py
[refactor] 80-80: Too many arguments (9/5)
(R0913)
[refactor] 80-80: Too many positional arguments (9/5)
(R0917)
[refactor] 198-198: Too many arguments (8/5)
(R0913)
[refactor] 198-198: Too many positional arguments (8/5)
(R0917)
[refactor] 269-269: Too many instance attributes (9/7)
(R0902)
[refactor] 371-371: Too many instance attributes (15/7)
(R0902)
dpgen2/op/run_nvnmd.py
[refactor] 97-97: Too many local variables (29/15)
(R0914)
[refactor] 97-97: Too many branches (14/12)
(R0912)
[refactor] 97-97: Too many statements (63/50)
(R0915)
[refactor] 369-372: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it
(R1720)
[refactor] 438-438: Too many local variables (19/15)
(R0914)
dpgen2/op/run_nvnmd_train.py
[refactor] 155-155: Too many local variables (45/15)
(R0914)
[refactor] 272-272: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 155-155: Too many statements (83/50)
(R0915)
[refactor] 391-391: Too many arguments (7/5)
(R0913)
[refactor] 391-391: Too many positional arguments (7/5)
(R0917)
[refactor] 430-430: Too many branches (14/12)
(R0912)
[refactor] 482-494: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
tests/mocked_ops.py
[refactor] 415-415: Too many local variables (32/15)
(R0914)
[refactor] 415-415: Too many statements (100/50)
(R0915)
[refactor] 567-567: Too many local variables (27/15)
(R0914)
[refactor] 567-567: Too many statements (76/50)
(R0915)
[refactor] 755-755: Too many local variables (17/15)
(R0914)
dpgen2/entrypoint/submit.py
[refactor] 313-318: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
🔇 Additional comments (25)
tests/utils/test_dl_dpgen2_arti.py (1)
69-83: Extended mocks keep the test in lock-step with new artifacts – looks good

The additional expectations for *_ckpt_{meta,data,index} align with the
download-settings change. Assertions remain deterministic and fast via the mocked
download_artifact.

Also applies to: 104-118, 300-314, 335-349
dpgen2/constants.py (1)
9-12: Checkpoint path pattern may not match TensorFlow's default file layout

The constants hard-code "model.ckpt.%03d/model.ckpt.{meta|data|index}".
TensorFlow normally produces

model.ckpt-00001.meta
model.ckpt-00001.index
model.ckpt-00001.data-00000-of-00001

i.e. the filename, not the parent directory, carries the run-id. Storing the id twice
(…%03d/model.ckpt.*) risks pointing to a non-existent path
(model.ckpt.001/model.ckpt.meta). Please double-check the actual output of
tf.train.Checkpoint and adjust the patterns accordingly.

dpgen2/exploration/task/lmp/lmp_input.py (5)
53-53: LGTM! The addition of the optional nvnmd_version parameter maintains backward compatibility.

73-76: Consistent handling of electron temperature variables. The suppression of ELE_TEMP variables when nvnmd_version is provided is implemented consistently for both frame-style and atom-style electron temperatures.

140-141: Appropriate handling of trajectory dumps for NVNMD. The modification correctly disables separate trajectory files when using NVNMD, ensuring compatibility with NVNMD's trajectory requirements.

152-153: LGTM! The conditional jump correctly enables the rerun workflow when using NVNMD.

201-205: Well-structured NVNMD rerun workflow. The implementation correctly adds LAMMPS control flow for NVNMD rerun simulations with proper labels and trajectory handling.
dpgen2/superop/block.py (6)
67-69: LGTM! Import is properly organized with related imports.

94-94: Proper type annotation for extended functionality. The Union type correctly captures that the parameter can accept either training operation type.

119-122: Well-structured checkpoint artifact inputs. The optional checkpoint artifacts properly represent the three components of a TensorFlow checkpoint (index, data, meta files).

130-133: Consistent checkpoint output artifacts. Output artifacts properly mirror the input structure for checkpoint handling.

285-287: Correct conditional model artifact selection. The implementation properly selects the appropriate model artifact based on the training operation type.

366-375: Proper conditional output artifact linking. The implementation correctly links checkpoint artifacts only when using PrepRunNvNMDTrain, maintaining compatibility with the standard training workflow.
dpgen2/flow/dpgen_loop.py (2)
189-191: LGTM! Checkpoint artifacts properly added. The optional checkpoint artifacts are correctly added to support NvNMD model checkpointing.

619-644: Checkpoint artifact handling is consistent. The checkpoint artifact propagation in _dpgen follows the same pattern as in _loop, maintaining consistency throughout the module.

dpgen2/entrypoint/submit.py (1)
189-199: LGTM! NvNMD training style properly integrated. The dp-nvnmd training style is correctly integrated following the same pattern as the existing dp training style.
Also applies to: 535-543
dpgen2/exploration/task/lmp_template_task_group.py (3)
50-77: LGTM! nvnmd_version parameter properly integrated. The nvnmd_version parameter is correctly added and propagated through the relevant methods to enable NVNMD-specific behavior.

179-192: Verify hardcoded model name for nvnmd. The implementation correctly switches between deepmd and nvnmd pair styles. However, the model name is hardcoded to "model.pb" for nvnmd.

Is the hardcoded "model.pb" filename intentional for nvnmd? Should this be configurable or derived from the model list?

223-229: LGTM! Rerun commands correctly implemented. The rerun commands properly set up the NVNMD rerun workflow with appropriate labels and trajectory file references.
tests/mocked_ops.py (2)
43-46: LGTM! New checkpoint pattern imports for NvNMD support. The imports are properly organized and align with the new checkpoint handling functionality.

128-141: LGTM! Well-structured checkpoint creation for testing. The function properly creates the checkpoint directory structure with all required files (meta, data, index) following the established patterns.
tests/test_prep_run_nvnmd.py (1)
92-308: Well-structured test implementations! The test classes provide comprehensive coverage:

- TestPrepLmp: Validates task directory creation
- TestMockedRunNvNMD: Tests the mocked NvNMD operator execution
- TestPrepRunNvNMD: Integration test with workflow submission

All tests have proper setup/teardown and validate expected outputs correctly.
tests/op/test_run_nvnmd.py (1)
49-401: Excellent test coverage for RunNvNMD operations! The test suite provides comprehensive coverage:
- Command construction and execution mocking
- Error handling scenarios
- Knowledge distillation support
- Utility function testing (set_models, get_ele_temp, merge_pimd_files)
- Proper setup/teardown and assertions
The tests effectively validate the RunNvNMD operator functionality.
dpgen2/op/run_nvnmd_train.py (2)
51-101: LGTM! Well-structured command construction. The function properly handles:
- Checkpoint restart scenarios
- Initialization from frozen models vs checkpoint files
- Command argument construction
Good separation of concerns.
629-668: LGTM! Well-implemented utility functions. The data size calculation and directory expansion functions are clean and serve their purposes well:
- Proper handling of single and multi-system data
- Support for mixed type systems
- Clear separation of concerns
             traj = StringIO(trajs[ii].get_data())  # type: ignore
         else:
             traj = trajs[ii]
-        ss = dpdata.System(traj, fmt=traj_fmt, type_map=type_map)
-        ss.nopbc = self.nopbc
-        if ele_temp:
-            self.set_ele_temp(ss, ele_temp[ii])
-        ss = ss.sub_system(id_selected[ii])
-        ms.append(ss)
+        # ss = dpdata.System(traj, fmt=traj_fmt, type_map=type_map)
+        ss = read(
+            str(traj), format="lammps-dump-text", index=":", specorder=type_map
+        )
+        for jj in id_selected[ii]:
+            s = dpdata.System(ss[jj], fmt="ase/structure", type_map=type_map)
StringIO handling is broken – ase.io.read(str(traj) …) receives “<_io.StringIO…>”
read() accepts a file-like object; converting the StringIO to str passes a
representation instead of the data and will raise “No such file or directory”.
Fix:
- ss = read(
- str(traj), format="lammps-dump-text", index=":", specorder=type_map
- )
+ ss = read(
+ traj, # pass the file-object directly
+ format="lammps-dump-text",
+ index=":",
+ specorder=type_map,
+ )This keeps the code path for on-disk files unchanged while restoring support
for in-memory datasets.
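The failure mode is easy to reproduce with the standard library alone (no ase needed; the buffer contents below are an arbitrary stand-in for a LAMMPS dump):

```python
import io

buf = io.StringIO("ITEM: TIMESTEP\n0\n")

# str() does not extract the buffer's contents; it yields the object's
# repr, which a file-based reader would then try to open as a
# (nonexistent) path.
as_path = str(buf)
assert as_path.startswith("<_io.StringIO")

# Passing the object itself hands a parser the actual data.
contents = buf.read()
assert contents.startswith("ITEM: TIMESTEP")
```

This is why read() must receive the StringIO object directly rather than its string conversion.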
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Before:

            traj = StringIO(trajs[ii].get_data())  # type: ignore
        else:
            traj = trajs[ii]
        # ss = dpdata.System(traj, fmt=traj_fmt, type_map=type_map)
        ss = read(
            str(traj), format="lammps-dump-text", index=":", specorder=type_map
        )
        for jj in id_selected[ii]:
            s = dpdata.System(ss[jj], fmt="ase/structure", type_map=type_map)

After:

            traj = StringIO(trajs[ii].get_data())  # type: ignore
        else:
            traj = trajs[ii]
        # ss = dpdata.System(traj, fmt=traj_fmt, type_map=type_map)
        ss = read(
            traj,  # pass the file-object directly
            format="lammps-dump-text",
            index=":",
            specorder=type_map,
        )
        for jj in id_selected[ii]:
            s = dpdata.System(ss[jj], fmt="ase/structure", type_map=type_map)
🤖 Prompt for AI Agents
In dpgen2/exploration/render/traj_render_lammps.py around lines 126 to 134, the
code incorrectly converts a StringIO object to a string before passing it to
ase.io.read(), which causes it to receive a string representation of the object
instead of the file content, leading to an error. To fix this, pass the StringIO
object directly to read() without converting it to a string, so that read()
receives a proper file-like object for in-memory data, while leaving the on-disk
file handling unchanged.
dpgen2/op/prep_nvnmd_train.py
Outdated
    # the key "seed" in "nvnmd" is used to set the random seed for the network parameters, it is developing.
    jtmp["nvnmd"]["seed"] = random.randrange(sys.maxsize) % (2**32)
    jtmp["training"]["seed"] = random.randrange(sys.maxsize) % (2**32)
    return jtmp
🛠️ Refactor suggestion
Guard against missing keys
jtmp["nvnmd"]["seed"] will raise KeyError if the template lacks the
nvnmd/seed hierarchy. Consider:
jtmp.setdefault("nvnmd", {})["seed"] = random.getrandbits(32)
jtmp.setdefault("training", {})["seed"] = random.getrandbits(32)

Ensures robustness for user-supplied templates.
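A minimal sketch of the difference, using a plain dict in place of the real training template:

```python
import random

jtmp = {"training": {"stop_batch": 1000}}  # no "nvnmd" section present

# Direct indexing raises KeyError when the section is missing:
try:
    jtmp["nvnmd"]["seed"] = random.getrandbits(32)
    raised = False
except KeyError:
    raised = True

# setdefault creates the section on demand and leaves existing keys alone:
jtmp.setdefault("nvnmd", {})["seed"] = random.getrandbits(32)
jtmp.setdefault("training", {})["seed"] = random.getrandbits(32)
```

Existing entries such as "stop_batch" survive untouched, so the guard is a safe drop-in.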
🤖 Prompt for AI Agents
In dpgen2/op/prep_nvnmd_train.py around lines 116 to 119, the code assigns
random seeds to jtmp["nvnmd"]["seed"] and jtmp["training"]["seed"] without
checking if the keys "nvnmd" and "training" exist, which can cause KeyError. To
fix this, use setdefault to ensure these keys exist before assignment, like
jtmp.setdefault("nvnmd", {})["seed"] = random.getrandbits(32) and similarly for
"training". This will make the code robust against missing keys in user-supplied
templates.
dpgen2/op/prep_nvnmd_train.py
Outdated
    jtmp = input_dict.copy()

    # the key "seed" in "nvnmd" is used to set the random seed for the network parameters, it is developing.
    jtmp["nvnmd"]["seed"] = random.randrange(sys.maxsize) % (2**32)
    jtmp["training"]["seed"] = random.randrange(sys.maxsize) % (2**32)
    return jtmp
Shallow copy mutates caller template; use deepcopy
dict.copy() copies only the first level; nested dicts ("nvnmd", "training")
are shared, so modifying seeds on iteration 0 mutates the template used for
iteration 1.
-import json
+import json, copy
...
- jtmp = input_dict.copy()
+ jtmp = copy.deepcopy(input_dict)

Alternatively, rebuild only the mutated sub-trees.
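The aliasing is straightforward to demonstrate with a stand-in template:

```python
import copy

template = {"training": {"seed": None}}

shallow = template.copy()          # first level copied, nested dict shared
shallow["training"]["seed"] = 42
shared_mutated = template["training"]["seed"]  # the caller's template changed

template["training"]["seed"] = None            # reset
deep = copy.deepcopy(template)     # nested dicts are independent copies
deep["training"]["seed"] = 42
deep_isolated = template["training"]["seed"]   # still None
```

With the shallow copy, setting the seed on iteration 0 would leak into the template used for iteration 1; deepcopy keeps each per-model script independent.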
🤖 Prompt for AI Agents
In dpgen2/op/prep_nvnmd_train.py around lines 114 to 119, the use of dict.copy()
creates a shallow copy, causing nested dictionaries like "nvnmd" and "training"
to be shared and mutated across iterations. Replace dict.copy() with
copy.deepcopy() to create a full independent copy of the input_dict, preventing
unintended mutations of the original template when setting seeds.
    elif nvnmd_version is not None:
        ret += "pair_style nvnmd %s\n" % ("model.pb")
    else:
Consider using a configurable model name instead of hardcoding "model.pb".
The NVNMD pair style uses a hardcoded model filename, while the DeepMD branches use the graph_list variable. This reduces flexibility for users who might want to use different model names.
Apply this diff to use the first model from the graphs list:
elif nvnmd_version is not None:
- ret += "pair_style nvnmd %s\n" % ("model.pb")
+ ret += "pair_style nvnmd %s\n" % (graphs[0] if graphs else "model.pb")

📝 Committable suggestion
 elif nvnmd_version is not None:
-    ret += "pair_style nvnmd %s\n" % ("model.pb")
+    ret += "pair_style nvnmd %s\n" % (graphs[0] if graphs else "model.pb")
 else:
🤖 Prompt for AI Agents
In dpgen2/exploration/task/lmp/lmp_input.py around lines 116 to 118, the code
hardcodes the model filename "model.pb" for the NVNMD pair style, reducing
flexibility. Modify the code to use the first model name from the existing
graphs list variable instead of the hardcoded string. This will make the model
name configurable and consistent with the DeepMD branches.
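The fallback expression is simple to verify in isolation (the helper name below is invented for illustration; the real change is a one-line edit inside the input generator):

```python
def nvnmd_pair_style_line(graphs):
    """Build the pair_style line, preferring the first listed model."""
    model = graphs[0] if graphs else "model.pb"
    return "pair_style nvnmd %s\n" % model

custom = nvnmd_pair_style_line(["custom.pb", "other.pb"])
fallback = nvnmd_pair_style_line([])
```

With a populated model list the user's name is honoured; an empty list preserves today's hardcoded behavior.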
    elif config["type"] == "lmp-nvnmd":
        tgroup = NPTTaskGroup()
        config.pop("type")
        config["nvnmd_version"] = "0.0"
        tgroup.set_md(
            numb_models,
            mass_map,
            **config,
        )
Hardcoded NVNMD version limits flexibility.
The NVNMD version is hardcoded to "0.0", which doesn't allow users to specify different versions. Consider making this configurable through the task group configuration.
Would you like me to suggest an implementation that allows users to configure the NVNMD version through the task group configuration?
🤖 Prompt for AI Agents
In dpgen2/exploration/task/make_task_group_from_config.py around lines 643 to
651, the NVNMD version is hardcoded to "0.0", limiting flexibility. Modify the
code to check if "nvnmd_version" is provided in the config and use that value;
if not present, default to "0.0". This allows users to specify the NVNMD version
via the task group configuration instead of it being fixed.
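One way to realize the suggestion, sketched outside the real dispatch code (the key name mirrors the snippet above; whether pop or get fits the config flow better is the author's call):

```python
def resolve_nvnmd_version(config):
    # Honor a user-supplied "nvnmd_version"; fall back to "0.0".
    return config.pop("nvnmd_version", "0.0")

cfg = {"type": "lmp-nvnmd", "nvnmd_version": "1.0"}
cfg.pop("type")
version = resolve_nvnmd_version(cfg)   # user-specified value wins
default = resolve_nvnmd_version({})    # absent key -> current default
```

Using pop keeps the key out of the **config expansion passed to set_md, matching how "type" is handled today.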
        return OPIO(
            {
                "script": work_dir / oscript,
                "cnn_model": work_dir / cnn_model,
                "qnn_model": work_dir / qnn_model,
                "model_ckpt_data": work_dir / model_ckpt_meta_file,
                "model_ckpt_meta": work_dir / model_ckpt_meta_file,
                "model_ckpt_index": work_dir / model_ckpt_meta_file,
                "lcurve": work_dir / lcurve,
                "log": work_dir / log,
            }
        )
Critical: Incorrect output file assignments.
All three checkpoint output files are incorrectly assigned to model_ckpt_meta_file:
Fix the output assignments:
return OPIO(
{
"script": work_dir / oscript,
"cnn_model": work_dir / cnn_model,
"qnn_model": work_dir / qnn_model,
- "model_ckpt_data": work_dir / model_ckpt_meta_file,
+ "model_ckpt_data": work_dir / model_ckpt_data_file,
"model_ckpt_meta": work_dir / model_ckpt_meta_file,
- "model_ckpt_index": work_dir / model_ckpt_meta_file,
+ "model_ckpt_index": work_dir / model_ckpt_index_file,
"lcurve": work_dir / lcurve,
"log": work_dir / log,
}
)📝 Committable suggestion
 return OPIO(
     {
         "script": work_dir / oscript,
         "cnn_model": work_dir / cnn_model,
         "qnn_model": work_dir / qnn_model,
-        "model_ckpt_data": work_dir / model_ckpt_meta_file,
+        "model_ckpt_data": work_dir / model_ckpt_data_file,
         "model_ckpt_meta": work_dir / model_ckpt_meta_file,
-        "model_ckpt_index": work_dir / model_ckpt_meta_file,
+        "model_ckpt_index": work_dir / model_ckpt_index_file,
         "lcurve": work_dir / lcurve,
         "log": work_dir / log,
     }
 )
🤖 Prompt for AI Agents
In tests/mocked_ops.py between lines 662 and 673, the checkpoint output files
model_ckpt_data, model_ckpt_meta, and model_ckpt_index are all incorrectly
assigned to model_ckpt_meta_file. Correct this by assigning each to its
respective variable: model_ckpt_data to model_ckpt_data file, model_ckpt_meta to
model_ckpt_meta_file, and model_ckpt_index to model_ckpt_index_file, ensuring
each output file path is accurate.
class MockedPrepNvNMDTrain(PrepDPTrain):
    @OP.exec_sign_check
    def execute(
        self,
        ip: OPIO,
    ) -> OPIO:
        template = ip["template_script"]
        numb_models = ip["numb_models"]
        ofiles = []
        osubdirs = []

        assert template == mocked_template_script
        assert numb_models == mocked_numb_models

        for ii in range(numb_models):
            jtmp = template
            jtmp["seed"] = ii
            subdir = Path(train_task_pattern % ii)
            subdir.mkdir(exist_ok=True, parents=True)
            fname = subdir / "input.json"
            with open(fname, "w") as fp:
                json.dump(jtmp, fp, indent=4)
            osubdirs.append(str(subdir))
            ofiles.append(fname)

        op = OPIO(
            {
                "task_names": osubdirs,
                "task_paths": [Path(ii) for ii in osubdirs],
            }
        )
        return op
💡 Verification agent
🧩 Analysis chain
Incorrect inheritance - should inherit from PrepNvNMDTrain.
This class inherits from PrepDPTrain but appears to be mocking PrepNvNMDTrain operations. Additionally, the implementation is identical to MockedPrepDPTrain, which suggests this might need NvNMD-specific behavior.
Consider updating the inheritance:

```diff
-class MockedPrepNvNMDTrain(PrepDPTrain):
+class MockedPrepNvNMDTrain(PrepNvNMDTrain):
```

🏁 Script executed:

```shell
#!/bin/bash
# Check if PrepNvNMDTrain class exists
ast-grep --pattern 'class PrepNvNMDTrain'
```

Length of output: 6575
Incorrect inheritance in MockedPrepNvNMDTrain
The mock is intended to mirror PrepNvNMDTrain but currently inherits from PrepDPTrain, duplicating MockedPrepDPTrain behavior. Please update both the import and the base class:

• tests/mocked_ops.py (around lines 379–411):

```diff
-from dpgen2.op.prep_dp_train import PrepDPTrain
+from dpgen2.op.prep_nvnmd_train import PrepNvNMDTrain

-class MockedPrepNvNMDTrain(PrepDPTrain):
+class MockedPrepNvNMDTrain(PrepNvNMDTrain):
```

📝 Committable suggestion
```diff
 # tests/mocked_ops.py
-from dpgen2.op.prep_dp_train import PrepDPTrain
+from dpgen2.op.prep_nvnmd_train import PrepNvNMDTrain

-class MockedPrepNvNMDTrain(PrepDPTrain):
+class MockedPrepNvNMDTrain(PrepNvNMDTrain):
     @OP.exec_sign_check
     def execute(
         self,
         ip: OPIO,
     ) -> OPIO:
         template = ip["template_script"]
         numb_models = ip["numb_models"]
         ofiles = []
         osubdirs = []
         assert template == mocked_template_script
         assert numb_models == mocked_numb_models
         for ii in range(numb_models):
             jtmp = template
             jtmp["seed"] = ii
             subdir = Path(train_task_pattern % ii)
             subdir.mkdir(exist_ok=True, parents=True)
             fname = subdir / "input.json"
             with open(fname, "w") as fp:
                 json.dump(jtmp, fp, indent=4)
             osubdirs.append(str(subdir))
             ofiles.append(fname)
         op = OPIO(
             {
                 "task_names": osubdirs,
                 "task_paths": [Path(ii) for ii in osubdirs],
             }
         )
         return op
```
🤖 Prompt for AI Agents
In tests/mocked_ops.py around lines 379 to 411, the class MockedPrepNvNMDTrain
incorrectly inherits from PrepDPTrain instead of PrepNvNMDTrain, causing it to
duplicate MockedPrepDPTrain behavior. To fix this, update the import statement
to include PrepNvNMDTrain if not already imported, and change the base class of
MockedPrepNvNMDTrain from PrepDPTrain to PrepNvNMDTrain to correctly reflect the
intended inheritance and behavior.
```python
        with set_directory(work_dir):
            # open log
            fplog = open("train.log", "w")

            def clean_before_quit():
                fplog.close()
```
Use context manager for file operations.
Opening files without proper context management can lead to resource leaks.
with set_directory(work_dir):
- # open log
- fplog = open("train.log", "w")
-
- def clean_before_quit():
- fplog.close()
+ with open("train.log", "w") as fplog:
+ # Rest of the code that uses fplog
+        # Move all the code that uses fplog inside this context

Note: This requires restructuring the code to ensure all fplog usage is within the context manager. The clean_before_quit function would no longer be needed.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
272-272: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
[refactor] 272-272: Consider using 'with' for resource-allocating operations
(R1732)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py around lines 270 to 276, the file "train.log" is
opened without using a context manager, risking resource leaks. Refactor the
code to open "train.log" using a with statement so that the file is
automatically closed after use. Move all code that uses fplog inside this with
block and remove the clean_before_quit function since it will no longer be
necessary.
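The restructuring the review asks for can be illustrated with a small, self-contained sketch (the `run_with_log` helper and its file names are hypothetical, not the PR's actual code): the `with` statement guarantees the log is flushed and closed even when the training body raises, which is exactly why the explicit `clean_before_quit()` helper becomes unnecessary.

```python
import tempfile
from pathlib import Path

def run_with_log(workdir: Path) -> str:
    """Write to a log inside a context manager, then prove it was closed."""
    log_path = workdir / "train.log"
    try:
        with open(log_path, "w") as fplog:
            fplog.write("training started\n")
            # simulate a failure mid-training; the file is still closed
            raise RuntimeError("simulated training failure")
    except RuntimeError:
        pass  # the context manager already flushed and closed the file
    return log_path.read_text()

with tempfile.TemporaryDirectory() as d:
    contents = run_with_log(Path(d))
```

Because the write happened before the simulated failure and the context manager closed the file on exit, `contents` holds the flushed log text.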
dpgen2/op/run_nvnmd_train.py
Outdated
```python
    @staticmethod
    def normalize_config(data={}):
        ta = RunNvNMDTrain.training_args()

        base = Argument("base", dict, ta)
        data = base.normalize_value(data, trim_pattern="_*")
        base.check_value(data, strict=True)

        return data
```
Avoid mutable default arguments.
Using mutable default arguments can lead to unexpected behavior.
@staticmethod
- def normalize_config(data={}):
+ def normalize_config(data=None):
+ if data is None:
+ data = {}
ta = RunNvNMDTrain.training_args()
base = Argument("base", dict, ta)
data = base.normalize_value(data, trim_pattern="_*")
base.check_value(data, strict=True)
    return data

📝 Committable suggestion
```diff
     @staticmethod
-    def normalize_config(data={}):
+    def normalize_config(data=None):
+        if data is None:
+            data = {}
         ta = RunNvNMDTrain.training_args()
         base = Argument("base", dict, ta)
         data = base.normalize_value(data, trim_pattern="_*")
         base.check_value(data, strict=True)
         return data
```
🧰 Tools
🪛 Ruff (0.11.9)
619-619: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py around lines 618 to 627, the method
normalize_config uses a mutable default argument data={} which can cause
unexpected behavior. Change the default value of data to None and inside the
method, initialize data to an empty dictionary if it is None. This avoids using
a mutable default argument.
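The pitfall behind this B006 finding is easy to demonstrate in isolation (the `normalize_bad`/`normalize_good` names below are illustrative, not from the PR): the default dict is created once at function definition time and shared by every call that omits the argument, while the `None` guard builds a fresh dict per call.

```python
def normalize_bad(data={}):
    # the same dict object is reused across calls
    data.setdefault("calls", 0)
    data["calls"] += 1
    return data

def normalize_good(data=None):
    if data is None:
        data = {}  # fresh dict on every call
    data.setdefault("calls", 0)
    data["calls"] += 1
    return data

bad_first = normalize_bad()
bad_second = normalize_bad()    # mutates the shared default again
good_first = normalize_good()
good_second = normalize_good()  # independent dict, starts from scratch
```

The shared default accumulates state across calls, which is exactly the kind of surprise the suggested `data=None` pattern avoids.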
```python
        if nvnmd_version is not None:
            ret += 'if "${rerun} > 0" then "jump SELF rerun"\n'
        if pka_e is None:
            ret += 'if "${restart} == 0" then "velocity all create ${TEMP} %d"' % (
```
🛠️ Refactor suggestion
rerun variable is never defined in the template

The jump predicates on `${rerun}`, but that variable is not declared in the header (contrast with `${restart}`). LAMMPS will abort with `ERROR: Variable rerun is not defined`. Either declare a default:

```diff
+# declare default
+variable RERUN equal 0
```

or document that callers must always pass `-var rerun ….`
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In dpgen2/exploration/task/lmp/lmp_input.py around lines 152 to 155, the
template uses the variable ${rerun} without defining it in the header, causing
LAMMPS to abort with an undefined variable error. To fix this, define the rerun
variable in the template header similarly to how restart is defined, or clearly
document that callers must always provide the rerun variable via the command
line using -var rerun. Ensure the variable is properly declared before use to
prevent runtime errors.
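One way to make the fix concrete is to declare the default at header-generation time. The sketch below is hypothetical (a `build_header` helper that is not part of the PR), showing only the idea: if the header always emits a `variable rerun equal 0` line, the later `if "${rerun} > 0"` guard can never hit an undefined variable.

```python
def build_header(restart: int = 0, declare_rerun_default: bool = True) -> str:
    """Assemble a minimal LAMMPS-style variable header (illustrative only)."""
    lines = ["variable        restart equal %d" % restart]
    if declare_rerun_default:
        # declaring a default keeps 'if "${rerun} > 0" ...' from aborting
        lines.append("variable        rerun equal 0")
    return "\n".join(lines) + "\n"

header = build_header()
```

Callers that do pass `-var rerun` on the command line would override this declaration, so the default is safe either way.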
Actionable comments posted: 16
🔭 Outside diff range comments (1)
dpgen2/exploration/task/lmp/lmp_input.py (1)
31-56: 🛠️ Refactor suggestion

Add documentation for the new `nvnmd_version` parameter.

The `nvnmd_version` parameter was added but not documented in the function signature or docstring. Please add appropriate documentation explaining its purpose and usage.

Would you like me to help add comprehensive documentation for this parameter, including its expected values and effects on the generated LAMMPS input?
🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 31-31: Too many arguments (24/5)
(R0913)
[refactor] 31-31: Too many positional arguments (24/5)
(R0917)
[refactor] 31-31: Too many local variables (36/15)
(R0914)
[refactor] 31-31: Too many branches (35/12)
(R0912)
[refactor] 31-31: Too many statements (111/50)
(R0915)
🧹 Nitpick comments (27)
dpgen2/op/prep_nvnmd_train.py (2)
7-11: Remove unused import `Tuple`.

`Tuple` is imported but never used in the file – dead code that increases maintenance cost and trips linters (cf. Ruff F401).

```diff
-from typing import (
-    List,
-    Tuple,
-    Union,
-)
+from typing import (
+    List,
+    Union,
+)
```

🧰 Tools
🪛 Ruff (0.11.9)
9-9:
typing.Tupleimported but unusedRemove unused import:
typing.Tuple(F401)
82-88: Prefer `isinstance` over direct type comparison.

`type(template) != list` bypasses inheritance and is flagged by linters (E721).

```diff
-        if type(template) != list:
-            template = [template for ii in range(numb_models)]
+        if not isinstance(template, list):
+            template = [template] * numb_models
```

🧰 Tools
🪛 Ruff (0.11.9)
82-82: Use
isandis notfor type comparisons, orisinstance()for isinstance checks(E721)
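The behavioral difference flagged by E721 can be shown with a toy subclass (the `TemplateList` class below is hypothetical, used only to illustrate why the two checks diverge): a direct `type()` comparison rejects list subclasses, while `isinstance()` accepts them.

```python
class TemplateList(list):
    """Hypothetical list subclass, e.g. carrying template metadata."""

numb_models = 4
template = TemplateList(["input.json"])

# type() comparison treats the subclass as "not a list" and wraps it again:
by_type = template if type(template) == list else [template] * numb_models

# isinstance() recognizes the subclass and leaves the value untouched:
by_isinstance = template if isinstance(template, list) else [template] * numb_models
```

With the `type()` check the subclass instance gets replicated `numb_models` times; with `isinstance()` it passes through unchanged, which is the intent of the original branch.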
dpgen2/entrypoint/args.py (1)
133-176: `doc_config` defined but never used; wrong docstring passed to `config` argument.

Ruff flags `doc_config` as unused. The intention is clearly to attach `doc_config` to the `"config"` argument, but `doc_numb_models` is supplied instead.

```diff
-        Argument(
-            "config",
-            dict,
-            RunNvNMDTrain.training_args(),
-            optional=True,
-            default=RunNvNMDTrain.normalize_config({}),
-            doc=doc_numb_models,
-        ),
+        Argument(
+            "config",
+            dict,
+            RunNvNMDTrain.training_args(),
+            optional=True,
+            default=RunNvNMDTrain.normalize_config({}),
+            doc=doc_config,
+        ),
```

If you decide to keep the current docstring, simply drop the unused `doc_config` variable to silence the linter.
doc_configvariable to silence the linter.🧰 Tools
🪛 Ruff (0.11.9)
135-135: Local variable
doc_configis assigned to but never usedRemove assignment to unused variable
doc_config(F841)
dpgen2/exploration/task/make_task_group_from_config.py (1)
643-651: Code duplication – common branch between MD and NvNMD.

`"lmp-md"` and `"lmp-nvnmd"` share identical logic except for inserting `nvnmd_version`. Consider extracting a helper to reduce duplication.

tests/op/test_prep_nvnmd_train.py (2)
8-16: Prune unused imports to keep the test lean.

`numpy`, several `dflow.python` symbols and `dpgen2` from `.context` aren't referenced. Removing them speeds import time and silences lint warnings (F401).

```diff
-import numpy as np
-from dflow.python import (
-    OP,
-    OPIO,
-    Artifact,
-    OPIOSign,
-)
-from .context import (
-    dpgen2,
-)
+from dflow.python import OPIO
```

🧰 Tools
🪛 Ruff (0.11.9)
8-8:
numpyimported but unusedRemove unused import:
numpy(F401)
10-10:
dflow.python.OPimported but unusedRemove unused import
(F401)
12-12:
dflow.python.Artifactimported but unusedRemove unused import
(F401)
13-13:
dflow.python.OPIOSignimported but unusedRemove unused import
(F401)
165-169: Variable `op` unused – remove or assert something with it.

The `execute` result is stored but not consulted; either assert on it or drop the assignment.

```diff
-        op = self.ptrain.execute(ip)
+        self.ptrain.execute(ip)
```

🧰 Tools
🪛 Ruff (0.11.9)
167-167: Local variable
opis assigned to but never usedRemove assignment to unused variable
op(F841)
dpgen2/superop/block.py (2)
224-272: Consider refactoring to reduce code duplication.The step creation logic is mostly duplicated between the two branches. Consider extracting common parameters and artifacts to reduce duplication:
- if isinstance(prep_run_dp_train_op, PrepRunNvNMDTrain): - prep_run_dp_train = Step( - name + "-prep-run-nvnmd-train", - template=prep_run_dp_train_op, - parameters={ - "block_id": block_steps.inputs.parameters["block_id"], - "train_config": block_steps.inputs.parameters["train_config"], - "numb_models": block_steps.inputs.parameters["numb_models"], - "template_script": block_steps.inputs.parameters["template_script"], - "run_optional_parameter": run_dp_train_optional_parameter, - }, - artifacts={ - "init_models": block_steps.inputs.artifacts["init_models"], - "init_models_ckpt_index": block_steps.inputs.artifacts[ - "init_models_ckpt_index" - ], - "init_models_ckpt_data": block_steps.inputs.artifacts[ - "init_models_ckpt_data" - ], - "init_models_ckpt_meta": block_steps.inputs.artifacts[ - "init_models_ckpt_meta" - ], - "init_data": block_steps.inputs.artifacts["init_data"], - "iter_data": block_steps.inputs.artifacts["iter_data"], - }, - key="--".join( - ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"] - ), - ) - else: - prep_run_dp_train = Step( - name + "-prep-run-dp-train", - template=prep_run_dp_train_op, - parameters={ - "block_id": block_steps.inputs.parameters["block_id"], - "train_config": block_steps.inputs.parameters["train_config"], - "numb_models": block_steps.inputs.parameters["numb_models"], - "template_script": block_steps.inputs.parameters["template_script"], - "run_optional_parameter": run_dp_train_optional_parameter, - }, - artifacts={ - "init_models": block_steps.inputs.artifacts["init_models"], - "init_data": block_steps.inputs.artifacts["init_data"], - "iter_data": block_steps.inputs.artifacts["iter_data"], - }, - key="--".join( - ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"] - ), - ) + # Common parameters for both training types + common_parameters = { + "block_id": block_steps.inputs.parameters["block_id"], + "train_config": block_steps.inputs.parameters["train_config"], + "numb_models": 
block_steps.inputs.parameters["numb_models"], + "template_script": block_steps.inputs.parameters["template_script"], + "run_optional_parameter": run_dp_train_optional_parameter, + } + + # Common artifacts + common_artifacts = { + "init_models": block_steps.inputs.artifacts["init_models"], + "init_data": block_steps.inputs.artifacts["init_data"], + "iter_data": block_steps.inputs.artifacts["iter_data"], + } + + # Determine step name and add checkpoint artifacts if needed + if isinstance(prep_run_dp_train_op, PrepRunNvNMDTrain): + step_name = name + "-prep-run-nvnmd-train" + common_artifacts.update({ + "init_models_ckpt_index": block_steps.inputs.artifacts["init_models_ckpt_index"], + "init_models_ckpt_data": block_steps.inputs.artifacts["init_models_ckpt_data"], + "init_models_ckpt_meta": block_steps.inputs.artifacts["init_models_ckpt_meta"], + }) + else: + step_name = name + "-prep-run-dp-train" + + prep_run_dp_train = Step( + step_name, + template=prep_run_dp_train_op, + parameters=common_parameters, + artifacts=common_artifacts, + key="--".join(["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"]), + )
224-272: Consider refactoring to reduce code duplication.The branching logic is functionally correct but contains significant duplication. Consider extracting common elements:
+ # Common parameters for both training types + common_parameters = { + "block_id": block_steps.inputs.parameters["block_id"], + "train_config": block_steps.inputs.parameters["train_config"], + "numb_models": block_steps.inputs.parameters["numb_models"], + "template_script": block_steps.inputs.parameters["template_script"], + "run_optional_parameter": run_dp_train_optional_parameter, + } + + # Common artifacts + common_artifacts = { + "init_models": block_steps.inputs.artifacts["init_models"], + "init_data": block_steps.inputs.artifacts["init_data"], + "iter_data": block_steps.inputs.artifacts["iter_data"], + } + + # Common key + step_key = "--".join(["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"]) + if isinstance(prep_run_dp_train_op, PrepRunNvNMDTrain): + # Add NvNMD-specific artifacts + nvnmd_artifacts = { + "init_models_ckpt_index": block_steps.inputs.artifacts["init_models_ckpt_index"], + "init_models_ckpt_data": block_steps.inputs.artifacts["init_models_ckpt_data"], + "init_models_ckpt_meta": block_steps.inputs.artifacts["init_models_ckpt_meta"], + } prep_run_dp_train = Step( name + "-prep-run-nvnmd-train", template=prep_run_dp_train_op, - parameters={ - "block_id": block_steps.inputs.parameters["block_id"], - "train_config": block_steps.inputs.parameters["train_config"], - "numb_models": block_steps.inputs.parameters["numb_models"], - "template_script": block_steps.inputs.parameters["template_script"], - "run_optional_parameter": run_dp_train_optional_parameter, - }, - artifacts={ - "init_models": block_steps.inputs.artifacts["init_models"], - "init_models_ckpt_index": block_steps.inputs.artifacts[ - "init_models_ckpt_index" - ], - "init_models_ckpt_data": block_steps.inputs.artifacts[ - "init_models_ckpt_data" - ], - "init_models_ckpt_meta": block_steps.inputs.artifacts[ - "init_models_ckpt_meta" - ], - "init_data": block_steps.inputs.artifacts["init_data"], - "iter_data": block_steps.inputs.artifacts["iter_data"], - }, - key="--".join( 
- ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"] - ), + parameters=common_parameters, + artifacts={**common_artifacts, **nvnmd_artifacts}, + key=step_key, ) else: prep_run_dp_train = Step( name + "-prep-run-dp-train", template=prep_run_dp_train_op, - parameters={ - "block_id": block_steps.inputs.parameters["block_id"], - "train_config": block_steps.inputs.parameters["train_config"], - "numb_models": block_steps.inputs.parameters["numb_models"], - "template_script": block_steps.inputs.parameters["template_script"], - "run_optional_parameter": run_dp_train_optional_parameter, - }, - artifacts={ - "init_models": block_steps.inputs.artifacts["init_models"], - "init_data": block_steps.inputs.artifacts["init_data"], - "iter_data": block_steps.inputs.artifacts["iter_data"], - }, - key="--".join( - ["%s" % block_steps.inputs.parameters["block_id"], "prep-run-train"] - ), + parameters=common_parameters, + artifacts=common_artifacts, + key=step_key, )dpgen2/flow/dpgen_loop.py (1)
500-500: Fix equality comparisons to True.

Direct comparisons to `True` are not Pythonic and flagged by static analysis:

```diff
 steps.outputs.parameters[
     "exploration_scheduler"
 ].value_from_expression = if_expression(
-    _if=(scheduler_step.outputs.parameters["converged"] == True),
+    _if=scheduler_step.outputs.parameters["converged"],
     _then=scheduler_step.outputs.parameters["exploration_scheduler"],
     _else=next_step.outputs.parameters["exploration_scheduler"],
 )
```

Apply similar changes to lines 513, 521, and 529.
Also applies to: 513-513, 521-521, 529-529
🧰 Tools
🪛 Ruff (0.11.9)
500-500: Avoid equality comparisons to
True; useif scheduler_step.outputs.parameters["converged"]:for truth checksReplace with
scheduler_step.outputs.parameters["converged"](E712)
tests/mocked_ops.py (4)
436-436: Remove unused variables.

The variables `task_id` and `init_model_str` are assigned but never used.

```diff
-        task_id = int(ip["task_name"].split(".")[1])
         assert ip["task_name"] in str(ip["task_path"])
         assert "model" in str(ip["init_model"])
         assert ".pb" in str(ip["init_model"])
```

```diff
         script = Path(script).resolve()
         init_model = init_model.resolve()
-        init_model_str = str(init_model)
         init_model_ckpt_meta = init_model_ckpt_meta.resolve()
```

Also applies to: 448-448
🧰 Tools
🪛 Ruff (0.11.9)
436-436: Local variable
task_idis assigned to but never usedRemove assignment to unused variable
task_id(F841)
588-588: Remove unused variable.

```diff
-        task_id = int(ip["task_name"].split(".")[1])
         assert ip["task_name"] in str(ip["task_path"])
```

🧰 Tools
🪛 Ruff (0.11.9)
588-588: Local variable
task_idis assigned to but never usedRemove assignment to unused variable
task_id(F841)
765-765: Remove unused variables.

```diff
-        task_id = int(ip["task_name"].split(".")[1])
         assert task_path.is_dir()
```

```diff
         task_path = task_path.resolve()
         models = [ii.resolve() for ii in models]
-        models_str = [str(ii) for ii in models]
```

Also applies to: 780-780
🧰 Tools
🪛 Ruff (0.11.9)
765-765: Local variable
task_idis assigned to but never usedRemove assignment to unused variable
task_id(F841)
753-828: Consider refactoring to reduce code duplication.The
MockedRunNvNMDclass is nearly identical toMockedRunLmp. Consider extracting common logic into a base class or shared method to improve maintainability.Would you like me to help create a base class that both
MockedRunLmpandMockedRunNvNMDcan inherit from to eliminate the code duplication?🧰 Tools
🪛 Ruff (0.11.9)
765-765: Local variable
task_idis assigned to but never usedRemove assignment to unused variable
task_id(F841)
780-780: Local variable
models_stris assigned to but never usedRemove assignment to unused variable
models_str(F841)
🪛 Pylint (3.3.7)
[refactor] 755-755: Too many local variables (17/15)
(R0914)
dpgen2/superop/prep_run_nvnmd_train.py (1)
1-1: Remove unused imports.

The following imports are not used in the code: `json`, `Path`, `Set`, `Workflow`, `argo_range`, `download_artifact`, `OPIO`, `Artifact`, `BigParameter`, `OPIOSign`, `train_script_name`, `train_task_pattern`.

```diff
-import json
 import os
 from copy import (
     deepcopy,
 )
-from pathlib import (
-    Path,
-)
 from typing import (
     List,
     Optional,
-    Set,
     Type,
 )
 from dflow import (
     InputArtifact,
     InputParameter,
     Inputs,
     OutputArtifact,
     OutputParameter,
     Outputs,
     S3Artifact,
     Step,
     Steps,
-    Workflow,
     argo_len,
-    argo_range,
     argo_sequence,
-    download_artifact,
     upload_artifact,
 )
 from dflow.python import (
     OP,
-    OPIO,
-    Artifact,
-    BigParameter,
-    OPIOSign,
     PythonOPTemplate,
     Slices,
 )
 from dpgen2.constants import (
     train_index_pattern,
-    train_script_name,
-    train_task_pattern,
 )
```

Also applies to: 7-7, 12-12, 26-26, 28-28, 30-30, 35-38, 45-46
🧰 Tools
🪛 Ruff (0.11.9)
1-1:
jsonimported but unusedRemove unused import:
json(F401)
dpgen2/op/run_nvnmd.py (3)
14-15: Remove unused imports.

The following imports are not used: `Set`, `Tuple` from typing; `ArgumentEncoder`, `Variant`, `dargs` from dargs; `FatalError`, `HDF5Datasets` from dflow.python; `lmp_conf_name` from dpgen2.constants.

```diff
 from typing import (
     List,
     Optional,
-    Set,
-    Tuple,
 )
 from dargs import (
     Argument,
-    ArgumentEncoder,
-    Variant,
-    dargs,
 )
 from dflow.python import (
     OP,
     OPIO,
     Artifact,
     BigParameter,
-    FatalError,
-    HDF5Datasets,
     OPIOSign,
     TransientError,
 )
 from dpgen2.constants import (
-    lmp_conf_name,
     lmp_input_name,
     lmp_log_name,
     lmp_model_devi_name,
```

Also applies to: 24-26, 33-34, 40-40
🧰 Tools
🪛 Ruff (0.11.9)
14-14:
typing.Setimported but unusedRemove unused import
(F401)
15-15:
typing.Tupleimported but unusedRemove unused import
(F401)
345-345: Remove unnecessary f-string prefix.

```diff
-            raise RuntimeError(f"last matching index should not be -1, terribly wrong ")
+            raise RuntimeError("last matching index should not be -1, terribly wrong ")
```

🧰 Tools
🪛 Ruff (0.11.9)
345-345: f-string without any placeholders
Remove extraneous
fprefix(F541)
97-263: Consider refactoring the execute method to reduce complexity.The
executemethod is handling multiple responsibilities and has become quite complex (29 local variables, 63 statements). Consider extracting helper methods for:
- Setting up input files and models
- Running LAMMPS commands
- Post-processing results
This would improve readability and maintainability.
Would you like me to help refactor this method into smaller, more focused helper methods?
🧰 Tools
🪛 Ruff (0.11.9)
155-155: Do not use bare
except(E722)
166-166: Do not use bare
except(E722)
🪛 Pylint (3.3.7)
[refactor] 97-97: Too many local variables (29/15)
(R0914)
[refactor] 97-97: Too many branches (14/12)
(R0912)
[refactor] 97-97: Too many statements (63/50)
(R0915)
tests/op/test_run_nvnmd.py (5)
1-1: Remove unused imports.

```diff
-import json
 import os
 import shutil
 import unittest
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
-    OPIOSign,
     TransientError,
 )
 from mock import (
     call,
-    mock,
     patch,
 )

 # isort: off
 from .context import (
-    dpgen2,
 )
```

Also applies to: 12-12, 14-15, 20-20, 26-26
🧰 Tools
🪛 Ruff (0.11.9)
1-1:
jsonimported but unusedRemove unused import:
json(F401)
134-135: Remove unused exception variable.

```diff
-        with self.assertRaises(TransientError) as ee:
-            out = op.execute(
+        with self.assertRaises(TransientError):
+            op.execute(
```

🧰 Tools
🪛 Ruff (0.11.9)
134-134: Local variable
eeis assigned to but never usedRemove assignment to unused variable
ee(F841)
135-135: Local variable
outis assigned to but never usedRemove assignment to unused variable
out(F841)
243-243: Remove unused variable.

```diff
-        out = op.execute(
+        op.execute(
```

🧰 Tools
🪛 Ruff (0.11.9)
243-243: Local variable
outis assigned to but never usedRemove assignment to unused variable
out(F841)
309-309: Remove unused exception variables.

```diff
-        with self.assertRaises(RuntimeError) as re:
+        with self.assertRaises(RuntimeError):
             set_models(input_name, self.model_names)
```

```diff
-        with self.assertRaises(RuntimeError) as re:
+        with self.assertRaises(RuntimeError):
             set_models(input_name, self.model_names)
```

Also applies to: 317-317
🧰 Tools
🪛 Ruff (0.11.9)
309-309: Local variable
reis assigned to but never usedRemove assignment to unused variable
re(F841)
270-270: Remove unnecessary f-string prefix.

```diff
-            (work_dir / (model_name_pattern % ii)).read_text(), f"teacher model"
+            (work_dir / (model_name_pattern % ii)).read_text(), "teacher model"
```

🧰 Tools
🪛 Ruff (0.11.9)
270-270: f-string without any placeholders
Remove extraneous
fprefix(F541)
tests/test_prep_run_nvnmd_train.py (1)
38-45: Simplify the dpgen2 import handling

The try-except block can be simplified since `dpgen2` is not actually used. Consider using `contextlib.suppress` or removing this block entirely if not needed.

```diff
-try:
-    from context import (
-        dpgen2,
-    )
-except ModuleNotFoundError:
-    # case of upload everything to argo, no context needed
-    pass
+# dpgen2 import not needed in this test file
```

🧰 Tools
🪛 Ruff (0.11.9)
38-44: Use
contextlib.suppress(ModuleNotFoundError)instead oftry-except-pass(SIM105)
40-40:
context.dpgen2imported but unused; consider usingimportlib.util.find_specto test for availability(F401)
dpgen2/entrypoint/submit.py (1)
211-219: Simplify conditional logic by removing elif after return

Since the previous condition returns, the `elif` can be replaced with `if` for cleaner code.

```diff
-    elif "nvnmd" in explore_style:
+    if "nvnmd" in explore_style:
         prep_run_explore_op = PrepRunLmp(
             "prep-run-nvnmd",
             PrepLmp,
             RunNvNMD,
             prep_config=prep_explore_config,
             run_config=run_explore_config,
             upload_python_packages=upload_python_packages,
         )
```

Also applies to: 313-318
dpgen2/op/run_nvnmd_train.py (3)
219-223: Simplify version detection with ternary operator

The if-else block can be simplified using a ternary operator.

```diff
-    if "systems" in train_dict["training"]:
-        major_version = "1"
-    else:
-        major_version = "2"
+    major_version = "1" if "systems" in train_dict["training"] else "2"
```

🧰 Tools
🪛 Ruff (0.11.9)
219-222: Use ternary operator
major_version = "1" if "systems" in train_dict["training"] else "2"instead ofif-else-blockReplace
if-else-block withmajor_version = "1" if "systems" in train_dict["training"] else "2"(SIM108)
481-494: Remove unnecessary else after return

The else block is not needed since the if block returns.

```diff
     @staticmethod
     def skip_training(
         work_dir,
         train_dict,
         init_model,
         iter_data,
     ):
         # we have init model and no iter data, skip training
         if (init_model is not None) and (iter_data is None or len(iter_data) == 0):
             with set_directory(work_dir):
                 with open(train_script_name, "w") as fp:
                     json.dump(train_dict, fp, indent=4)
                 Path("train.log").write_text(
                     f"We have init model {init_model} and "
                     f"no iteration training data. "
                     f"The training is skipped.\n"
                 )
                 Path("lcurve.out").touch()
             return True
-        else:
-            return False
+        return False
```
🪛 Pylint (3.3.7)
[refactor] 482-494: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
103-627: Consider refactoring to reduce complexity

The `RunNvNMDTrain` class implementation is functionally correct but has high complexity:

- The `execute` method has 45 local variables and 83 statements
- Consider extracting some logic into separate methods (e.g., training script preparation, CNN training, QNN training)
- This would improve maintainability and testability
🧰 Tools
🪛 Ruff (0.11.9)
199-199: Local variable
train_argsis assigned to but never usedRemove assignment to unused variable
train_args(F841)
219-222: Use ternary operator
major_version = "1" if "systems" in train_dict["training"] else "2"instead ofif-else-blockReplace
if-else-block withmajor_version = "1" if "systems" in train_dict["training"] else "2"(SIM108)
272-272: Use a context manager for opening files
(SIM115)
619-619: Do not use mutable data structures for argument defaults
Replace with
None; initialize within function(B006)
🪛 Pylint (3.3.7)
[refactor] 155-155: Too many local variables (45/15)
(R0914)
[refactor] 272-272: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 155-155: Too many statements (83/50)
(R0915)
[refactor] 391-391: Too many arguments (7/5)
(R0913)
[refactor] 391-391: Too many positional arguments (7/5)
(R0917)
[refactor] 430-430: Too many branches (14/12)
(R0912)
[refactor] 482-494: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (24)
- dpgen2/constants.py (1 hunks)
- dpgen2/entrypoint/args.py (3 hunks)
- dpgen2/entrypoint/submit.py (9 hunks)
- dpgen2/exploration/render/traj_render_lammps.py (2 hunks)
- dpgen2/exploration/task/lmp/lmp_input.py (6 hunks)
- dpgen2/exploration/task/lmp_template_task_group.py (5 hunks)
- dpgen2/exploration/task/make_task_group_from_config.py (4 hunks)
- dpgen2/exploration/task/npt_task_group.py (3 hunks)
- dpgen2/flow/dpgen_loop.py (8 hunks)
- dpgen2/op/__init__.py (2 hunks)
- dpgen2/op/prep_nvnmd_train.py (1 hunks)
- dpgen2/op/run_nvnmd.py (1 hunks)
- dpgen2/op/run_nvnmd_train.py (1 hunks)
- dpgen2/superop/__init__.py (1 hunks)
- dpgen2/superop/block.py (7 hunks)
- dpgen2/superop/prep_run_nvnmd_train.py (1 hunks)
- dpgen2/utils/download_dpgen2_artifacts.py (1 hunks)
- tests/mocked_ops.py (5 hunks)
- tests/op/test_prep_nvnmd_train.py (1 hunks)
- tests/op/test_run_nvnmd.py (1 hunks)
- tests/op/test_run_nvnmd_train.py (1 hunks)
- tests/test_prep_run_nvnmd.py (1 hunks)
- tests/test_prep_run_nvnmd_train.py (1 hunks)
- tests/utils/test_dl_dpgen2_arti.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
dpgen2/superop/__init__.py (1)
- dpgen2/superop/prep_run_nvnmd_train.py (1): `PrepRunNvNMDTrain` (57-153)

dpgen2/op/__init__.py (3)
- dpgen2/op/prep_nvnmd_train.py (1): `PrepNvNMDTrain` (27-119)
- dpgen2/op/run_nvnmd.py (1): `RunNvNMD` (62-313)
- dpgen2/op/run_nvnmd_train.py (1): `RunNvNMDTrain` (103-626)

dpgen2/exploration/render/traj_render_lammps.py (1)
- dpgen2/fp/vasp.py (2): `set_ele_temp` (103-125), `set_ele_temp` (176-188)

tests/op/test_run_nvnmd_train.py (2)
- tests/fake_data_set.py (2): `fake_multi_sys` (22-32), `fake_system` (5-19)
- dpgen2/op/run_nvnmd_train.py (8): `RunNvNMDTrain` (103-626), `_get_data_size_of_all_mult_sys` (650-654), `_make_train_command` (51-100), `normalize_config` (619-626), `decide_init_model` (497-529), `write_data_to_input_script` (391-427), `write_other_to_input_script` (430-472), `execute` (155-388)

dpgen2/superop/block.py (2)
- dpgen2/superop/prep_run_nvnmd_train.py (1): `PrepRunNvNMDTrain` (57-153)
- dpgen2/superop/prep_run_dp_train.py (1): `PrepRunDPTrain` (57-146)

dpgen2/entrypoint/submit.py (5)
- dpgen2/op/prep_nvnmd_train.py (1): `PrepNvNMDTrain` (27-119)
- dpgen2/op/run_nvnmd.py (1): `RunNvNMD` (62-313)
- dpgen2/op/run_nvnmd_train.py (1): `RunNvNMDTrain` (103-626)
- dpgen2/superop/prep_run_nvnmd_train.py (1): `PrepRunNvNMDTrain` (57-153)
- dpgen2/utils/artifact_uri.py (1): `get_artifact_from_uri` (8-14)

tests/mocked_ops.py (4)
- dpgen2/op/run_nvnmd.py (1): `RunNvNMD` (62-313)
- dpgen2/op/run_nvnmd_train.py (2): `RunNvNMDTrain` (103-626), `execute` (155-388)
- dpgen2/op/prep_dp_train.py (1): `PrepDPTrain` (27-133)
- dpgen2/utils/chdir.py (1): `chdir` (47-73)
🪛 Ruff (0.11.9)
dpgen2/superop/__init__.py
20-20: .prep_run_nvnmd_train.PrepRunNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
dpgen2/superop/prep_run_nvnmd_train.py
1-1: json imported but unused
Remove unused import: json
(F401)
7-7: pathlib.Path imported but unused
Remove unused import: pathlib.Path
(F401)
12-12: typing.Set imported but unused
Remove unused import: typing.Set
(F401)
26-26: dflow.Workflow imported but unused
Remove unused import
(F401)
28-28: dflow.argo_range imported but unused
Remove unused import
(F401)
30-30: dflow.download_artifact imported but unused
Remove unused import
(F401)
35-35: dflow.python.OPIO imported but unused
Remove unused import
(F401)
36-36: dflow.python.Artifact imported but unused
Remove unused import
(F401)
37-37: dflow.python.BigParameter imported but unused
Remove unused import
(F401)
38-38: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
45-45: dpgen2.constants.train_script_name imported but unused
Remove unused import
(F401)
46-46: dpgen2.constants.train_task_pattern imported but unused
Remove unused import
(F401)
161-161: Do not perform function call normalize_step_dict in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
162-162: Do not perform function call normalize_step_dict in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
dpgen2/op/__init__.py
26-26: .prep_nvnmd_train.PrepNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
45-45: .run_nvnmd.RunNvNMD imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
48-48: .run_nvnmd_train.RunNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
dpgen2/op/prep_nvnmd_train.py
9-9: typing.Tuple imported but unused
Remove unused import: typing.Tuple
(F401)
82-82: Use is and is not for type comparisons, or isinstance() for isinstance checks
(E721)
tests/op/test_prep_nvnmd_train.py
8-8: numpy imported but unused
Remove unused import: numpy
(F401)
10-10: dflow.python.OP imported but unused
Remove unused import
(F401)
12-12: dflow.python.Artifact imported but unused
Remove unused import
(F401)
13-13: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
21-21: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
167-167: Local variable op is assigned to but never used
Remove assignment to unused variable op
(F841)
dpgen2/entrypoint/args.py
135-135: Local variable doc_config is assigned to but never used
Remove assignment to unused variable doc_config
(F841)
tests/op/test_run_nvnmd_train.py
1-1: itertools imported but unused
Remove unused import: itertools
(F401)
3-3: os imported but unused
Remove unused import: os
(F401)
10-10: numpy imported but unused
Remove unused import: numpy
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
16-16: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
17-17: dflow.python.TransientError imported but unused
Remove unused import
(F401)
30-30: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
36-36: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
41-41: dpgen2.op.run_nvnmd_train._make_train_command imported but unused
Remove unused import: dpgen2.op.run_nvnmd_train._make_train_command
(F401)
610-610: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
611-611: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
dpgen2/flow/dpgen_loop.py
513-513: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
521-521: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
529-529: Avoid equality comparisons to True; use if scheduler_step.outputs.parameters["converged"]: for truth checks
Replace with scheduler_step.outputs.parameters["converged"]
(E712)
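The three E712 findings above concern the same pattern; as a quick illustration of why the truthiness form is equivalent for plain booleans. The `converged` flag below is a stand-in, not the dflow parameter object (where the comparison may be serialized into an Argo condition string, so the lint suggestion may not apply verbatim):

```python
def should_stop(converged):
    # flagged by E712: explicit comparison to True
    if converged == True:  # noqa: E712
        return "stop"
    return "continue"


def should_stop_fixed(converged):
    # preferred: rely on truthiness directly
    if converged:
        return "stop"
    return "continue"


# both forms agree for plain booleans
assert should_stop(True) == should_stop_fixed(True) == "stop"
assert should_stop(False) == should_stop_fixed(False) == "continue"
```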
tests/test_prep_run_nvnmd.py
1-1: json imported but unused
Remove unused import: json
(F401)
3-3: pickle imported but unused
Remove unused import: pickle
(F401)
12-12: typing.Set imported but unused
Remove unused import: typing.Set
(F401)
15-15: jsonpickle imported but unused
Remove unused import: jsonpickle
(F401)
16-16: numpy imported but unused
Remove unused import: numpy
(F401)
18-18: dflow.InputArtifact imported but unused
Remove unused import
(F401)
19-19: dflow.InputParameter imported but unused
Remove unused import
(F401)
20-20: dflow.Inputs imported but unused
Remove unused import
(F401)
21-21: dflow.OutputArtifact imported but unused
Remove unused import
(F401)
22-22: dflow.OutputParameter imported but unused
Remove unused import
(F401)
23-23: dflow.Outputs imported but unused
Remove unused import
(F401)
24-24: dflow.S3Artifact imported but unused
Remove unused import
(F401)
26-26: dflow.Steps imported but unused
Remove unused import
(F401)
28-28: dflow.argo_range imported but unused
Remove unused import
(F401)
33-33: dflow.python.OP imported but unused
Remove unused import
(F401)
35-35: dflow.python.Artifact imported but unused
Remove unused import
(F401)
36-36: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
37-37: dflow.python.PythonOPTemplate imported but unused
Remove unused import
(F401)
40-46: Use contextlib.suppress(ModuleNotFoundError) instead of try-except-pass
(SIM105)
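SIM105 refers to the optional-import pattern used in the test's context import; a minimal sketch of the two equivalent forms (the module name below is made up):

```python
import contextlib
import importlib

# Before: try/except/pass (the form flagged by SIM105)
try:
    mod = importlib.import_module("some_optional_module_xyz")  # illustrative name
except ModuleNotFoundError:
    mod = None

# After: contextlib.suppress expresses the same intent in one statement
maybe_mod = None
with contextlib.suppress(ModuleNotFoundError):
    maybe_mod = importlib.import_module("some_optional_module_xyz")

# Either way, execution continues when the module is absent
assert mod is None and maybe_mod is None
```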
42-42: context.dpgen2 imported but unused; consider using importlib.util.find_spec to test for availability
(F401)
66-66: dpgen2.constants.model_name_pattern imported but unused
Remove unused import
(F401)
67-67: dpgen2.constants.train_log_name imported but unused
Remove unused import
(F401)
68-68: dpgen2.constants.train_script_name imported but unused
Remove unused import
(F401)
69-69: dpgen2.constants.train_task_pattern imported but unused
Remove unused import
(F401)
tests/op/test_run_nvnmd.py
1-1: json imported but unused
Remove unused import: json
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
15-15: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
20-20: mock.mock imported but unused
Remove unused import: mock.mock
(F401)
26-26: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
134-134: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
135-135: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
243-243: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
270-270: f-string without any placeholders
Remove extraneous f prefix
(F541)
309-309: Local variable re is assigned to but never used
Remove assignment to unused variable re
(F841)
317-317: Local variable re is assigned to but never used
Remove assignment to unused variable re
(F841)
dpgen2/op/run_nvnmd.py
14-14: typing.Set imported but unused
Remove unused import
(F401)
15-15: typing.Tuple imported but unused
Remove unused import
(F401)
24-24: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
25-25: dargs.Variant imported but unused
Remove unused import
(F401)
26-26: dargs.dargs imported but unused
Remove unused import
(F401)
33-33: dflow.python.FatalError imported but unused
Remove unused import
(F401)
34-34: dflow.python.HDF5Datasets imported but unused
Remove unused import
(F401)
40-40: dpgen2.constants.lmp_conf_name imported but unused
Remove unused import: dpgen2.constants.lmp_conf_name
(F401)
155-155: Do not use bare except
(E722)
166-166: Do not use bare except
(E722)
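For the bare-except findings, the usual fix is to catch the narrowest exceptions that can actually occur, so `SystemExit` and `KeyboardInterrupt` are not swallowed silently. A self-contained sketch of the idiom, not the actual `run_nvnmd.py` code:

```python
import logging


def parse_major_version(text):
    # A bare `except:` here would also trap SystemExit/KeyboardInterrupt.
    # Catching only the exceptions the parsing can raise keeps failures visible.
    try:
        return int(text.strip().split(".")[0])
    except (ValueError, AttributeError) as exc:
        logging.warning("could not parse version %r: %s", text, exc)
        return None


assert parse_major_version("2.14") == 2   # normal path
assert parse_major_version(None) is None  # AttributeError handled, logged
assert parse_major_version("abc") is None # ValueError handled, logged
```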
308-308: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
345-345: f-string without any placeholders
Remove extraneous f prefix
(F541)
dpgen2/op/run_nvnmd_train.py
14-14: typing.Tuple imported but unused
Remove unused import: typing.Tuple
(F401)
21-21: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
22-22: dargs.Variant imported but unused
Remove unused import
(F401)
23-23: dargs.dargs imported but unused
Remove unused import
(F401)
34-34: dflow.python.TransientError imported but unused
Remove unused import: dflow.python.TransientError
(F401)
41-41: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
199-199: Local variable train_args is assigned to but never used
Remove assignment to unused variable train_args
(F841)
219-222: Use ternary operator major_version = "1" if "systems" in train_dict["training"] else "2" instead of if-else-block
Replace if-else-block with major_version = "1" if "systems" in train_dict["training"] else "2"
(SIM108)
272-272: Use a context manager for opening files
(SIM115)
619-619: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
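B006 exists because a mutable default is evaluated once at function definition time and then shared across all calls; a minimal demonstration of the pitfall and the suggested `None` idiom (the function names are illustrative, not the `normalize_config` signature):

```python
def append_bad(item, acc=[]):  # B006: the same list object is reused across calls
    acc.append(item)
    return acc


def append_good(item, acc=None):  # fix: create the list inside the function
    if acc is None:
        acc = []
    acc.append(item)
    return acc


assert append_bad(1) == [1]
assert append_bad(2) == [1, 2]   # surprising: state leaked from the first call
assert append_good(1) == [1]
assert append_good(2) == [2]     # independent calls stay independent
```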
tests/mocked_ops.py
436-436: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
448-448: Local variable init_model_str is assigned to but never used
Remove assignment to unused variable init_model_str
(F841)
588-588: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
765-765: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
780-780: Local variable models_str is assigned to but never used
Remove assignment to unused variable models_str
(F841)
tests/test_prep_run_nvnmd_train.py
1-1: json imported but unused
Remove unused import: json
(F401)
10-10: typing.List imported but unused
Remove unused import
(F401)
11-11: typing.Set imported but unused
Remove unused import
(F401)
14-14: numpy imported but unused
Remove unused import: numpy
(F401)
16-16: dflow.InputArtifact imported but unused
Remove unused import
(F401)
17-17: dflow.InputParameter imported but unused
Remove unused import
(F401)
18-18: dflow.Inputs imported but unused
Remove unused import
(F401)
19-19: dflow.OutputArtifact imported but unused
Remove unused import
(F401)
20-20: dflow.OutputParameter imported but unused
Remove unused import
(F401)
21-21: dflow.Outputs imported but unused
Remove unused import
(F401)
22-22: dflow.S3Artifact imported but unused
Remove unused import
(F401)
24-24: dflow.Steps imported but unused
Remove unused import
(F401)
26-26: dflow.argo_range imported but unused
Remove unused import
(F401)
31-31: dflow.python.OP imported but unused
Remove unused import
(F401)
33-33: dflow.python.Artifact imported but unused
Remove unused import
(F401)
34-34: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
35-35: dflow.python.PythonOPTemplate imported but unused
Remove unused import
(F401)
38-44: Use contextlib.suppress(ModuleNotFoundError) instead of try-except-pass
(SIM105)
40-40: context.dpgen2 imported but unused; consider using importlib.util.find_spec to test for availability
(F401)
101-101: Do not assign a lambda expression, use a def
Rewrite revised_fname as a def
(E731)
🪛 Pylint (3.3.7)
dpgen2/superop/prep_run_nvnmd_train.py
[refactor] 58-58: Too many arguments (9/5)
(R0913)
[refactor] 58-58: Too many positional arguments (9/5)
(R0917)
[refactor] 156-156: Too many arguments (9/5)
(R0913)
[refactor] 156-156: Too many positional arguments (9/5)
(R0917)
[refactor] 156-156: Too many local variables (16/15)
(R0914)
tests/op/test_prep_nvnmd_train.py
[refactor] 55-55: Too few public methods (1/2)
(R0903)
tests/op/test_run_nvnmd_train.py
[refactor] 47-47: Too many instance attributes (27/7)
(R0902)
[refactor] 638-638: Too many instance attributes (15/7)
(R0902)
dpgen2/entrypoint/submit.py
[refactor] 313-318: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
dpgen2/op/run_nvnmd.py
[refactor] 97-97: Too many local variables (29/15)
(R0914)
[refactor] 97-97: Too many branches (14/12)
(R0912)
[refactor] 97-97: Too many statements (63/50)
(R0915)
[refactor] 369-372: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it
(R1720)
[refactor] 438-438: Too many local variables (19/15)
(R0914)
dpgen2/op/run_nvnmd_train.py
[refactor] 155-155: Too many local variables (45/15)
(R0914)
[refactor] 272-272: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 155-155: Too many statements (83/50)
(R0915)
[refactor] 391-391: Too many arguments (7/5)
(R0913)
[refactor] 391-391: Too many positional arguments (7/5)
(R0917)
[refactor] 430-430: Too many branches (14/12)
(R0912)
[refactor] 482-494: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
tests/mocked_ops.py
[refactor] 415-415: Too many local variables (32/15)
(R0914)
[refactor] 415-415: Too many statements (100/50)
(R0915)
[refactor] 567-567: Too many local variables (27/15)
(R0914)
[refactor] 567-567: Too many statements (76/50)
(R0915)
[refactor] 755-755: Too many local variables (17/15)
(R0914)
tests/test_prep_run_nvnmd_train.py
[refactor] 80-80: Too many arguments (9/5)
(R0913)
[refactor] 80-80: Too many positional arguments (9/5)
(R0917)
[refactor] 198-198: Too many arguments (8/5)
(R0913)
[refactor] 198-198: Too many positional arguments (8/5)
(R0917)
[refactor] 269-269: Too many instance attributes (9/7)
(R0902)
[refactor] 371-371: Too many instance attributes (15/7)
(R0902)
🔇 Additional comments (26)
dpgen2/superop/__init__.py (1)
19-21: LGTM! The import follows the established pattern.

The addition of the `PrepRunNvNMDTrain` import is consistent with the module's design pattern of exposing operator classes through `__init__.py`. The static analysis warning about an unused import is a false positive - this import makes the class available as part of the package's public API, similar to the other operator imports in this file.

🧰 Tools
🪛 Ruff (0.11.9)
20-20: `.prep_run_nvnmd_train.PrepRunNvNMDTrain` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
dpgen2/utils/download_dpgen2_artifacts.py (1)
57-59: Checkpoint artifact support added correctly.

The addition of checkpoint artifact keys (`init_models_ckpt_meta`, `init_models_ckpt_data`, `init_models_ckpt_index` as inputs and `models_ckpt_meta`, `models_ckpt_data`, `models_ckpt_index` as outputs) properly extends the download configuration for the "prep-run-train" step. This follows the established pattern and enables downloading of the TensorFlow-style checkpoint files needed for NvNMD training workflows.

Also applies to: 64-66
dpgen2/op/__init__.py (1)
25-27: NvNMD operator imports added correctly.

The three new operator imports (`PrepNvNMDTrain`, `RunNvNMD`, and `RunNvNMDTrain`) follow the established pattern of exposing operator classes through the package's `__init__.py`. The static analysis warnings about unused imports are false positives - these imports make the NvNMD operators available as part of the `dpgen2.op` public API.

Also applies to: 44-49

🧰 Tools
🪛 Ruff (0.11.9)
26-26: `.prep_nvnmd_train.PrepNvNMDTrain` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
tests/utils/test_dl_dpgen2_arti.py (1)
69-83: Test expectations updated correctly for checkpoint artifacts.

The test updates properly add mock call expectations for the new checkpoint artifacts in both `test_train_download` and `test_update_finished_steps_none_steps`. The added expectations match the checkpoint artifact keys introduced in the download configuration, ensuring comprehensive test coverage for the NvNMD checkpoint handling.

Also applies to: 104-118, 300-314, 335-349
dpgen2/exploration/render/traj_render_lammps.py (1)
129-138: Significant refactoring of trajectory processing - verify performance impact.

This change fundamentally alters how trajectories are processed, moving from bulk extraction via `sub_system()` to creating individual `dpdata.System` objects for each selected frame. While this enables per-frame customization (nopbc and electronic temperature), it may impact performance when processing many frames.

Was this refactoring specifically required for NvNMD support? Please verify that the performance impact is acceptable for typical use cases with large trajectories.
dpgen2/constants.py (1)
9-12: Checkpoint filename patterns look off – double "model.ckpt" & path separator.

Typical TF checkpoints are `model.ckpt-005.meta`, `model.ckpt-005.index`, etc. Patterns like `model.ckpt.%03d/model.ckpt.meta` create a directory and then duplicate the filename, yielding e.g. `model.ckpt.005/model.ckpt.meta`. Verify that downstream code really expects this layout.

dpgen2/superop/block.py (6)
67-69: LGTM! The import is properly organized with other superop imports.

94-94: Type annotation correctly extended.

Good use of a Union type to support both training operations while maintaining backward compatibility.

119-121: Checkpoint artifacts properly added.

The optional checkpoint artifacts correctly follow TensorFlow checkpoint file conventions and maintain backward compatibility.

Also applies to: 130-132

67-69: LGTM! The import follows the existing pattern and is properly organized with other similar imports.

94-94: Well-structured extension for NvNMD support.

The type annotation and new checkpoint artifacts are properly integrated:
- Correct use of Union type for supporting both training operations
- Checkpoint artifacts appropriately marked as optional
- Consistent naming convention maintained

Also applies to: 119-121, 130-132

285-287: Correct handling of operation-specific artifacts.

The conditional logic properly handles:
- Different model artifact names ("nvnmodels" vs "models")
- Checkpoint artifact linking only for NvNMD operations

The implementation is consistent and follows the established pattern.

Also applies to: 366-375
dpgen2/exploration/task/npt_task_group.py (1)
52-52: LGTM! The `nvnmd_version` parameter is properly added with correct typing and backward compatibility.
Also applies to: 77-77, 137-137
dpgen2/exploration/task/lmp_template_task_group.py (3)
50-50: LGTM! The `nvnmd_version` parameter is properly added and propagated through the template revision functions.

Also applies to: 57-57, 67-67, 73-73, 75-76

179-192: Conditional logic correctly implements NVNMD-specific changes.

The modifications properly handle:
- Switching to `pair_style nvnmd` with a fixed model file
- Adding force components to the dump output
- Inserting the rerun jump command

Also applies to: 202-210

223-229: LGTM! The rerun function correctly adds LAMMPS commands for trajectory reprocessing with force data.
dpgen2/flow/dpgen_loop.py (1)
189-191: Checkpoint artifacts properly integrated.

The optional checkpoint artifacts are consistently added to both `ConcurrentLearningLoop` and `ConcurrentLearning` classes.
Also applies to: 200-202, 287-289, 297-299
tests/mocked_ops.py (3)
43-46: LGTM! The new constants for model checkpoint patterns are correctly imported and consistent with the checkpoint handling introduced in the broader NvNMD support.

102-107: LGTM! The imports for NvNMD operators are correctly placed and necessary for the mock implementations.

128-141: LGTM! The function correctly creates mock checkpoint files following the established patterns and structure, properly supporting the NvNMD checkpoint handling.
tests/test_prep_run_nvnmd_train.py (1)
239-481: Well-structured test implementation!

The test classes properly cover unit and integration testing for NvNMD training functionality:
- Appropriate setup/teardown methods for resource management
- Clear test method names and assertions
- Good separation between mocked unit tests and integration tests
- Proper handling of dflow artifacts in the integration test
🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 269-269: Too many instance attributes (9/7)
(R0902)
[refactor] 371-371: Too many instance attributes (15/7)
(R0902)
dpgen2/entrypoint/submit.py (4)
108-108: Properly organized NvNMD imports

The new NvNMD-related imports are correctly placed and follow the existing import organization pattern.
Also applies to: 115-116, 131-131
189-199: Correct implementation of the dp-nvnmd training style

The new training style follows the established pattern and properly initializes `PrepRunNvNMDTrain` with all required parameters.

535-543: Proper validation of init model paths

The validation correctly ensures that the number of provided init models matches the expected `numb_models` parameter, maintaining consistency with the dp training style.

702-704: Checkpoint artifacts properly integrated

The three checkpoint artifacts (meta, index, data) are correctly added to support NvNMD training requirements, following the existing artifact pattern.
tests/test_prep_run_nvnmd.py (1)
122-307: Well-implemented test suite for NvNMD operations

The test classes properly cover:
- LAMMPS task preparation with file content verification
- Mocked NvNMD run operations with output validation
- Integration testing with workflow submission and artifact handling
- Proper resource cleanup in tearDown methods
dpgen2/op/prep_nvnmd_train.py
Outdated
```python
    def _script_rand_seed(
        self,
        input_dict,
    ):
        jtmp = input_dict.copy()

        # the key "seed" in "nvnmd" is used to set the random seed for the network parameters, it is developing.
        jtmp["nvnmd"]["seed"] = random.randrange(sys.maxsize) % (2**32)
        jtmp["training"]["seed"] = random.randrange(sys.maxsize) % (2**32)
        return jtmp
```
Shallow‐copy mutates the caller’s object — use deepcopy.
input_dict.copy() copies only the first level. Updating nested keys (jtmp["nvnmd"]["seed"] & jtmp["training"]["seed"]) mutates the original template that the caller may re-use later, breaking referential transparency.
```diff
+import copy
 ...
-        jtmp = input_dict.copy()
+        jtmp = copy.deepcopy(input_dict)
```

Also consider guarding against missing "nvnmd" / "training" keys to avoid KeyError.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
import copy

    def _script_rand_seed(
        self,
        input_dict,
    ):
        jtmp = copy.deepcopy(input_dict)
        # the key "seed" in "nvnmd" is used to set the random seed for the network parameters, it is developing.
        jtmp["nvnmd"]["seed"] = random.randrange(sys.maxsize) % (2**32)
        jtmp["training"]["seed"] = random.randrange(sys.maxsize) % (2**32)
        return jtmp
```
🤖 Prompt for AI Agents
In dpgen2/op/prep_nvnmd_train.py around lines 110 to 119, replace the shallow
copy of input_dict with a deep copy to prevent mutations of nested dictionaries
from affecting the original input. Use the deepcopy function from the copy
module to create jtmp. Additionally, add checks to ensure the keys "nvnmd" and
"training" exist in jtmp before setting their "seed" values to avoid KeyError
exceptions.
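The shallow-copy pitfall flagged in this thread is easy to reproduce in isolation; a minimal sketch, with a dict shaped like the training-script template but with made-up values:

```python
import copy

template = {"nvnmd": {"seed": 0}, "training": {"seed": 0}}


def seed_shallow(input_dict, new_seed):
    jtmp = input_dict.copy()          # copies only the top level
    jtmp["nvnmd"]["seed"] = new_seed  # mutates the nested dict shared with the caller
    return jtmp


def seed_deep(input_dict, new_seed):
    jtmp = copy.deepcopy(input_dict)  # nested dicts are duplicated too
    jtmp["nvnmd"]["seed"] = new_seed
    return jtmp


seed_shallow(template, 42)
assert template["nvnmd"]["seed"] == 42  # the caller's template was silently changed

template = {"nvnmd": {"seed": 0}, "training": {"seed": 0}}
seed_deep(template, 42)
assert template["nvnmd"]["seed"] == 0   # deepcopy leaves the template intact
```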
```diff
@@ -0,0 +1,796 @@
+import itertools
```
Remove unused imports.
The following imports are not used in the test file and should be removed:
```diff
-import itertools
 import json
-import os
 import shutil
 import unittest
 from pathlib import (
     Path,
 )
-import numpy as np
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
     FatalError,
-    OPIOSign,
-    TransientError,
 )
 from fake_data_set import (
     fake_multi_sys,
     fake_system,
 )
 from mock import (
     call,
     patch,
 )

 # isort: off
 from .context import (
-    dpgen2,
 )
 from dpgen2.constants import (
     train_script_name,
     train_cnn_script_name,
     train_qnn_script_name,
-    train_task_pattern,
 )
 from dpgen2.op.run_nvnmd_train import (
     RunNvNMDTrain,
     _get_data_size_of_all_mult_sys,
-    _make_train_command,
 )
```

Also applies to: 3-3, 10-10, 12-12, 14-14, 16-17, 30-30, 36-36, 41-41
🧰 Tools
🪛 Ruff (0.11.9)
1-1: itertools imported but unused
Remove unused import: itertools
(F401)
🤖 Prompt for AI Agents
In tests/op/test_run_nvnmd_train.py at lines 1, 3, 10, 12, 14, 16-17, 30, 36,
and 41, there are unused import statements. Identify and remove all import
statements that are not used anywhere in the file to clean up the code and
improve readability.
```python
        with self.assertRaises(FatalError) as ee:
            out = ptrain.execute(
                OPIO(
                    {
                        "config": config,
                        "task_name": task_name,
                        "task_path": Path(task_path),
                        "init_model": Path(self.init_model),
                        "init_model_ckpt_meta": Path(self.init_model_ckpt_meta),
                        "init_model_ckpt_data": Path(self.init_model_ckpt_data),
                        "init_model_ckpt_index": Path(self.init_model_ckpt_index),
                        "init_data": [Path(ii) for ii in self.init_data],
                        "iter_data": [Path(ii) for ii in self.iter_data],
                    }
                )
            )
```
Remove unused variables in exception handling.
The variables ee and out are assigned but never used:
```diff
         ptrain = RunNvNMDTrain()
-        with self.assertRaises(FatalError) as ee:
-            out = ptrain.execute(
+        with self.assertRaises(FatalError):
+            ptrain.execute(
                 OPIO(
                     {
                         "config": config,
                         "task_name": task_name,
                         "task_path": Path(task_path),
                         "init_model": Path(self.init_model),
                         "init_model_ckpt_meta": Path(self.init_model_ckpt_meta),
                         "init_model_ckpt_data": Path(self.init_model_ckpt_data),
                         "init_model_ckpt_index": Path(self.init_model_ckpt_index),
                         "init_data": [Path(ii) for ii in self.init_data],
                         "iter_data": [Path(ii) for ii in self.iter_data],
                     }
                 )
             )
```
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
         ptrain = RunNvNMDTrain()
-        with self.assertRaises(FatalError) as ee:
-            out = ptrain.execute(
+        with self.assertRaises(FatalError):
+            ptrain.execute(
                 OPIO(
                     {
                         "config": config,
                         "task_name": task_name,
                         "task_path": Path(task_path),
                         "init_model": Path(self.init_model),
                         "init_model_ckpt_meta": Path(self.init_model_ckpt_meta),
                         "init_model_ckpt_data": Path(self.init_model_ckpt_data),
                         "init_model_ckpt_index": Path(self.init_model_ckpt_index),
                         "init_data": [Path(ii) for ii in self.init_data],
                         "iter_data": [Path(ii) for ii in self.iter_data],
                     }
                 )
             )
```
🧰 Tools
🪛 Ruff (0.11.9)
610-610: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
611-611: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
🤖 Prompt for AI Agents
In tests/op/test_run_nvnmd_train.py around lines 610 to 625, the variables 'ee'
and 'out' are assigned in the exception handling block but never used. Remove
the assignment to 'ee' in the 'with self.assertRaises(FatalError)' statement and
avoid assigning the result of 'ptrain.execute' to 'out' since it is not used.
This will clean up the code by eliminating unused variables.
dpgen2/flow/dpgen_loop.py
Outdated
```python
        if (
            hasattr(block_step.outputs.artifacts["models_ckpt_meta"]._from, "path")
            and hasattr(block_step.outputs.artifacts["models_ckpt_data"]._from, "path")
            and hasattr(block_step.outputs.artifacts["models_ckpt_index"]._from, "path")
        ):
```
💡 Verification agent
🧩 Analysis chain
Consider a more robust check for artifact availability.
The current hasattr(artifact._from, "path") checks seem fragile and implementation-dependent. Consider using a more explicit method to determine artifact availability.
Can you verify if there's a more robust API method to check if an artifact is available/connected? The current approach relies on internal implementation details.
Also applies to: 509-511
🏁 Script executed:
```shell
#!/bin/bash
# Search for artifact availability checking patterns in the codebase
rg -A 5 "hasattr.*_from.*path" --type py

# Look for artifact-related methods that might provide better APIs
ast-grep --pattern 'class $_ {
  $$$
  def $_($$$) {
    $$$
    artifact
    $$$
  }
  $$$
}'
```

Length of output: 3468
🏁 Script executed:
```shell
#!/bin/bash
# Locate all class definitions related to artifacts to inspect available methods
rg -n "class .*Artifact" --type py

# List all Python files whose names include 'artifact'
fd artifact -t f -e py
```

Length of output: 350
🏁 Script executed:
#!/bin/bash
# Show class and method definitions in artifact-related utils
echo "===== obj_artifact.py ====="
rg -n "class " dpgen2/utils/obj_artifact.py
rg -n "def " dpgen2/utils/obj_artifact.py
echo -e "\n===== artifact_uri.py ====="
rg -n "class " dpgen2/utils/artifact_uri.py
rg -n "def " dpgen2/utils/artifact_uri.py
echo -e "\n===== download_dpgen2_artifacts.py ====="
rg -n "class " dpgen2/utils/download_dpgen2_artifacts.py
rg -n "def " dpgen2/utils/download_dpgen2_artifacts.py
Length of output: 1614
No existing API to check artifact availability – introduce a helper method
I inspected dpgen2/utils and the core dpgen2/flow modules and didn’t find a built-in call on the Artifact or StepOutputs objects that reliably signals “artifact is ready.” Relying on brittle hasattr(..., "path") checks tightly couples to internal implementation.
• Files/lines to refactor:
dpgen2/flow/dpgen_loop.py around lines 458–462, 509–511, and 633–644.
• Suggested approach:
- Add an explicit method on the Artifact (or StepOutputs) class, e.g.
  class Artifact:
      ...
      def is_available(self) -> bool:
          return hasattr(self._from, "path") and self._from.path is not None
- Replace all hasattr(artifact._from, "path") checks with artifact.is_available().
This will decouple callers from internal attributes, centralize the logic, and make intent clearer.
🤖 Prompt for AI Agents
In dpgen2/flow/dpgen_loop.py around lines 458-462, 509-511, and 633-644, the
current checks for artifact availability use fragile hasattr calls on internal
_from.path attributes. To fix this, define a new method is_available() on the
Artifact class that returns True if the artifact's _from attribute has a
non-None path, encapsulating the check. Then replace all existing
hasattr(artifact._from, "path") checks with calls to artifact.is_available() to
improve robustness and clarity.
    Dict,
    List,
    Optional,
    Tuple,
🛠️ Refactor suggestion
Remove unused imports
Several imports are not used and should be removed:
- Tuple from typing (line 14)
- ArgumentEncoder, Variant, dargs from dargs (lines 21-23)
- TransientError from dflow.python (line 34)
- train_task_pattern from constants (line 41)
from typing import (
Dict,
List,
Optional,
- Tuple,
Union,
)
from dargs import (
Argument,
- ArgumentEncoder,
- Variant,
- dargs,
)
from dflow.python import (
OP,
OPIO,
Artifact,
BigParameter,
FatalError,
NestedDict,
OPIOSign,
Parameter,
- TransientError,
)
from dpgen2.constants import (
train_cnn_script_name,
train_qnn_script_name,
train_script_name,
- train_task_pattern,
)Also applies to: 21-23, 34-34, 41-41
🧰 Tools
🪛 Ruff (0.11.9)
14-14: typing.Tuple imported but unused
Remove unused import: typing.Tuple
(F401)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py at lines 14, 21-23, 34, and 41, remove the
unused imports Tuple, ArgumentEncoder, Variant, dargs, TransientError, and
train_task_pattern respectively. This cleanup involves deleting these import
statements to keep the codebase clean and avoid unnecessary dependencies.
mixed_type = ip["optional_parameter"]["mixed_type"]
config = ip["config"] if ip["config"] is not None else {}
dp_command = ip["config"].get("command", "dp").split()
train_args = config.get("train_args", "")
Remove unused variable assignment
The train_args variable is assigned but never used in this scope.
- train_args = config.get("train_args", "")
config = RunNvNMDTrain.normalize_config(config)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- train_args = config.get("train_args", "")
  config = RunNvNMDTrain.normalize_config(config)
🧰 Tools
🪛 Ruff (0.11.9)
199-199: Local variable train_args is assigned to but never used
Remove assignment to unused variable train_args
(F841)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py at line 199, the variable train_args is assigned
but never used. Remove the assignment statement for train_args to clean up the
code and avoid unused variable warnings.
with set_directory(work_dir):
    # open log
    fplog = open("train.log", "w")

    def clean_before_quit():
        fplog.close()
Use context manager for file operations
The log file should be opened using a context manager to ensure proper resource cleanup.
with set_directory(work_dir):
- # open log
- fplog = open("train.log", "w")
-
- def clean_before_quit():
- fplog.close()
+ # Write to log file using context manager
+ log_content = []
+
+ def add_to_log(content):
+        log_content.append(content)
Then update all fplog.write() calls to use add_to_log() and write the log at the end:
- clean_before_quit()
+ # Write all log content
+ with open("train.log", "w") as fplog:
+        fplog.write("".join(log_content))
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
272-272: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
[refactor] 272-272: Consider using 'with' for resource-allocating operations
(R1732)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py around lines 270 to 276, the log file is opened
without a context manager, risking resource leaks. Refactor the code to open
"train.log" using a with statement to ensure it is properly closed. Replace all
direct fplog.write() calls with an add_to_log() function that accumulates log
messages, then write the accumulated log to the file at the end within the
context manager block.
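A minimal sketch of that accumulate-then-write pattern; the function name and log messages here are made up for illustration, only `add_to_log` and `train.log` come from the review:

```python
import os
import tempfile


def run_training_step(workdir: str) -> str:
    """Collect log lines in memory, then write them once inside a
    context manager so the file is closed even if writing raises."""
    log_content = []

    def add_to_log(content: str) -> None:
        log_content.append(content)

    add_to_log("start training\n")
    add_to_log("training finished\n")

    log_path = os.path.join(workdir, "train.log")
    with open(log_path, "w") as fplog:
        fplog.write("".join(log_content))
    return log_path


with tempfile.TemporaryDirectory() as d:
    with open(run_training_step(d)) as fh:
        print(fh.read(), end="")
```

The trade-off is that log lines are lost if the process dies mid-run; when incremental logging matters, wrapping the whole body in `with open(...) as fplog:` instead also removes the need for a `clean_before_quit()` helper.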
dpgen2/op/run_nvnmd_train.py
Outdated
def normalize_config(data={}):
    ta = RunNvNMDTrain.training_args()

    base = Argument("base", dict, ta)
    data = base.normalize_value(data, trim_pattern="_*")
    base.check_value(data, strict=True)

    return data
Fix mutable default argument
Using mutable default arguments can cause unexpected behavior. Use None and initialize within the function.
@staticmethod
- def normalize_config(data={}):
+ def normalize_config(data=None):
+ if data is None:
+ data = {}
ta = RunNvNMDTrain.training_args()
base = Argument("base", dict, ta)
data = base.normalize_value(data, trim_pattern="_*")
base.check_value(data, strict=True)
return data
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:
def normalize_config(data={}):
    ta = RunNvNMDTrain.training_args()
    base = Argument("base", dict, ta)
    data = base.normalize_value(data, trim_pattern="_*")
    base.check_value(data, strict=True)
    return data
Suggested:
@staticmethod
def normalize_config(data=None):
    if data is None:
        data = {}
    ta = RunNvNMDTrain.training_args()
    base = Argument("base", dict, ta)
    data = base.normalize_value(data, trim_pattern="_*")
    base.check_value(data, strict=True)
    return data
🧰 Tools
🪛 Ruff (0.11.9)
619-619: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
🤖 Prompt for AI Agents
In dpgen2/op/run_nvnmd_train.py around lines 619 to 626, the function
normalize_config uses a mutable default argument data={} which can lead to
unexpected behavior. Change the default value of data to None and inside the
function initialize data to an empty dictionary if it is None before proceeding
with the rest of the code.
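The pitfall behind this rule can be shown with a toy function (the names here are illustrative):

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at function definition time,
    # and is then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    if bucket is None:
        bucket = []  # a fresh list on every call
    bucket.append(item)
    return bucket


print(append_bad("a"), append_bad("b"))    # ['a', 'b'] ['a', 'b'] (shared state)
print(append_good("a"), append_good("b"))  # ['a'] ['b']
```

A mutable default dict in `normalize_config` is less visibly dangerous because `normalize_value` reassigns `data`, but the `None` guard removes the hazard entirely.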
)
config["train"]["numb_models"] = 1

elif train_style == "dp-nvnmd":
Why not reuse the logic for train_style == "dp"?
dpgen2/entrypoint/submit.py
Outdated
    init_models = get_artifact_from_uri(config["train"]["init_models_uri"])
elif train_style == "dp-dist" and config["train"]["student_model_uri"] is not None:
    init_models = get_artifact_from_uri(config["train"]["student_model_uri"])
elif train_style == "dp-nvnmd" and config["train"]["init_models_uri"] is not None:
Same as above.
import dpdata
import numpy as np
from ase.io import (
Is it necessary to introduce the ASE package? If so, add it to dependencies.
dpgen2/flow/dpgen_loop.py
Outdated
},
when="%s == false" % (scheduler_step.outputs.parameters["converged"]),
)
if (
Not yet understood.
|
Is it possible to reuse existing OPs and modify on them, rather than writing new OPs? e.g. PrepNvNMDTrain, RunNvNMDTrain, RunNvNMD, PrepRunNvNMDTrain, etc. |
reuse PrepRunDPTrain and PrepRunLmp superop
reuse PrepRunDPTrain and PrepRunLmp superop
for more information, see https://pre-commit.ci
Actionable comments posted: 7
♻️ Duplicate comments (7)
dpgen2/exploration/render/traj_render_lammps.py (1)
131-133: Fix the critical StringIO handling bug. This is the same issue identified in previous reviews that remains unfixed. Converting StringIO to a string with str(traj) passes a string representation ("<_io.StringIO...>") instead of the file content, causing ASE's read() to fail with "No such file or directory". Apply this fix as previously suggested:
- ss = read(
-     str(traj), format="lammps-dump-text", index=":", specorder=type_map
- )
+ ss = read(
+     traj,  # pass the file-object directly
+     format="lammps-dump-text",
+     index=":",
+     specorder=type_map,
+ )
tests/op/test_run_nvnmd_train.py (2)
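The StringIO failure mode flagged above can be reproduced without ASE; str() on a StringIO returns the object's repr, not its contents:

```python
from io import StringIO

buf = StringIO("ITEM: TIMESTEP\n0\n")

# A parser handed str(buf) would look for a file literally named
# "<_io.StringIO object at 0x...>" and fail with "No such file or directory".
print(str(buf).startswith("<_io.StringIO"))  # True

# Passing the object itself (or its .read() contents) yields the real data.
print(buf.read())
```

File-readers that accept both paths and file-like objects generally dispatch on the argument type, so passing the object through unmodified is the safe choice.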
1-44: Remove unused imports. Multiple imports are not used in this test file and should be removed to keep the code clean.
-import itertools
 import json
-import os
 import shutil
 import unittest
 from pathlib import (
     Path,
 )
-import numpy as np
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
     FatalError,
-    OPIOSign,
-    TransientError,
 )
 from fake_data_set import (
     fake_multi_sys,
     fake_system,
 )
 from mock import (
     call,
     patch,
 )
 # isort: off
-from .context import (
-    dpgen2,
-)
 from dpgen2.constants import (
     train_script_name,
     train_cnn_script_name,
     train_qnn_script_name,
-    train_task_pattern,
 )
 from dpgen2.op.run_dp_train import (
     RunDPTrain,
-    _get_data_size_of_all_mult_sys,
 )
 from dpgen2.op.run_nvnmd_train import (
     RunNvNMDTrain,
-    _make_train_command,
 )
642-655: Remove unused variables in exception handling. The variables ee and out are assigned but never used in this exception handling block.
 ptrain = RunNvNMDTrain()
-with self.assertRaises(FatalError) as ee:
-    out = ptrain.execute(
+with self.assertRaises(FatalError):
+    ptrain.execute(
         OPIO(
             {
                 "config": config,
                 "task_name": task_name,
                 "task_path": Path(task_path),
                 "init_model": Path(self.init_model),
                 "init_data": [Path(ii) for ii in self.init_data],
                 "iter_data": [Path(ii) for ii in self.iter_data],
             }
         )
     )
dpgen2/op/run_nvnmd.py (1)
145-149: Replace bare except with specific exception handling. Also applies to: 157-163
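For illustration, the usual replacement pattern looks like this (the function and messages are hypothetical, not dpgen2 code); a bare `except:` would also swallow KeyboardInterrupt and SystemExit:

```python
import json


def load_config(text: str) -> dict:
    """Parse a JSON config, catching only the failures we can handle."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError) as err:
        # Fall back to defaults, but keep a record of why.
        print(f"falling back to defaults: {err}")
        return {}


print(load_config('{"command": "lmp"}'))  # {'command': 'lmp'}
print(load_config("not json"))            # {}
```

Naming the exception types also documents which failures the code expects, which a bare except hides.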
dpgen2/op/run_nvnmd_train.py (2)
261-267: Use context manager for file operations.
187-187: Remove unused variable train_args. This variable is assigned from the config but never used in the method.
 config = ip["config"] if ip["config"] is not None else {}
 dp_command = ip["config"].get("command", "dp").split()
-train_args = config.get("train_args", "")
 config = RunDPTrain.normalize_config(config)
tests/mocked_ops.py (1)
549-564: Critical: Wrong inheritance and method call.
🧹 Nitpick comments (17)
dpgen2/exploration/render/traj_render_lammps.py (1)
112-114: Fix import formatting issues. The static analysis correctly identifies formatting problems with the ASE import.
Apply this formatting fix:
-    from ase.io import (  # type: ignore
-        read,
-    )
+    from ase.io import (  # type: ignore
+        read,
+    )
222-291: LGTM! New test class properly mirrors existing structure.The
TestMockedRunNvNMDclass correctly follows the established pattern fromTestMockedRunLmp, ensuring consistent testing of the new NVNMD operator.However, consider extracting common test logic into a base class or helper methods to reduce code duplication between
TestMockedRunLmpandTestMockedRunNvNMD.tests/op/test_run_nvnmd.py (2)
72-155: Test logic is sound but clean up unused variables.The test methods properly verify the RunNvNMD operator functionality, including:
- Success case with correct command calls and file operations
- Error handling with TransientError
- Proper setup and teardown
However, remove unused variables:
eein line 124 (assertRaises context)outin line 125 (unused assignment)modelsin line 136 (duplicate assignment)Apply this diff to clean up:
- with self.assertRaises(TransientError) as ee: - out = op.execute( + with self.assertRaises(TransientError): + op.execute( OPIO( { "config": {"command": "mylmp"}, "task_name": self.task_name, "task_path": self.task_path, "models": self.models, } ) ) # check call - models = ["models/path/model_%d.pb" % i for i in range(len(self.models))] calls = [
164-194: Test coverage for set_models function is comprehensive.The
TestSetModelsclass thoroughly tests theset_modelshelper function with:
- Success case for proper nvnmd pair_style
- Error cases for incorrect configurations
However, remove unused variables:
rein lines 185 and 193 (assertRaises context)Apply this diff:
- with self.assertRaises(RuntimeError) as re: + with self.assertRaises(RuntimeError): set_models(input_name, self.model_names) - with self.assertRaises(RuntimeError) as re: + with self.assertRaises(RuntimeError): set_models(input_name, self.model_names)tests/op/test_run_nvnmd_train.py (1)
84-88: Remove commented-out code.This commented code appears to be from an old implementation and should be removed.
- # self.init_model = Path("bar.pb") - # self.init_model_ckpt_meta = Path("model.ckpt.meta") - # self.init_model_ckpt_data = Path("model.ckpt.data") - # self.init_model_ckpt_index = Path("model.ckpt.index") self.init_model = Path("nvnmd_models")tests/test_prep_run_dp_train.py (2)
151-152: Remove debug print statement.This print statement appears to be debug code that should be removed from the test.
for ii in range(len(mlines)): - print(flines[ii + 1], mlines[ii]) tcase.assertEqual(flines[ii + 1], mlines[ii])
202-204: Fix whitespace before comma.Remove the extra whitespace before the comma for consistent formatting.
_check_model( - tcase, "nvnmd_models/frozen_model.pb", cwd, init_model / "frozen_model.pb" + tcase, "nvnmd_models/frozen_model.pb", cwd, init_model / "frozen_model.pb" )dpgen2/op/run_nvnmd.py (6)
1-61: Remove unused imports to clean up the code.The following imports are not used in this file and should be removed:
-import json import logging import os import random -import re -import shutil from pathlib import ( Path, ) from typing import ( List, Optional, - Set, - Tuple, Union, ) import numpy as np from dargs import ( - Argument, - ArgumentEncoder, - Variant, - dargs, ) from dflow.python import ( OP, OPIO, Artifact, BigParameter, - FatalError, - HDF5Datasets, OPIOSign, TransientError, ) from dpgen2.constants import ( - lmp_conf_name, lmp_input_name, lmp_log_name, lmp_model_devi_name, lmp_traj_name, - model_name_match_pattern, model_name_pattern, plm_output_name, - pytorch_model_name_pattern, )
247-247: Remove unnecessary f-string prefix. The string doesn't contain any placeholders, so the f-prefix is not needed.
- raise RuntimeError(f"last matching index should not be -1, terribly wrong ") + raise RuntimeError("last matching index should not be -1, terribly wrong ")
1-48: Remove unused imports to improve code cleanliness.Multiple imports are not used in this file and should be removed to reduce clutter and improve maintainability.
-import json import itertools -import json import logging import os import random -import re -import shutil from pathlib import ( Path, ) from typing import ( List, Optional, - Set, - Tuple, Union, ) import numpy as np from dargs import ( - Argument, - ArgumentEncoder, - Variant, - dargs, ) from dflow.python import ( OP, OPIO, Artifact, BigParameter, - FatalError, - HDF5Datasets, OPIOSign, TransientError, ) from dpgen2.constants import ( - lmp_conf_name, lmp_input_name, lmp_log_name, lmp_model_devi_name, lmp_traj_name, - model_name_match_pattern, model_name_pattern, plm_output_name, - pytorch_model_name_pattern, )
129-129: Remove unused variable assignment. The teacher_model variable is assigned but never used in the code, which creates unnecessary overhead and potential confusion.
 command = config["command"]
-teacher_model: Optional[BinaryFileInput] = config["teacher_model_path"]
 shuffle_models: Optional[bool] = config["shuffle_models"]
248-248: Remove unnecessary f-string prefix.The f-string doesn't contain any placeholders, so the
fprefix is unnecessary.- raise RuntimeError(f"last matching index should not be -1, terribly wrong ") + raise RuntimeError("last matching index should not be -1, terribly wrong")
302-304: Optimize error calculation with early exit. The current implementation calculates max, min, and avg even when all_errors is empty. This can be optimized by checking once and using the results.
-max_error = np.max(all_errors) if all_errors else 0.0
-min_error = np.min(all_errors) if all_errors else 0.0
-avg_error = np.mean(all_errors) if all_errors else 0.0
+if all_errors:
+    max_error = np.max(all_errors)
+    min_error = np.min(all_errors)
+    avg_error = np.mean(all_errors)
+else:
+    max_error = min_error = avg_error = 0.0
1-53: Remove unused imports to clean up the code.The following imports are not used in this file and should be removed:
-import copy -import glob import json import logging import os import shutil from pathlib import ( Path, ) from typing import ( - Dict, List, - Optional, - Tuple, - Union, ) -import dpdata from dargs import ( - Argument, - ArgumentEncoder, - Variant, - dargs, ) from dflow.python import ( OP, OPIO, Artifact, BigParameter, FatalError, NestedDict, OPIOSign, Parameter, - TransientError, ) from dpgen2.constants import ( train_cnn_script_name, train_qnn_script_name, train_script_name, - train_task_pattern, )tests/mocked_ops.py (3)
424-424: Remove unused variable task_id. The variable is assigned but never used in the method.
 assert re.match("task.[0-9][0-9][0-9][0-9]", ip["task_name"])
-task_id = int(ip["task_name"].split(".")[1])
 assert ip["task_name"] in str(ip["task_path"])
Also applies to: 429-429
579-579: Remove unused variable task_id. The variable is assigned but never used in the method.
 assert re.match("task.[0-9][0-9][0-9][0-9]", ip["task_name"])
-task_id = int(ip["task_name"].split(".")[1])
 assert ip["task_name"] in str(ip["task_path"])
Also applies to: 583-583
763-763: Remove unused variables task_id and models_str. These variables are assigned but never used in the method.
 assert re.match("task.[0-9][0-9][0-9][0-9][0-9][0-9]", ip["task_name"])
-task_id = int(ip["task_name"].split(".")[1])
 assert task_path.is_dir()
 task_path = task_path.resolve()
 models = [ii.resolve() for ii in models]
-models_str = [str(ii) for ii in models]
Also applies to: 767-767, 778-778, 782-782
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- dpgen2/constants.py (1 hunks)
- dpgen2/entrypoint/args.py (3 hunks)
- dpgen2/entrypoint/submit.py (6 hunks)
- dpgen2/exploration/render/traj_render_lammps.py (2 hunks)
- dpgen2/exploration/task/lmp/lmp_input.py (5 hunks)
- dpgen2/exploration/task/lmp_template_task_group.py (5 hunks)
- dpgen2/op/__init__.py (1 hunks)
- dpgen2/op/prep_dp_train.py (1 hunks)
- dpgen2/op/run_dp_train.py (3 hunks)
- dpgen2/op/run_nvnmd.py (1 hunks)
- dpgen2/op/run_nvnmd_train.py (1 hunks)
- dpgen2/superop/prep_run_dp_train.py (4 hunks)
- dpgen2/utils/download_dpgen2_artifacts.py (1 hunks)
- tests/mocked_ops.py (5 hunks)
- tests/op/test_prep_dp_train.py (2 hunks)
- tests/op/test_run_dp_train.py (5 hunks)
- tests/op/test_run_nvnmd.py (1 hunks)
- tests/op/test_run_nvnmd_train.py (1 hunks)
- tests/test_prep_run_dp_train.py (5 hunks)
- tests/test_prep_run_lmp.py (2 hunks)
- tests/utils/test_dl_dpgen2_arti.py (1 hunks)
- tests/utils/test_dl_dpgen2_arti_by_def.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- tests/utils/test_dl_dpgen2_arti.py
- tests/utils/test_dl_dpgen2_arti_by_def.py
🚧 Files skipped from review as they are similar to previous changes (3)
- dpgen2/constants.py
- dpgen2/exploration/task/lmp/lmp_input.py
- dpgen2/exploration/task/lmp_template_task_group.py
🧰 Additional context used
🧠 Learnings (2)
dpgen2/superop/prep_run_dp_train.py (2)
Learnt from: wanghan-iapcm
PR: deepmodeling/dpgen2#224
File: dpgen2/superop/prep_run_calypso.py:56-56
Timestamp: 2024-07-27T10:19:12.153Z
Learning: The type hint `OPTemplate` for `caly_evo_step_op` in the `PrepRunCaly` class constructor should be retained for consistency and type safety.
Learnt from: wanghan-iapcm
PR: deepmodeling/dpgen2#224
File: dpgen2/superop/prep_run_calypso.py:56-56
Timestamp: 2024-10-08T15:32:11.479Z
Learning: The type hint `OPTemplate` for `caly_evo_step_op` in the `PrepRunCaly` class constructor should be retained for consistency and type safety.
tests/mocked_ops.py (2)
Learnt from: wanghan-iapcm
PR: deepmodeling/dpgen2#224
File: dpgen2/superop/prep_run_calypso.py:56-56
Timestamp: 2024-07-27T10:19:12.153Z
Learning: The type hint `OPTemplate` for `caly_evo_step_op` in the `PrepRunCaly` class constructor should be retained for consistency and type safety.
Learnt from: wanghan-iapcm
PR: deepmodeling/dpgen2#224
File: dpgen2/superop/prep_run_calypso.py:56-56
Timestamp: 2024-10-08T15:32:11.479Z
Learning: The type hint `OPTemplate` for `caly_evo_step_op` in the `PrepRunCaly` class constructor should be retained for consistency and type safety.
🧬 Code Graph Analysis (2)
dpgen2/superop/prep_run_dp_train.py (2)
dpgen2/op/run_nvnmd_train.py (1)
RunNvNMDTrain (107-393)
dpgen2/op/run_dp_train.py (1)
RunDPTrain(106-660)
dpgen2/exploration/render/traj_render_lammps.py (1)
dpgen2/fp/vasp.py (2)
set_ele_temp (103-125)
set_ele_temp (176-188)
🪛 Ruff (0.11.9)
tests/op/test_run_nvnmd_train.py
1-1: itertools imported but unused
Remove unused import: itertools
(F401)
3-3: os imported but unused
Remove unused import: os
(F401)
10-10: numpy imported but unused
Remove unused import: numpy
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
16-16: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
17-17: dflow.python.TransientError imported but unused
Remove unused import
(F401)
30-30: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
36-36: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
40-40: dpgen2.op.run_dp_train._get_data_size_of_all_mult_sys imported but unused
Remove unused import: dpgen2.op.run_dp_train._get_data_size_of_all_mult_sys
(F401)
44-44: dpgen2.op.run_nvnmd_train._make_train_command imported but unused
Remove unused import: dpgen2.op.run_nvnmd_train._make_train_command
(F401)
653-653: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
654-654: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
tests/test_prep_run_dp_train.py
57-57: mocked_ops.MockedRunNvNMDTrainCheckOptParam imported but unused
Remove unused import: mocked_ops.MockedRunNvNMDTrainCheckOptParam
(F401)
dpgen2/op/run_nvnmd_train.py
1-1: copy imported but unused
Remove unused import: copy
(F401)
2-2: glob imported but unused
Remove unused import: glob
(F401)
11-11: typing.Dict imported but unused
Remove unused import
(F401)
13-13: typing.Optional imported but unused
Remove unused import
(F401)
14-14: typing.Tuple imported but unused
Remove unused import
(F401)
15-15: typing.Union imported but unused
Remove unused import
(F401)
18-18: dpdata imported but unused
Remove unused import: dpdata
(F401)
20-20: dargs.Argument imported but unused
Remove unused import
(F401)
21-21: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
22-22: dargs.Variant imported but unused
Remove unused import
(F401)
23-23: dargs.dargs imported but unused
Remove unused import
(F401)
34-34: dflow.python.TransientError imported but unused
Remove unused import: dflow.python.TransientError
(F401)
41-41: dpgen2.constants.train_task_pattern imported but unused
Remove unused import: dpgen2.constants.train_task_pattern
(F401)
187-187: Local variable train_args is assigned to but never used
Remove assignment to unused variable train_args
(F841)
208-211: Use ternary operator major_version = "1" if "systems" in train_dict["training"] else "2" instead of if-else-block
Replace if-else-block with major_version = "1" if "systems" in train_dict["training"] else "2"
(SIM108)
261-261: Use a context manager for opening files
(SIM115)
dpgen2/op/run_nvnmd.py
3-3: json imported but unused
Remove unused import: json
(F401)
7-7: re imported but unused
Remove unused import: re
(F401)
8-8: shutil imported but unused
Remove unused import: shutil
(F401)
16-16: typing.Set imported but unused
Remove unused import
(F401)
17-17: typing.Tuple imported but unused
Remove unused import
(F401)
22-22: dargs.Argument imported but unused
Remove unused import
(F401)
23-23: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
24-24: dargs.Variant imported but unused
Remove unused import
(F401)
25-25: dargs.dargs imported but unused
Remove unused import
(F401)
32-32: dflow.python.FatalError imported but unused
Remove unused import
(F401)
33-33: dflow.python.HDF5Datasets imported but unused
Remove unused import
(F401)
39-39: dpgen2.constants.lmp_conf_name imported but unused
Remove unused import
(F401)
44-44: dpgen2.constants.model_name_match_pattern imported but unused
Remove unused import
(F401)
47-47: dpgen2.constants.pytorch_model_name_pattern imported but unused
Remove unused import
(F401)
129-129: Local variable teacher_model is assigned to but never used
Remove assignment to unused variable teacher_model
(F841)
147-147: Do not use bare except
(E722)
159-159: Do not use bare except
(E722)
248-248: f-string without any placeholders
Remove extraneous f prefix
(F541)
tests/mocked_ops.py
424-424: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
579-579: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
763-763: Local variable task_id is assigned to but never used
Remove assignment to unused variable task_id
(F841)
778-778: Local variable models_str is assigned to but never used
Remove assignment to unused variable models_str
(F841)
dpgen2/entrypoint/args.py
35-35: dpgen2.op.run_nvnmd_train.RunNvNMDTrain imported but unused
Remove unused import: dpgen2.op.run_nvnmd_train.RunNvNMDTrain
(F401)
135-135: Local variable doc_config is assigned to but never used
Remove assignment to unused variable doc_config
(F841)
dpgen2/op/__init__.py
42-42: .run_nvnmd.RunNvNMD imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
45-45: .run_nvnmd_train.RunNvNMDTrain imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
tests/op/test_run_nvnmd.py
1-1: json imported but unused
Remove unused import: json
(F401)
9-9: dpdata imported but unused
Remove unused import: dpdata
(F401)
10-10: numpy imported but unused
Remove unused import: numpy
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
15-15: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
20-20: mock.mock imported but unused
Remove unused import: mock.mock
(F401)
26-26: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
37-37: dpgen2.op.run_lmp.get_ele_temp imported but unused
Remove unused import: dpgen2.op.run_lmp.get_ele_temp
(F401)
42-42: dpgen2.op.run_nvnmd.merge_pimd_files imported but unused
Remove unused import: dpgen2.op.run_nvnmd.merge_pimd_files
(F401)
45-45: dpgen2.utils.BinaryFileInput imported but unused
Remove unused import: dpgen2.utils.BinaryFileInput
(F401)
127-127: Local variable ee is assigned to but never used
Remove assignment to unused variable ee
(F841)
128-128: Local variable out is assigned to but never used
Remove assignment to unused variable out
(F841)
139-139: Local variable models is assigned to but never used
Remove assignment to unused variable models
(F841)
188-188: Local variable re is assigned to but never used
Remove assignment to unused variable re
(F841)
🪛 Flake8 (7.2.0)
tests/op/test_run_nvnmd_train.py
[error] 1-1: 'itertools' imported but unused
(F401)
[error] 3-3: 'os' imported but unused
(F401)
[error] 10-10: 'numpy as np' imported but unused
(F401)
[error] 11-11: 'dflow.python.OP' imported but unused
(F401)
[error] 11-11: 'dflow.python.Artifact' imported but unused
(F401)
[error] 11-11: 'dflow.python.OPIOSign' imported but unused
(F401)
[error] 11-11: 'dflow.python.TransientError' imported but unused
(F401)
[error] 29-29: '.context.dpgen2' imported but unused
(F401)
[error] 32-32: 'dpgen2.constants.train_task_pattern' imported but unused
(F401)
[error] 38-38: 'dpgen2.op.run_dp_train._get_data_size_of_all_mult_sys' imported but unused
(F401)
[error] 42-42: 'dpgen2.op.run_nvnmd_train._make_train_command' imported but unused
(F401)
[error] 84-84: block comment should start with '# '
(E265)
[error] 85-85: block comment should start with '# '
(E265)
[error] 86-86: block comment should start with '# '
(E265)
[error] 87-87: block comment should start with '# '
(E265)
[error] 324-324: too many blank lines (2)
(E303)
[error] 653-653: local variable 'ee' is assigned to but never used
(F841)
[error] 654-654: local variable 'out' is assigned to but never used
(F841)
tests/test_prep_run_dp_train.py
[error] 203-203: whitespace before ','
(E203)
dpgen2/op/run_nvnmd_train.py
[error] 1-1: 'copy' imported but unused
(F401)
[error] 2-2: 'glob' imported but unused
(F401)
[error] 10-10: 'typing.Dict' imported but unused
(F401)
[error] 10-10: 'typing.Optional' imported but unused
(F401)
[error] 10-10: 'typing.Tuple' imported but unused
(F401)
[error] 10-10: 'typing.Union' imported but unused
(F401)
[error] 18-18: 'dpdata' imported but unused
(F401)
[error] 19-19: 'dargs.Argument' imported but unused
(F401)
[error] 19-19: 'dargs.ArgumentEncoder' imported but unused
(F401)
[error] 19-19: 'dargs.Variant' imported but unused
(F401)
[error] 19-19: 'dargs.dargs' imported but unused
(F401)
[error] 25-25: 'dflow.python.TransientError' imported but unused
(F401)
[error] 37-37: 'dpgen2.constants.train_task_pattern' imported but unused
(F401)
[error] 187-187: local variable 'train_args' is assigned to but never used
(F841)
[error] 329-329: too many blank lines (2)
(E303)
dpgen2/op/run_nvnmd.py
[error] 3-3: 'json' imported but unused
(F401)
[error] 7-7: 're' imported but unused
(F401)
[error] 8-8: 'shutil' imported but unused
(F401)
[error] 12-12: 'typing.Set' imported but unused
(F401)
[error] 12-12: 'typing.Tuple' imported but unused
(F401)
[error] 21-21: 'dargs.Argument' imported but unused
(F401)
[error] 21-21: 'dargs.ArgumentEncoder' imported but unused
(F401)
[error] 21-21: 'dargs.Variant' imported but unused
(F401)
[error] 21-21: 'dargs.dargs' imported but unused
(F401)
[error] 27-27: 'dflow.python.FatalError' imported but unused
(F401)
[error] 27-27: 'dflow.python.HDF5Datasets' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.lmp_conf_name' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.model_name_match_pattern' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.pytorch_model_name_pattern' imported but unused
(F401)
[error] 129-129: local variable 'teacher_model' is assigned to but never used
(F841)
[error] 144-144: block comment should start with '# '
(E265)
[error] 147-147: do not use bare 'except'
(E722)
[error] 156-156: block comment should start with '# '
(E265)
[error] 159-159: do not use bare 'except'
(E722)
[error] 211-211: missing whitespace around modulo operator
(E228)
[error] 216-216: too many blank lines (2)
(E303)
[error] 217-217: missing whitespace around modulo operator
(E228)
[error] 248-248: f-string is missing placeholders
(F541)
[error] 255-255: missing whitespace around modulo operator
(E228)
[error] 255-255: missing whitespace after ','
(E231)
[error] 279-279: at least two spaces before inline comment
(E261)
[error] 324-324: expected 2 blank lines, found 1
(E302)
tests/mocked_ops.py
[error] 99-99: module level import not at top of file
(E402)
[error] 102-102: module level import not at top of file
(E402)
[error] 424-424: local variable 'task_id' is assigned to but never used
(F841)
[error] 503-503: too many blank lines (2)
(E303)
[error] 579-579: local variable 'task_id' is assigned to but never used
(F841)
[error] 763-763: local variable 'task_id' is assigned to but never used
(F841)
[error] 778-778: local variable 'models_str' is assigned to but never used
(F841)
dpgen2/entrypoint/args.py
[error] 34-34: 'dpgen2.op.run_nvnmd_train.RunNvNMDTrain' imported but unused
(F401)
[error] 135-135: local variable 'doc_config' is assigned to but never used
(F841)
dpgen2/exploration/render/traj_render_lammps.py
[error] 113-113: missing whitespace after keyword
(E275)
[error] 113-113: at least two spaces before inline comment
(E261)
dpgen2/op/__init__.py
[error] 41-41: '.run_nvnmd.RunNvNMD' imported but unused
(F401)
[error] 44-44: '.run_nvnmd_train.RunNvNMDTrain' imported but unused
(F401)
tests/op/test_run_nvnmd.py
[error] 1-1: 'json' imported but unused
(F401)
[error] 9-9: 'dpdata' imported but unused
(F401)
[error] 10-10: 'numpy as np' imported but unused
(F401)
[error] 11-11: 'dflow.python.OP' imported but unused
(F401)
[error] 11-11: 'dflow.python.Artifact' imported but unused
(F401)
[error] 11-11: 'dflow.python.OPIOSign' imported but unused
(F401)
[error] 18-18: 'mock.mock' imported but unused
(F401)
[error] 25-25: '.context.dpgen2' imported but unused
(F401)
[error] 36-36: 'dpgen2.op.run_lmp.get_ele_temp' imported but unused
(F401)
[error] 40-40: 'dpgen2.op.run_nvnmd.merge_pimd_files' imported but unused
(F401)
[error] 44-44: 'dpgen2.utils.BinaryFileInput' imported but unused
(F401)
[error] 90-90: missing whitespace around modulo operator
(E228)
[error] 91-91: missing whitespace around modulo operator
(E228)
[error] 112-112: continuation line missing indentation or outdented
(E122)
[error] 127-127: local variable 'ee' is assigned to but never used
(F841)
[error] 128-128: local variable 'out' is assigned to but never used
(F841)
[error] 139-139: local variable 'models' is assigned to but never used
(F841)
[error] 156-156: continuation line missing indentation or outdented
(E122)
[error] 188-188: local variable 're' is assigned to but never used
(F841)
🪛 Pylint (3.3.7)
tests/op/test_run_nvnmd_train.py
[refactor] 50-50: Too many instance attributes (26/7)
(R0902)
[refactor] 678-678: Too many instance attributes (15/7)
(R0902)
tests/test_prep_run_dp_train.py
[refactor] 181-181: Too many arguments (7/5)
(R0913)
[refactor] 181-181: Too many positional arguments (7/5)
(R0917)
[refactor] 380-380: Too many instance attributes (11/7)
(R0902)
dpgen2/op/run_nvnmd_train.py
[refactor] 152-152: Too many local variables (47/15)
(R0914)
[refactor] 261-261: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 152-152: Too many branches (15/12)
(R0912)
[refactor] 152-152: Too many statements (98/50)
(R0915)
dpgen2/op/run_nvnmd.py
[refactor] 97-97: Too many local variables (26/15)
(R0914)
[refactor] 274-274: Too many local variables (20/15)
(R0914)
tests/mocked_ops.py
[refactor] 409-409: Too many local variables (34/15)
(R0914)
[refactor] 409-409: Too many statements (102/50)
(R0915)
[refactor] 564-564: Too many local variables (29/15)
(R0914)
[refactor] 564-564: Too many statements (78/50)
(R0915)
[refactor] 676-676: Too many local variables (17/15)
(R0914)
dpgen2/entrypoint/submit.py
[refactor] 311-316: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
🔇 Additional comments (17)
dpgen2/op/__init__.py (1)
41-46: LGTM! New imports support NvNMD functionality. The addition of the `RunNvNMD` and `RunNvNMDTrain` imports correctly exposes these new operator classes for the NvNMD training and exploration workflows. The static analysis warnings about unused imports are false positives since `__init__.py` files are used for namespace exposure.
tests/op/test_run_dp_train.py (1)
134-134: LGTM! Test updates align with checkpoint saving functionality. The addition of `"save_ckpt": "model.ckpt"` to all expected test configurations correctly reflects the new checkpoint saving behavior in DP training. The updates are consistent across all test variants (v1, v2, initial model, and null iter data cases).
Also applies to: 161-161, 201-201, 226-226, 817-817
dpgen2/utils/download_dpgen2_artifacts.py (1)
362-364: LGTM! Defensive programming improvement for artifact handling. Using the `.get()` method instead of direct dictionary access prevents `KeyError` exceptions when artifacts are missing, allowing graceful handling of missing artifacts. This improvement supports the more complex artifact management required by the new NvNMD workflows.
dpgen2/op/prep_dp_train.py (1)
122-136: LGTM! Proper support for both DP and NvNMD training variants. The updated `_script_rand_seed` method correctly handles random seed initialization for both traditional DP models (`"model"` key) and the new NvNMD variant (`"nvnmd"` key). The logic maintains backward compatibility while adding support for the new NvNMD training workflow.
tests/op/test_prep_dp_train.py (3)
59-77: LGTM! New NVNMD template scripts follow the established pattern. The new template scripts for NVNMD versions 0 and 1 are well-structured and consistent with existing templates.
184-224: LGTM! Test methods properly verify NVNMD template functionality. The test methods correctly verify that:
- NVNMD version numbers are preserved (version 0 and 1)
- Seed values are properly assigned (2 * ii + 0 for nvnmd, 2 * ii + 1 for training)
- Output files and directories are created as expected
The tests follow the established pattern used for other template types.
233-234: LGTM! Updated error test includes new templates. The test correctly includes the new NVNMD templates in the list to verify that runtime errors are raised when the template list length is incorrect.
dpgen2/op/run_dp_train.py (3)
1-1: LGTM! Added necessary import for deep copying. The `copy` import is needed for the `deepcopy` operation used later in the file.
418-422: LGTM! Enhanced method signature and improved copying strategy. Good changes:
- Added a `do_quantized` parameter to support quantized training workflows
- Switched from shallow copy to `deepcopy` for safer dictionary manipulation
- Added `save_ckpt` configuration for checkpoint saving

The `deepcopy` is particularly important since the method modifies nested dictionary structures.
444-449: LGTM! Quantized training logic correctly disables training steps. The logic appropriately sets training steps to 0 for quantized models based on the DeePMD-kit version:
- Version 1: uses `stop_batch`
- Version 2: uses `numb_steps`

This aligns with the expected behavior for quantized neural network training, where the model structure is defined but no actual training occurs.
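As an illustration of the two points above — the helper name and exact config keys here are hypothetical, not the actual `RunDPTrain` code — zeroing the step count on a deep copy leaves the caller's dict untouched:

```python
import copy

def zero_steps_for_quantized(train_dict, version):
    """Return a deep copy of the config with the step count set to 0.

    Assumes version "1" keeps the count under training.stop_batch and
    version "2" under training.numb_steps, as described in the review.
    """
    out = copy.deepcopy(train_dict)  # a shallow copy would share the nested dict
    key = "stop_batch" if version == "1" else "numb_steps"
    out["training"][key] = 0
    return out

cfg = {"training": {"numb_steps": 5000, "seed": 1}}
quantized = zero_steps_for_quantized(cfg, "2")
print(quantized["training"]["numb_steps"])  # 0
print(cfg["training"]["numb_steps"])  # 5000 — the original is untouched
```

With a shallow `dict.copy()`, mutating `out["training"]` would also mutate `cfg["training"]`, which is exactly the hazard the `deepcopy` change avoids.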
dpgen2/superop/prep_run_dp_train.py (3)
14-14: LGTM! Added necessary imports for enhanced type support. The `Union` import and `RunNvNMDTrain` import are correctly added to support the expanded type annotations.
Also applies to: 51-51
64-64: LGTM! Type annotation properly supports both training operators. The type hint `Type[Union[RunDPTrain, RunNvNMDTrain]]` correctly allows the constructor to accept either training operator while maintaining type safety.
155-155: LGTM! Generalized type hint for flexibility. Broadening the type from `Type[RunDPTrain]` to `Type[OP]` provides appropriate flexibility for the internal function while maintaining the more specific Union type at the public interface level.
dpgen2/entrypoint/submit.py (1)
311-316: LGTM! The function correctly handles both "lmp" and "lmp-nvnmd" exploration styles by delegating to the appropriate scheduler function.
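The dispatch described here boils down to a pattern like the following; the function name and return values are placeholders, not the real `submit.py` API:

```python
def get_exploration_scheduler(explore_style: str) -> str:
    """Route both LAMMPS-based styles to the same scheduler builder."""
    if explore_style in ("lmp", "lmp-nvnmd"):
        return "lmp-scheduler"
    raise RuntimeError(f"unknown explore style {explore_style}")

print(get_exploration_scheduler("lmp-nvnmd"))  # lmp-scheduler
```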
tests/mocked_ops.py (1)
378-410: Verify if PrepNvNMDTrain class exists for proper inheritance. The mock currently inherits from PrepDPTrain but appears to be mocking NvNMD preparation operations. If PrepNvNMDTrain exists, the inheritance should be updated.

```shell
#!/bin/bash
# Check if PrepNvNMDTrain class exists in the codebase
ast-grep --pattern 'class PrepNvNMDTrain'
# Also check for any references to PrepNvNMDTrain
rg "PrepNvNMDTrain" -A 2
```

dpgen2/op/run_nvnmd.py (2)
324-342: Approve model deviation output function. The `write_model_devi_out` function correctly formats and writes model deviation data with proper headers and formatting. The implementation handles binary file writing appropriately for numpy's savetxt function.
273-322: Verify ASE version compatibility and consider error handling. The function imports ASE and uses specific format parsing. The comment mentions ASE version requirements, but there's no version checking or graceful handling if ASE is not available or incompatible.
Consider adding version checking and proper error handling:

```diff
 def calc_model_devi(
     traj_files,
     fname="model_devi.out",
 ):
+    try:
         from ase.io import read  # type: ignore
+    except ImportError:
+        raise ImportError(
+            "ASE is required for model deviation calculation. "
+            "Please install with: pip install ase"
+        )
```

Additionally, verify that the ASE version requirement is still accurate and document it properly.

```shell
#!/bin/bash
# Check if ASE is properly imported and what version is typically used
python -c "
try:
    import ase
    print(f'ASE version: {ase.__version__}')
    from ase.io import read
    print('ASE import successful')
except ImportError as e:
    print(f'ASE import failed: {e}')
"
```
```python
import json
import os
import shutil
import unittest
from pathlib import (
    Path,
)

import dpdata
import numpy as np
from dflow.python import (
    OP,
    OPIO,
    Artifact,
    OPIOSign,
    TransientError,
)
from mock import (
    call,
    mock,
    patch,
)

# isort: off
from .context import (
    dpgen2,
)
from dpgen2.constants import (
    lmp_conf_name,
    lmp_input_name,
    lmp_log_name,
    lmp_model_devi_name,
    lmp_traj_name,
    model_name_pattern,
)
from dpgen2.op.run_lmp import get_ele_temp, set_models
from dpgen2.op.run_nvnmd import (
    RunNvNMD,
    merge_pimd_files,
)
from dpgen2.utils import (
    BinaryFileInput,
)
```
Clean up unused imports and address formatting issues.
The test file has comprehensive test coverage, but several issues need to be addressed:
- Unused imports (should be removed):
  - `json`, `dpdata`, `numpy`
  - `OP`, `Artifact`, `OPIOSign` from `dflow.python`
  - `mock` from `mock`
  - `dpgen2` from `.context`
  - `get_ele_temp`, `merge_pimd_files`
  - `BinaryFileInput`
- Formatting issues:
  - Missing whitespace around modulo operators (lines 90, 91)
  - Continuation line indentation issues
Apply this diff to remove unused imports:
```diff
-import json
 import os
 import shutil
 import unittest
 from pathlib import (
     Path,
 )

-import dpdata
-import numpy as np
 from dflow.python import (
-    OP,
     OPIO,
-    Artifact,
-    OPIOSign,
     TransientError,
 )
 from mock import (
     call,
-    mock,
     patch,
 )

 # isort: off
-from .context import (
-    dpgen2,
-)
 from dpgen2.constants import (
     lmp_conf_name,
     lmp_input_name,
     lmp_log_name,
     lmp_model_devi_name,
     lmp_traj_name,
     model_name_pattern,
 )
-from dpgen2.op.run_lmp import get_ele_temp, set_models
+from dpgen2.op.run_lmp import set_models
 from dpgen2.op.run_nvnmd import (
     RunNvNMD,
-    merge_pimd_files,
 )
-from dpgen2.utils import (
-    BinaryFileInput,
-)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
import os
import shutil
import unittest
from pathlib import (
    Path,
)

from dflow.python import (
    OPIO,
    TransientError,
)
from mock import (
    call,
    patch,
)

# isort: off
from dpgen2.constants import (
    lmp_conf_name,
    lmp_input_name,
    lmp_log_name,
    lmp_model_devi_name,
    lmp_traj_name,
    model_name_pattern,
)
from dpgen2.op.run_lmp import set_models
from dpgen2.op.run_nvnmd import (
    RunNvNMD,
)
```
🧰 Tools
🪛 Ruff (0.11.9)
1-1: json imported but unused
Remove unused import: json
(F401)
9-9: dpdata imported but unused
Remove unused import: dpdata
(F401)
10-10: numpy imported but unused
Remove unused import: numpy
(F401)
12-12: dflow.python.OP imported but unused
Remove unused import
(F401)
14-14: dflow.python.Artifact imported but unused
Remove unused import
(F401)
15-15: dflow.python.OPIOSign imported but unused
Remove unused import
(F401)
20-20: mock.mock imported but unused
Remove unused import: mock.mock
(F401)
26-26: .context.dpgen2 imported but unused
Remove unused import: .context.dpgen2
(F401)
37-37: dpgen2.op.run_lmp.get_ele_temp imported but unused
Remove unused import: dpgen2.op.run_lmp.get_ele_temp
(F401)
42-42: dpgen2.op.run_nvnmd.merge_pimd_files imported but unused
Remove unused import: dpgen2.op.run_nvnmd.merge_pimd_files
(F401)
🪛 Flake8 (7.2.0)
[error] 1-1: 'json' imported but unused
(F401)
[error] 9-9: 'dpdata' imported but unused
(F401)
[error] 10-10: 'numpy as np' imported but unused
(F401)
[error] 11-11: 'dflow.python.OP' imported but unused
(F401)
[error] 11-11: 'dflow.python.Artifact' imported but unused
(F401)
[error] 11-11: 'dflow.python.OPIOSign' imported but unused
(F401)
[error] 18-18: 'mock.mock' imported but unused
(F401)
[error] 25-25: '.context.dpgen2' imported but unused
(F401)
[error] 36-36: 'dpgen2.op.run_lmp.get_ele_temp' imported but unused
(F401)
[error] 40-40: 'dpgen2.op.run_nvnmd.merge_pimd_files' imported but unused
(F401)
[error] 44-44: 'dpgen2.utils.BinaryFileInput' imported but unused
(F401)
🤖 Prompt for AI Agents
In tests/op/test_run_nvnmd.py from lines 1 to 44, remove the unused imports
json, dpdata, numpy, OP, Artifact, OPIOSign, mock, dpgen2, get_ele_temp,
merge_pimd_files, and BinaryFileInput. Also, fix formatting issues by adding
whitespace around modulo operators on lines 90 and 91 and correct continuation
line indentation to follow PEP8 standards.
```python
output_nopbc = config["explore"]["output_nopbc"]
conf_filters = get_conf_filters(config["explore"]["filters"])
use_ele_temp = config["inputs"]["use_ele_temp"]
config["explore"]["type"]
```
Remove unused variable extraction.
The explore type is extracted from the config but never used.
```diff
-    config["explore"]["type"]
```

📝 Committable suggestion
🤖 Prompt for AI Agents
In dpgen2/entrypoint/submit.py at line 399, the variable extracting the explore
type from the config is assigned but never used. Remove this line entirely to
clean up the code and avoid unused variable warnings.
Actionable comments posted: 0
♻️ Duplicate comments (3)
dpgen2/op/run_nvnmd.py (3)
146-148: Replace bare except with specific exception handling. Bare `except` statements can hide unexpected errors. Catch specific exceptions instead.

```diff
 try:
     Path(iname).symlink_to(ii)
-except:
+except (FileExistsError, OSError):
     logging.warning("failed to link %s, maybe already linked" % iname)
     pass
```

```diff
 try:
     Path(mname).symlink_to(mm)
-except:
+except (FileExistsError, OSError):
     logging.warning(
         "failed to link %s, maybe already linked" % mname
     )
     pass
```

Also applies to: 158-160
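The pattern can be exercised standalone; the file names below are made up for the demo:

```python
import logging
import tempfile
from pathlib import Path

def link_once(link: Path, target: Path) -> None:
    """Symlink `link` -> `target`, tolerating an already-existing link."""
    try:
        link.symlink_to(target)
    except (FileExistsError, OSError):
        # Unlike a bare `except:`, this still lets KeyboardInterrupt
        # and SystemExit propagate.
        logging.warning("failed to link %s, maybe already linked", link)

tmp = Path(tempfile.mkdtemp())
target = tmp / "model.pb"
target.write_text("weights")
link = tmp / "model.000.pb"
link_once(link, target)
link_once(link, target)  # second call warns instead of raising
print(link.read_text())  # weights
```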
285-287: Improve assertion error message for better debugging. The assertion error message has been improved but could be more specific about which trajectory file has the mismatch.
The assertion error message improvement looks good - it now provides clear debugging information about expected vs actual frame counts.
97-226: Refactor execute method to reduce complexity. The execute method has high complexity with many local variables and branches. Consider extracting logical sections into separate methods for better maintainability.
Consider breaking down the execute method into smaller, focused methods:
- `_prepare_work_directory()` - for setting up the working directory and linking files
- `_prepare_models()` - for handling model files and model name generation
- `_run_lammps_commands()` - for executing LAMMPS commands
- `_process_results()` - for merging files and preparing output

Would you like me to generate the refactored code structure?
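A possible skeleton for such a split — the class and method bodies are illustrative only, and the LAMMPS invocation is stubbed out:

```python
import tempfile
from pathlib import Path
from typing import Dict, List

class RunNvNMDSketch:
    """Sketch of the proposed decomposition of a monolithic execute()."""

    def execute(self, work_dir: Path, models: List[str], commands: List[str]) -> Dict:
        self._prepare_work_directory(work_dir)
        names = self._prepare_models(models)
        logs = self._run_lammps_commands(commands)
        return self._process_results(names, logs)

    def _prepare_work_directory(self, work_dir: Path) -> None:
        # real op: link conf/input files into the task directory
        work_dir.mkdir(parents=True, exist_ok=True)

    def _prepare_models(self, models: List[str]) -> List[str]:
        # real op: link each model file under a fixed name pattern
        return ["model.%03d.pb" % i for i in range(len(models))]

    def _run_lammps_commands(self, commands: List[str]) -> List[str]:
        # real op: shell out to LAMMPS and collect the log files
        return ["ran: " + c for c in commands]

    def _process_results(self, names: List[str], logs: List[str]) -> Dict:
        # real op: merge trajectories and compute the model deviation
        return {"models": names, "logs": logs}

work = Path(tempfile.mkdtemp()) / "task.000000"
out = RunNvNMDSketch().execute(work, ["a.pb", "b.pb"], ["lmp -i in.lammps"])
print(out["models"])  # ['model.000.pb', 'model.001.pb']
```

Each helper can then be unit-tested in isolation, which is the main payoff of the refactor.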
🧹 Nitpick comments (5)
dpgen2/op/run_nvnmd.py (3)
1-60: Clean up unused imports to improve code maintainability. Multiple imports are unused and should be removed to keep the code clean and reduce unnecessary dependencies.

```diff
-import json
-import re
-import shutil
 from typing import (
     List,
     Optional,
-    Set,
-    Tuple,
     Union,
 )
-from dargs import (
-    Argument,
-    ArgumentEncoder,
-    Variant,
-    dargs,
-)
 from dflow.python import (
     OP,
     OPIO,
     Artifact,
     BigParameter,
-    FatalError,
-    HDF5Datasets,
     OPIOSign,
     TransientError,
 )
 from dpgen2.constants import (
-    lmp_conf_name,
     lmp_input_name,
     lmp_log_name,
     lmp_model_devi_name,
     lmp_traj_name,
-    model_name_match_pattern,
     model_name_pattern,
     plm_output_name,
-    pytorch_model_name_pattern,
 )
-from dpgen2.utils import (
-    BinaryFileInput,
-    set_directory,
-)
+from dpgen2.utils import set_directory
```
-import json -import re -import shutil from typing import ( List, Optional, - Set, - Tuple, Union, ) -from dargs import ( - Argument, - ArgumentEncoder, - Variant, - dargs, -) from dflow.python import ( OP, OPIO, Artifact, BigParameter, - FatalError, - HDF5Datasets, OPIOSign, TransientError, ) from dpgen2.constants import ( - lmp_conf_name, lmp_input_name, lmp_log_name, lmp_model_devi_name, lmp_traj_name, - model_name_match_pattern, model_name_pattern, plm_output_name, - pytorch_model_name_pattern, ) -from dpgen2.utils import ( - BinaryFileInput, - set_directory, -) +from dpgen2.utils import set_directory
246-246: Remove unnecessary f-string prefix. The f-string doesn't contain any placeholders, so the `f` prefix is unnecessary.

```diff
-    raise RuntimeError(f"last matching index should not be -1, terribly wrong ")
+    raise RuntimeError("last matching index should not be -1, terribly wrong")
```
272-322: Consider optimizing model deviation calculation for large datasets. The current implementation loads all trajectories into memory and performs nested loops for force comparisons, which could be memory-intensive for large datasets.
For large trajectory files, consider:
- Processing frames in batches to reduce memory usage
- Using numpy vectorized operations for force difference calculations
- Adding progress logging for long-running calculations
```diff
 def calc_model_devi(
     traj_files,
     fname="model_devi.out",
+    batch_size=1000,  # Process frames in batches
 ):
```

tests/test_prep_run_dp_train.py (2)
150-150: Remove debug print statement. The print statement appears to be leftover debug code and should be removed.

```diff
 with open(Path(path) / script) as fp:
     mlines = fp.read().strip().split("\n")
 tcase.assertEqual(flines[0], "read from train_script: ")
 for ii in range(len(mlines)):
-    print(flines[ii + 1], mlines[ii])
     tcase.assertEqual(flines[ii + 1], mlines[ii])
```
with open(Path(path) / script) as fp: mlines = fp.read().strip().split("\n") tcase.assertEqual(flines[0], "read from train_script: ") for ii in range(len(mlines)): - print(flines[ii + 1], mlines[ii]) tcase.assertEqual(flines[ii + 1], mlines[ii])
180-219: Consider reducing function complexity. The `check_run_train_nvnmd_output` function has many parameters and performs multiple file checks. Consider extracting some of the validation logic. The function works correctly but could be more maintainable:

```diff
+def _check_nvnmd_model_files(tcase, cwd, init_model):
+    """Helper to check all nvnmd model files."""
+    model_checks = [
+        ("nvnmd_models/frozen_model.pb", init_model / "frozen_model.pb"),
+        ("nvnmd_models/model.pb", init_model / "frozen_model.pb"),
+        ("nvnmd_models/model.ckpt.meta", init_model / "model.ckpt.meta"),
+        ("nvnmd_models/model.ckpt.data-00000-of-00001", init_model / "model.ckpt.data"),
+        ("nvnmd_models/model.ckpt.index", init_model / "model.ckpt.index"),
+    ]
+    for output_file, expected_file in model_checks:
+        _check_model(tcase, output_file, cwd, expected_file)
+
 def check_run_train_nvnmd_output(
     tcase,
     work_dir,
     script,
     init_model,
     init_data,
     iter_data,
     only_check_name=False,
 ):
     cwd = os.getcwd()
     os.chdir(work_dir)
     _check_log(
         tcase,
         "log",
         cwd,
         script,
         init_model,
         init_data,
         iter_data,
         only_check_name=only_check_name,
     )
-    _check_model(
-        tcase, "nvnmd_models/frozen_model.pb", cwd, init_model / "frozen_model.pb"
-    )
-    _check_model(tcase, "nvnmd_models/model.pb", cwd, init_model / "frozen_model.pb")
-    _check_model(
-        tcase, "nvnmd_models/model.ckpt.meta", cwd, init_model / "model.ckpt.meta"
-    )
-    _check_model(
-        tcase,
-        "nvnmd_models/model.ckpt.data-00000-of-00001",
-        cwd,
-        init_model / "model.ckpt.data",
-    )
-    _check_model(
-        tcase, "nvnmd_models/model.ckpt.index", cwd, init_model / "model.ckpt.index"
-    )
+    _check_nvnmd_model_files(tcase, cwd, init_model)
     _check_lcurve(tcase, "nvnmd_cnn/lcurve.out", cwd, script)
     os.chdir(cwd)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- dpgen2/entrypoint/args.py (2 hunks)
- dpgen2/op/run_nvnmd.py (1 hunks)
- tests/test_prep_run_dp_train.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- dpgen2/entrypoint/args.py
🧰 Additional context used
🪛 Ruff (0.11.9)
dpgen2/op/run_nvnmd.py
3-3: json imported but unused
Remove unused import: json
(F401)
7-7: re imported but unused
Remove unused import: re
(F401)
8-8: shutil imported but unused
Remove unused import: shutil
(F401)
15-15: typing.Set imported but unused
Remove unused import
(F401)
16-16: typing.Tuple imported but unused
Remove unused import
(F401)
22-22: dargs.Argument imported but unused
Remove unused import
(F401)
23-23: dargs.ArgumentEncoder imported but unused
Remove unused import
(F401)
24-24: dargs.Variant imported but unused
Remove unused import
(F401)
25-25: dargs.dargs imported but unused
Remove unused import
(F401)
32-32: dflow.python.FatalError imported but unused
Remove unused import
(F401)
33-33: dflow.python.HDF5Datasets imported but unused
Remove unused import
(F401)
39-39: dpgen2.constants.lmp_conf_name imported but unused
Remove unused import
(F401)
44-44: dpgen2.constants.model_name_match_pattern imported but unused
Remove unused import
(F401)
47-47: dpgen2.constants.pytorch_model_name_pattern imported but unused
Remove unused import
(F401)
54-54: dpgen2.utils.BinaryFileInput imported but unused
Remove unused import: dpgen2.utils.BinaryFileInput
(F401)
146-146: Do not use bare except
(E722)
158-158: Do not use bare except
(E722)
246-246: f-string without any placeholders
Remove extraneous f prefix
(F541)
🪛 Flake8 (7.2.0)
dpgen2/op/run_nvnmd.py
[error] 3-3: 'json' imported but unused
(F401)
[error] 7-7: 're' imported but unused
(F401)
[error] 8-8: 'shutil' imported but unused
(F401)
[error] 12-12: 'typing.Set' imported but unused
(F401)
[error] 12-12: 'typing.Tuple' imported but unused
(F401)
[error] 21-21: 'dargs.Argument' imported but unused
(F401)
[error] 21-21: 'dargs.ArgumentEncoder' imported but unused
(F401)
[error] 21-21: 'dargs.Variant' imported but unused
(F401)
[error] 21-21: 'dargs.dargs' imported but unused
(F401)
[error] 27-27: 'dflow.python.FatalError' imported but unused
(F401)
[error] 27-27: 'dflow.python.HDF5Datasets' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.lmp_conf_name' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.model_name_match_pattern' imported but unused
(F401)
[error] 38-38: 'dpgen2.constants.pytorch_model_name_pattern' imported but unused
(F401)
[error] 53-53: 'dpgen2.utils.BinaryFileInput' imported but unused
(F401)
[error] 146-146: do not use bare 'except'
(E722)
[error] 158-158: do not use bare 'except'
(E722)
[error] 246-246: f-string is missing placeholders
(F541)
🪛 Pylint (3.3.7)
dpgen2/op/run_nvnmd.py
[refactor] 97-97: Too many local variables (25/15)
(R0914)
[refactor] 272-272: Too many local variables (20/15)
(R0914)
tests/test_prep_run_dp_train.py
[refactor] 180-180: Too many arguments (7/5)
(R0913)
[refactor] 180-180: Too many positional arguments (7/5)
(R0917)
[refactor] 318-318: Too many instance attributes (8/7)
(R0902)
[refactor] 589-589: Too many instance attributes (11/7)
(R0902)
🔇 Additional comments (3)
dpgen2/op/run_nvnmd.py (1)
231-231:

```shell
#!/bin/bash
# Show lmp_args and normalize_config definitions in run_lmp.py
sed -n '1,200p' dpgen2/op/run_lmp.py
```

tests/test_prep_run_dp_train.py (2)
318-405: LGTM! Well-structured test class for NvNMD training. The `TestMockedRunNvNMDTrain` class provides comprehensive testing for the NvNMD training functionality with proper setup, teardown, and assertion checking. The test structure follows the existing pattern and provides good coverage of the NvNMD training workflow.
588-709: LGTM! Comprehensive integration tests for NvNMD training. The `TestTrainNvNMD` class provides thorough integration testing for both regular NvNMD training and training without initial models. The test class properly:
- Sets up artifacts and workflow steps
- Tests both training scenarios (with and without initial models)
- Validates outputs using the appropriate helper function
- Follows the established testing patterns
Actionable comments posted: 0
♻️ Duplicate comments (1)
dpgen2/op/run_nvnmd.py (1)
129-131: Fix bare except clauses for better error handling. The bare `except:` clauses can mask unexpected errors and system exits. Replace with specific exception handling.
Apply this diff to handle specific exceptions:

```diff
 try:
     Path(iname).symlink_to(ii)
-except:
+except (FileExistsError, OSError):
     logging.warning("failed to link %s, maybe already linked" % iname)
     pass
```

And for the second occurrence:

```diff
 try:
     Path(mname).symlink_to(mm)
-except:
+except (FileExistsError, OSError):
     logging.warning(
         "failed to link %s, maybe already linked" % mname
     )
     pass
```

Also applies to: 141-145
🧹 Nitpick comments (1)
dpgen2/op/run_nvnmd.py (1)
80-208: Consider refactoring the complex execute method. The execute method has high complexity with 25 local variables and complex logic flow. Consider extracting smaller, focused methods for better maintainability.
Potential refactoring approach:
- `_setup_working_directory()` - handle file linking and directory setup
- `_prepare_models()` - handle model file processing and shuffling
- `_execute_lammps_runs()` - handle the LAMMPS execution loop
- `_process_outputs()` - handle trajectory merging and model deviation calculation

This would improve readability and make the code easier to test and maintain.
_setup_working_directory()- handle file linking and directory setup_prepare_models()- handle model file processing and shuffling_execute_lammps_runs()- handle the LAMMPS execution loop_process_outputs()- handle trajectory merging and model deviation calculationThis would improve readability and make the code easier to test and maintain.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- dpgen2/op/run_nvnmd.py (1 hunks)
- tests/test_prep_run_dp_train.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
dpgen2/op/run_nvnmd.py
129-129: Do not use bare except
(E722)
141-141: Do not use bare except
(E722)
🪛 Flake8 (7.2.0)
dpgen2/op/run_nvnmd.py
[error] 129-129: do not use bare 'except'
(E722)
[error] 141-141: do not use bare 'except'
(E722)
🪛 Pylint (3.3.7)
dpgen2/op/run_nvnmd.py
[refactor] 80-80: Too many local variables (25/15)
(R0914)
[refactor] 255-255: Too many local variables (20/15)
(R0914)
tests/test_prep_run_dp_train.py
[refactor] 192-192: Too many arguments (7/5)
(R0913)
[refactor] 192-192: Too many positional arguments (7/5)
(R0917)
[refactor] 315-315: Too many instance attributes (8/7)
(R0902)
[refactor] 586-586: Too many instance attributes (11/7)
(R0902)
🔇 Additional comments (8)
tests/test_prep_run_dp_train.py (5)
56-60: LGTM! Clean import organization for NvNMD support. The new imports are well-organized and necessary for the NvNMD testing functionality.
138-149: Well-structured helper function for NvNMD model validation. The helper function properly validates all required NvNMD model files with clear expectations and good separation of concerns.
192-216: Comprehensive output validation for NvNMD training. The function correctly validates NvNMD-specific outputs including the nvnmd_cnn/lcurve.out location, which differs from standard DP training.
315-402: Thorough test implementation for mocked NvNMD training. The test class properly mirrors the structure of the DP training tests while correctly handling the NvNMD-specific model structure (directory-based instead of single file) and output validation.
585-706: Complete integration test coverage for NvNMD workflow. The integration test class provides comprehensive coverage including both normal training and training without initial models, ensuring the NvNMD workflow is properly tested end-to-end.
dpgen2/op/run_nvnmd.py (3)
217-238: Well-implemented LAMMPS input modification logic. The `set_lmp_models` function correctly handles finding and replacing model references in LAMMPS input files, with appropriate error handling for edge cases.
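As a toy illustration of that behavior — this is not the real `set_lmp_models` implementation, it only mimics the "rewrite the model name on the last matching pair_style line" idea:

```python
def set_model_on_pair_style(lmp_input: str, model_name: str) -> str:
    """Replace any *.pb token on the last pair_style line of a LAMMPS input."""
    lines = lmp_input.splitlines()
    last = -1
    for i, line in enumerate(lines):
        tokens = line.split()
        if tokens and tokens[0] == "pair_style" and any(t.endswith(".pb") for t in tokens):
            last = i  # remember the last matching line
    if last == -1:
        raise RuntimeError("last matching index should not be -1, terribly wrong")
    lines[last] = " ".join(
        model_name if tok.endswith(".pb") else tok for tok in lines[last].split()
    )
    return "\n".join(lines)

inp = "units metal\npair_style nvnmd model.pb\npair_coeff * *"
new_inp = set_model_on_pair_style(inp, "model.000.pb")
print(new_inp.splitlines()[1])  # pair_style nvnmd model.000.pb
```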
255-305: Robust model deviation calculation implementation. The `calc_model_devi` function correctly:
- Validates trajectory frame consistency
- Computes force differences between model pairs
- Calculates meaningful statistics (max, min, avg)
- Handles edge cases with empty error lists
The implementation aligns well with molecular dynamics model deviation analysis requirements.
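The per-frame statistics listed above can be reproduced with a few lines of NumPy on synthetic forces; this standalone sketch skips the trajectory parsing that the real function performs:

```python
import numpy as np

def force_deviation_stats(forces_per_model):
    """Max/min/mean per-atom force deviation across all model pairs.

    forces_per_model: array-like of shape (n_models, n_atoms, 3).
    """
    f = np.asarray(forces_per_model, dtype=float)
    errs = []
    for i in range(f.shape[0]):
        for j in range(i + 1, f.shape[0]):
            # per-atom Euclidean norm of the force difference between models i and j
            errs.append(np.linalg.norm(f[i] - f[j], axis=-1))
    errs = np.concatenate(errs) if errs else np.zeros(1)  # empty-pair edge case
    return errs.max(), errs.min(), errs.mean()

frame = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # forces predicted by model 0
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # forces predicted by model 1
]
mx, mn, avg = force_deviation_stats(frame)
print(mx, mn, avg)  # 1.0 0.0 0.5
```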
308-325: Clean output formatting for model deviation data. The `write_model_devi_out` function properly formats the deviation data with appropriate headers and numerical precision for downstream analysis.
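For reference, writing such a table with `numpy.savetxt` can be sketched as follows; the seven-column layout mirrors DeePMD-style model_devi.out files, and the exact column widths are an assumption:

```python
import io
import numpy as np

def write_devi_table(fp, data):
    """Write model-deviation records with a model_devi.out-style header."""
    header = "%10s%19s%19s%19s%19s%19s%19s" % (
        "step",
        "max_devi_v", "min_devi_v", "avg_devi_v",
        "max_devi_f", "min_devi_f", "avg_devi_f",
    )
    # savetxt accepts a binary handle and encodes the text itself
    np.savetxt(fp, data, fmt=["%12d"] + ["%19.6e"] * 6, header=header)

buf = io.BytesIO()
write_devi_table(buf, np.array([[0, 0.0, 0.0, 0.0, 0.5, 0.1, 0.3]]))
text = buf.getvalue().decode()
print(text.splitlines()[0])  # header line, prefixed with "# "
```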
support NvNMD-train and NvNMD-explore
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores