## Description

For the reward model environment (https://github.com/NVIDIA-NeMo/RL/pull/1026): the implementation from `def score():` to the referenced section largely overlaps with `get_logprobs`. To reduce duplication and improve maintainability, extract the common logic into shared utilities or a helper function. Keep behavior unchanged; this should be a pure refactor. Add or adjust unit tests to ensure parity.

## Scope

- nemo_rl/models/policy/dtensor_policy_worker_v2.py
- nemo_rl/models/policy/dtensor_policy_worker.py

## Acceptance Criteria

- No functional changes; existing APIs and outputs remain identical.
- `score()` and `get_logprobs` no longer duplicate core logic.
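The intended refactor shape can be sketched as follows. This is a minimal, hypothetical illustration of the extract-helper pattern, not NeMo-RL code: `PolicyWorker`, `_shared_forward`, and the toy "logits" are all invented stand-ins for the real worker's duplicated micro-batch forward-pass plumbing.

```python
# Hypothetical sketch: the plumbing previously duplicated in score() and
# get_logprobs() moves into one private helper; each public method keeps
# only its own post-processing step. All names here are illustrative.
from typing import Callable, List


class PolicyWorker:
    """Toy stand-in for the dtensor policy worker; holds fixed outputs."""

    def __init__(self, logits: List[float]):
        self.logits = logits

    def _shared_forward(
        self, postprocess: Callable[[float], float]
    ) -> List[float]:
        # Common logic: iterate "micro-batches" (here: items), run the
        # model forward, then apply the caller-specific postprocessing.
        return [postprocess(x) for x in self.logits]

    def get_logprobs(self) -> List[float]:
        # Only the logprob-specific step remains in the public method.
        return self._shared_forward(lambda x: -x)

    def score(self) -> List[float]:
        # Only the reward-specific step remains in the public method.
        return self._shared_forward(lambda x: 2 * x)


worker = PolicyWorker([1.0, 2.0])
print(worker.get_logprobs())  # [-1.0, -2.0]
print(worker.score())         # [2.0, 4.0]
```

A parity unit test for the real refactor would capture `score()` and `get_logprobs` outputs on fixed inputs before the change and assert the refactored methods reproduce them exactly.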