Add entropy logging for SFT training path #9717
base: main
Conversation
Signed-off-by: Pankaj Dixit <[email protected]>
Summary of Changes (Gemini Code Assist): This pull request enhances the SFT training path by integrating entropy logging, which allows deeper insight into model uncertainty and prediction distribution during training. A new configuration argument and a small change to the training loop let users enable logging of mean entropy, facilitating better monitoring and analysis of model behavior and convergence. A new utility function handles the entropy calculation, and an example script showcases its usage.
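The `compute_entropy` helper mentioned above appears only as a fragment later on this page. The following is a minimal, self-contained sketch of what such a helper could look like; the causal shift matches the shapes shown in the diff below, while the `-100` ignore-label mask and the mean reduction are assumptions rather than details taken from the PR.

```python
import torch


def compute_entropy(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mean next-token prediction entropy over supervised positions (sketch)."""
    # Align each position's logits with the token it predicts (causal shift).
    shift_logits = logits[:, :-1, :]  # (batch, seq_len - 1, vocab)
    shift_labels = labels[:, 1:]      # (batch, seq_len - 1)

    # Per-position entropy: -sum(p * log p) over the vocabulary.
    log_probs = torch.nn.functional.log_softmax(shift_logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)

    # Average only over positions assumed to contribute to the loss (label != -100).
    mask = shift_labels != -100
    return (entropy * mask).sum() / mask.sum().clamp(min=1)
```

Since `compute_loss` in the diff below reduces the result with `.item()`, the helper is assumed to return a scalar tensor.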
Code Review
This pull request introduces a valuable feature for logging token entropy during SFT training, which is great for monitoring model uncertainty. The implementation is solid, adding the necessary command-line argument and integrating the entropy calculation into the training loop.
I've provided a few suggestions to enhance the code. In the new example script, I've recommended quoting shell variables for better robustness. For the Python code, I've suggested a refactoring in compute_loss to improve clarity and a more efficient and stable way to calculate entropy in compute_entropy using PyTorch's built-in distribution utilities.
Overall, these are great additions. My feedback aims to refine the implementation for better maintainability and performance.
MODEL_PATH=meta-llama/Llama-3.2-1B

llamafactory-cli train \
    --model_name_or_path ${MODEL_PATH} \
    --overwrite_cache \
    --preprocessing_num_workers 16 \
    --dataloader_num_workers 4 \
    --output_dir ${OUTPUT} \
    --log_entropy \
    --ddp_timeout 180000000 > "$OUTPUT/train.log" 2>&1

echo "Training completed. Logs are saved to: $OUTPUT/train.log"
def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
    # Always get outputs if we need entropy, otherwise follow the request
    need_outputs = return_outputs or getattr(self.finetuning_args, 'log_entropy', False)

    if need_outputs:
        loss, outputs = super().compute_loss(model, inputs, return_outputs=True, **kwargs)
    else:
        loss = super().compute_loss(model, inputs, return_outputs=False, **kwargs)
        outputs = None

    # Compute entropy if enabled
    if getattr(self.finetuning_args, 'log_entropy', False) and outputs is not None:
        if hasattr(outputs, 'logits') and 'labels' in inputs:
            from ..trainer_utils import compute_entropy

            with torch.no_grad():
                # Use the already-computed logits (detached to avoid affecting gradients)
                entropy = compute_entropy(outputs.logits.detach(), inputs['labels'])
                self._current_entropy = entropy.item()

    if return_outputs:
        return loss, outputs
    return loss
This method can be slightly refactored for better readability and to avoid a repeated getattr call. Storing the log_entropy flag in a local variable at the beginning makes the intent clearer throughout the method.
def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
    log_entropy = getattr(self.finetuning_args, "log_entropy", False)
    # Always get outputs if we need entropy, otherwise follow the request
    need_outputs = return_outputs or log_entropy

    if need_outputs:
        loss, outputs = super().compute_loss(model, inputs, return_outputs=True, **kwargs)
    else:
        loss = super().compute_loss(model, inputs, return_outputs=False, **kwargs)
        outputs = None

    # Compute entropy if enabled
    if log_entropy and outputs is not None:
        if hasattr(outputs, "logits") and "labels" in inputs:
            from ..trainer_utils import compute_entropy

            with torch.no_grad():
                # Use the already-computed logits (detached to avoid affecting gradients)
                entropy = compute_entropy(outputs.logits.detach(), inputs["labels"])
                self._current_entropy = entropy.item()

    if return_outputs:
        return loss, outputs
    return loss

# Compute probabilities via softmax
probs = torch.nn.functional.softmax(shift_logits, dim=-1)

# Compute entropy: -sum(p * log(p))
log_probs = torch.nn.functional.log_softmax(shift_logits, dim=-1)
entropy = -torch.sum(probs * log_probs, dim=-1)  # (batch_size, seq_len-1)
Instead of manually calculating softmax and then the entropy formula, you can use torch.distributions.Categorical for a more concise, efficient, and numerically stable implementation. The entropy() method directly computes what you need.
Suggested change:

- # Compute probabilities via softmax
- probs = torch.nn.functional.softmax(shift_logits, dim=-1)
- # Compute entropy: -sum(p * log(p))
- log_probs = torch.nn.functional.log_softmax(shift_logits, dim=-1)
- entropy = -torch.sum(probs * log_probs, dim=-1)  # (batch_size, seq_len-1)
+ # Compute entropy using torch.distributions
+ dist = torch.distributions.Categorical(logits=shift_logits)
+ entropy = dist.entropy()  # (batch_size, seq_len-1)
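If reviewers want to sanity-check that the two formulations agree numerically, a quick comparison on random logits (made-up shapes) should print `True`:

```python
import torch

torch.manual_seed(0)
shift_logits = torch.randn(2, 7, 32)  # made-up (batch, seq_len - 1, vocab) sizes

# Manual formulation from the PR.
probs = torch.nn.functional.softmax(shift_logits, dim=-1)
log_probs = torch.nn.functional.log_softmax(shift_logits, dim=-1)
entropy_manual = -torch.sum(probs * log_probs, dim=-1)

# Suggested formulation via torch.distributions.
entropy_dist = torch.distributions.Categorical(logits=shift_logits).entropy()

print(torch.allclose(entropy_manual, entropy_dist, atol=1e-6))
```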
This PR introduces entropy logging in the SFT training path, enabling better monitoring and analysis of model behavior during training. A sample test with entropy logging enabled has been added.
Fixes #9306
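The diff above shows where `self._current_entropy` is populated but not how it reaches the training logs. One plausible way to surface it, sketched here as an assumption rather than the PR's actual implementation, is to extend the trainer's `log` method:

```python
from transformers import Trainer


class EntropyLoggingTrainer(Trainer):  # hypothetical subclass name, for illustration only
    def log(self, logs, *args, **kwargs):
        # Report the most recent per-batch entropy alongside loss, learning rate, etc.
        entropy = getattr(self, "_current_entropy", None)
        if entropy is not None:
            logs["entropy"] = round(entropy, 4)
        super().log(logs, *args, **kwargs)
```

With something like this in place, the mean entropy would appear next to the loss at each logging step, which matches the monitoring goal described in the summary.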