
Training improvements: LoRA tuning, early stopping, loss normalization #23

Merged
Sudhendra merged 7 commits into main from feat/training-improvements
Feb 16, 2026

Conversation

@Sudhendra (Owner)

Summary

Fixes severe overfitting observed in Qwen3-8B training (val loss climbed from ~94 to ~130 by epoch 3). Changes are based on research into Qwen3 LoRA best practices.

  • LoRA config tuned: r=64→16, alpha=128→32, dropout=0→0.05, matching the Axolotl Qwen3 config and QLoRA paper recommendations (see the config sketch after this list)
  • Early stopping added: patience=5 evals, threshold=0.01; stops training when val loss plateaus
  • Loss normalization: training/validation loss is now reported as per-token cross-entropy (~2-5 range) instead of sum-reduced (~100-400 range)
  • Mid-epoch eval enabled: eval_interval_steps=250 (was 0, i.e. disabled)
  • Tinker visualizer: new scripts/visualize_tinker_training.py parses metrics.jsonl and plots loss curves with epoch boundaries, best val loss, and early stopping markers
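
For orientation, a minimal sketch of how these values might be laid out in configs/training.yaml. The key names and nesting (lora, early_stopping, eval_interval_steps) are assumptions for illustration; only the values come from this PR.

```yaml
# Hypothetical layout; the real configs/training.yaml may use different keys.
lora:
  r: 16            # was 64
  alpha: 32        # was 128
  dropout: 0.05    # was 0.0

epochs: 2          # was 3; early stopping can end training sooner

eval_interval_steps: 250   # mid-epoch validation; was 0 (disabled)

early_stopping:
  patience: 5      # evals without improvement before stopping
  threshold: 0.01  # minimum val-loss improvement that counts
```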

Files Changed

  • configs/training.yaml: LoRA params, epochs=2, eval_interval, early stopping config, updated cost estimates
  • src/training/train_tinker.py: early stopping logic, loss normalization, _check_early_stopping helper
  • scripts/visualize_tinker_training.py: new Tinker training curve visualizer
  • tests/test_train_tinker.py: 9 new tests for early stopping + loss normalization
  • tests/test_visualize_tinker_training.py: 12 new tests for the metrics parser

Test Plan

  • All 242 tests pass
  • Lint clean (ruff check)
  • Visualizer generates plot from existing training run
  • Run new training with these settings and verify early stopping triggers appropriately

Config and early stopping:
- LoRA: rank 64→16, alpha 128→32, dropout 0→0.05 (research-backed)
- Epochs: 3→2 with early stopping (patience=5, threshold=0.01)
- Enable mid-epoch validation every 250 steps (was disabled)
- Add _check_early_stopping pure helper with proper edge-case handling (see the sketch after this group)
- Check early stopping at both mid-epoch and epoch-end validation
- Log early stopping events to train.log and metrics.jsonl
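
A minimal sketch of what a pure early-stopping check along these lines can look like. The actual _check_early_stopping signature and state handling in src/training/train_tinker.py are not shown in this PR, so everything below is illustrative:

```python
# Illustrative sketch of a pure early-stopping check; names and signature
# are assumptions, not the real train_tinker.py code.
from dataclasses import dataclass


@dataclass
class EarlyStopState:
    best_val_loss: float = float("inf")
    evals_without_improvement: int = 0


def check_early_stopping(
    state: EarlyStopState,
    val_loss: float,
    patience: int = 5,
    threshold: float = 0.01,
) -> tuple[EarlyStopState, bool]:
    """Return the updated state and whether training should stop."""
    if val_loss < state.best_val_loss - threshold:
        # Meaningful improvement: record the new best and reset patience.
        return EarlyStopState(best_val_loss=val_loss, evals_without_improvement=0), False
    # No sufficient improvement: count this eval against the patience budget.
    new_state = EarlyStopState(
        best_val_loss=state.best_val_loss,
        evals_without_improvement=state.evals_without_improvement + 1,
    )
    return new_state, new_state.evals_without_improvement >= patience
```

Because the helper is pure, it can be called identically at mid-epoch and epoch-end validation points.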

Loss normalization:
- _iter_training_batches returns (batch, total_tokens, completion_tokens)
- Training loss divided by completion token count for interpretable values (see the sketch after this group)
- Validation loss accumulated as total_loss/total_completion_tokens
- metrics.jsonl logs both train_loss (per-token) and train_loss_total (raw)
- final_loss and logger.info use per-token values
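
Schematically, the normalization amounts to dividing a sum-reduced loss by the number of completion tokens. A small sketch under that assumption; function and variable names are illustrative, not the real train_tinker.py code:

```python
# Illustrative per-token loss normalization for sum-reduced cross-entropy.
def normalize_train_loss(step_loss_sum: float, completion_tokens: int) -> float:
    """Per-token train loss (~2-5 range); log the raw sum separately if needed."""
    return step_loss_sum / max(completion_tokens, 1)


def validation_loss(batch_losses: list[tuple[float, int]]) -> float:
    """Accumulate as total_loss / total_completion_tokens across all val batches."""
    total_loss = sum(loss for loss, _ in batch_losses)
    total_tokens = sum(tokens for _, tokens in batch_losses)
    return total_loss / max(total_tokens, 1)
```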

Visualizer:
- Parses metrics.jsonl (train, val, early_stop entry types)
- Two-subplot layout: loss curves + throughput
- EMA smoothing for noisy train loss (configurable alpha)
- Marks epoch boundaries, best val loss, early stopping
- Handles both old (sum-reduced) and new (per-token) formats
- CLI: --metrics, --output, --dpi, --ema-alpha (usage sketch after this group)
- 12 parser tests
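
A rough sketch of two ingredients the visualizer bullets imply: grouping metrics.jsonl entries by type and EMA-smoothing the train loss. Field names such as "type" are assumptions about the log format:

```python
# Illustrative metrics.jsonl parsing and EMA smoothing; the real
# scripts/visualize_tinker_training.py may use different field names.
import json
from pathlib import Path


def load_metrics(path: Path) -> dict[str, list[dict]]:
    """Group JSONL entries by their type (train, val, early_stop)."""
    groups: dict[str, list[dict]] = {"train": [], "val": [], "early_stop": []}
    for line in path.read_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        groups.setdefault(entry.get("type", "train"), []).append(entry)
    return groups


def ema(values: list[float], alpha: float = 0.1) -> list[float]:
    """Exponential moving average used to smooth the noisy train-loss curve."""
    smoothed: list[float] = []
    for v in values:
        smoothed.append(v if not smoothed else alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed
```

From series like these the script presumably builds the two-subplot figure and is invoked along the lines of python scripts/visualize_tinker_training.py --metrics <run>/metrics.jsonl --output curves.png --ema-alpha 0.1 (paths and values are examples).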

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76afa9afec


- Fix mypy type error in step_loss_per_token assignment
- Guard best_val_loss lookup against missing exact match (P1)
- Persist early stopping state across checkpoint resumes (P2)
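
On the early-stopping persistence item (P2), one way to carry the state across checkpoint resumes is to serialize it alongside the checkpoint payload. The dictionary layout below is purely illustrative, not the project's actual checkpoint format:

```python
# Illustrative persistence of early-stopping state in a checkpoint dict;
# the real checkpoint format used by train_tinker.py is not shown in this PR.
def save_early_stop_state(checkpoint: dict, best_val_loss: float, evals_without_improvement: int) -> dict:
    checkpoint["early_stopping"] = {
        "best_val_loss": best_val_loss,
        "evals_without_improvement": evals_without_improvement,
    }
    return checkpoint


def load_early_stop_state(checkpoint: dict) -> tuple[float, int]:
    state = checkpoint.get("early_stopping", {})
    return state.get("best_val_loss", float("inf")), state.get("evals_without_improvement", 0)
```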

@Sudhendra merged commit 937f575 into main on Feb 16, 2026
3 checks passed
