@mandar2812
Owner

No description provided.

@mandar2812 mandar2812 requested a review from Copilot July 29, 2025 10:01
@mandar2812 mandar2812 self-assigned this Jul 29, 2025
Copilot AI left a comment

Pull Request Overview

This PR introduces an implementation of pcDEQ (Positively Constrained Deep Equilibrium) Transformers: fixed-point solvers, neural network blocks, layers, and full network architectures for several configurations.

Key changes:

  • Implementation of fixed-point iteration solver with convergence monitoring
  • Creation of pcDEQ blocks and layers for linear, convolutional, and transformer architectures
  • Development of complete network classes supporting different pcDEQ variants
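
As a rough illustration of the first bullet, a fixed-point solver with convergence monitoring can be sketched as below. This is a minimal pure-Python sketch, not the code in `solvers.py`: the function name `fixed_point_iteration`, the `tol` and `max_iter` parameters, and the relative-residual stopping rule are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def fixed_point_iteration(
    f: Callable[[Vector], Vector],
    z0: Vector,
    tol: float = 1e-6,
    max_iter: int = 100,
) -> Tuple[Vector, List[float]]:
    """Iterate z <- f(z) until the relative residual drops below tol.

    Returns the final iterate and the list of relative residuals,
    one per iteration, so convergence can be inspected afterwards.
    """
    z = list(z0)
    errors: List[float] = []
    for _ in range(max_iter):
        z_next = f(z)
        # Relative residual ||f(z) - z|| / ||f(z)||, guarded against division by zero.
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_next, z)))
        norm = math.sqrt(sum(a * a for a in z_next))
        rel = diff / (norm + 1e-12)
        errors.append(rel)
        z = z_next
        if rel < tol:
            break
    return z, errors


# Hypothetical contraction with fixed point z = 2 in every coordinate.
z_star, history = fixed_point_iteration(lambda z: [0.5 * v + 1.0 for v in z], [0.0, 0.0])
```

Returning the residual history alongside the solution is one way to expose the "convergence monitoring" mentioned above; the actual solver may track errors differently.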

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/gpt2/modeling/deq/solvers.py | Implements fixed-point iteration solver with error tracking and convergence criteria |
| src/gpt2/modeling/deq/networks.py | Provides complete network architectures for linear, convolutional, and transformer pcDEQ models |
| src/gpt2/modeling/deq/layers.py | Defines base and specialized pcDEQ layers with weight initialization and activation functions |
| src/gpt2/modeling/deq/blocks.py | Implements DEQ fixed-point blocks with custom gradient computation and weight clamping |
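
The summary does not say how the custom gradient in `blocks.py` is computed. DEQ-style blocks commonly differentiate implicitly through the fixed point instead of backpropagating through every solver step, so the scalar sketch below assumes that scheme purely for illustration. The map `f(z) = softsign(w * z + x)` and the helper names `solve` and `implicit_grad_w` are hypothetical.

```python
def softsign(u: float) -> float:
    """Softsign activation: u / (1 + |u|)."""
    return u / (1.0 + abs(u))


def softsign_grad(u: float) -> float:
    """Derivative of softsign: 1 / (1 + |u|)^2."""
    return 1.0 / (1.0 + abs(u)) ** 2


def solve(w: float, x: float, iters: int = 200) -> float:
    """Find z* with z* = softsign(w * z* + x) by plain iteration."""
    z = 0.0
    for _ in range(iters):
        z = softsign(w * z + x)
    return z


def implicit_grad_w(w: float, x: float) -> float:
    """dz*/dw from the implicit function theorem.

    Differentiating z* = f(z*, w) gives
    dz*/dw = (df/dw) / (1 - df/dz), evaluated at the fixed point.
    """
    z = solve(w, x)
    s = softsign_grad(w * z + x)
    return (s * z) / (1.0 - s * w)


# Compare against a central finite difference for hypothetical values w = 0.5, x = 1.0.
eps = 1e-6
finite_diff = (solve(0.5 + eps, 1.0) - solve(0.5 - eps, 1.0)) / (2.0 * eps)
analytic = implicit_grad_w(0.5, 1.0)
```

The appeal of this approach is that the backward pass needs only the converged state, not the intermediate iterates, so memory use does not grow with the number of solver steps.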

  - Added type hints.
  - Switched to softsign activation.
  - Clamp applied directly to `weight`; no weight norm used.
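
To make the last two notes concrete, here is a small pure-Python sketch of a softsign activation combined with a clamp applied directly to the weight matrix. The clamp range is not stated in the commit notes; clamping to non-negative values is an assumption here, as are the helper names `clamp_nonnegative` and `layer_step`.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def softsign(u: float) -> float:
    """Softsign activation: u / (1 + |u|), bounded in (-1, 1)."""
    return u / (1.0 + abs(u))


def clamp_nonnegative(weight: Matrix) -> Matrix:
    """Return a copy of weight with negative entries set to 0 (assumed clamp range)."""
    return [[max(w, 0.0) for w in row] for row in weight]


def layer_step(weight: Matrix, z: Vector, x: Vector) -> Vector:
    """One update z <- softsign(W z + x), using the clamped weight directly."""
    w = clamp_nonnegative(weight)
    return [
        softsign(sum(w_ij * z_j for w_ij, z_j in zip(row, z)) + x_i)
        for row, x_i in zip(w, x)
    ]


# The negative entry is clamped to 0, so the pre-activation is 0*1 + 1*1 + 0 = 1.
out = layer_step([[-1.0, 1.0]], [1.0, 1.0], [0.0])
```

Clamping the raw parameter, rather than reparameterizing it through a weight norm, keeps the constraint explicit and easy to verify after each optimizer step.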