This repository has been archived by the owner on Aug 7, 2024. It is now read-only.

Thread the scaling type argument throughout fp8 #301

Open · wants to merge 9 commits into base: gh/drisspg/1/base

Commits on Jul 3, 2024

  1. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · 9db7cdc
  2. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · 4fcd497
  3. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · f7a67bb
  4. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · e9b5ab8
  5. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · a4c98c5
  6. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · 4e7184b
  7. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · 381018d
  8. Update — [ghstack-poisoned]
     drisspg committed Jul 3, 2024 · 8404bf6

Commits on Jul 17, 2024

  1. Update on "Thread the scaling type argument throughout fp8"

    # Summary
    
    This PR adds a `ScalingGranularity` enum and threads it through the stack to every place we call `tensor_to_amax` and `tensor_to_scale`.
     - Currently hardcodes `TensorWise` scaling in `Float8Linear`, `Float8DynamicLinear`, and `Float8InferenceLinear`; asserts that the granularity is `TensorWise` for now.
     - Added this as a property of `WeightWithDynamicFloat8CastTensor`, since we need to know a priori how to do the scaling for fp8 comms.
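    The threading pattern described above can be sketched in plain Python. This is a minimal, hypothetical illustration, not the actual float8_experimental code: the enum member names, the `AxisWise` variant, and the simplified list-based amax/scale helpers are assumptions for the sketch; the real functions operate on torch tensors.

    ``` Python
    from enum import Enum

    # Hypothetical sketch of the ScalingGranularity enum this PR describes.
    class ScalingGranularity(Enum):
        TensorWise = "tensor_wise"
        AxisWise = "axis_wise"  # assumed future variant, not asserted-supported yet

    FP8_E4M3_MAX = 448.0  # max representable magnitude of the float8 e4m3 format

    def tensor_to_amax(values, granularity: ScalingGranularity) -> float:
        # The granularity argument is threaded down to here; per the PR,
        # only TensorWise is supported for now, so we assert it.
        assert granularity is ScalingGranularity.TensorWise, (
            "only TensorWise scaling is supported"
        )
        return max(abs(v) for v in values)

    def tensor_to_scale(values, granularity: ScalingGranularity) -> float:
        # The scale maps the tensor's amax onto the fp8 format's max value.
        amax = tensor_to_amax(values, granularity)
        return FP8_E4M3_MAX / amax

    # Callers such as a Float8Linear-style module would pass the granularity
    # explicitly instead of assuming tensor-wise scaling implicitly.
    scale = tensor_to_scale([0.5, -2.0, 1.25], ScalingGranularity.TensorWise)
    ```

    Making the granularity an explicit argument (and a property of the weight wrapper) lets downstream code such as the fp8 comms path query it ahead of time rather than re-deriving it.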
    
    
    
    ### Testing
    
    ``` Shell
    ============================================================================= test session starts =============================================================================
    platform linux -- Python 3.12.4, pytest-7.4.0, pluggy-1.5.0
    rootdir: /home/drisspg/meta/float8_experimental
    plugins: hypothesis-6.104.1
    collected 9 items                                                                                                                                                             
    
    test/test_fsdp2/test_fsdp2_eager.py .........                                                                                                                           [100%]
    
    ============================================================================= 9 passed in 30.77s ==============================================================================
    all tests successful
    
    ```
    
    
    
    
    
    [ghstack-poisoned]
    drisspg committed Jul 17, 2024 · d763faf