
Disable donated_buffer for all ops' backward benchmarking #104

Closed
wants to merge 6 commits

Conversation

FindHao (Member) commented Dec 7, 2024

This is still a temporary fix for backward benchmarking. Related discussion: #40

@@ -39,6 +39,8 @@
tqdm = None

logger = logging.getLogger(__name__)
# TODO: remove this once we have a better way to handle backward benchmarking
torch._functorch.config.donated_buffer = False

Contributor commented on this diff:

This seems like overkill to me. Can we disable it only in backward and forward_backward?
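
A minimal sketch of that narrower scoping, assuming the runner knows which mode is being benchmarked; the helper name and mode strings are illustrative, not tritonbench's actual API:

```python
from torch._functorch import config as functorch_config

def maybe_disable_donated_buffer(mode: str) -> None:
    # Hypothetical helper: only turn the flag off for modes that
    # actually exercise the backward pass.
    if mode in ("bwd", "fwd_bwd"):
        functorch_config.donated_buffer = False
```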

FindHao (Member, Author) replied:

fixed in 20b138d

@facebook-github-bot (Contributor)

@FindHao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

FindHao force-pushed the findhao/disable_donated_buffer branch from 5217014 to 5aa17b6 on December 9, 2024, 18:20
FindHao (Member, Author) commented Dec 9, 2024

It looks like the CI errors are real, and the config is not compatible with flash_attention. I'm going to set this config for fused_linear_cross_entropy, geglu, and swiglu only.
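
A sketch of how that per-operator scoping could look, assuming the runner wraps each operator's benchmark; the operator names come from the comment above, while the context manager and its name are illustrative:

```python
import contextlib

from torch._functorch import config as functorch_config

# Ops whose backward benchmarking needs the workaround (per the comment above).
OPS_NEEDING_WORKAROUND = {"fused_linear_cross_entropy", "geglu", "swiglu"}

@contextlib.contextmanager
def donated_buffer_workaround(op_name: str):
    # Disable donated_buffer only for the listed ops and restore the
    # previous value afterwards, so ops like flash_attention are unaffected.
    previous = functorch_config.donated_buffer
    if op_name in OPS_NEEDING_WORKAROUND:
        functorch_config.donated_buffer = False
    try:
        yield
    finally:
        functorch_config.donated_buffer = previous
```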

@facebook-github-bot (Contributor)

@FindHao merged this pull request in 642ac1e.

FindHao mentioned this pull request on Dec 11, 2024
facebook-github-bot pushed a commit that referenced this pull request on Dec 12, 2024
Summary:
The previous PR #104 caused the following issue.
```
% python run.py --op geglu --mode fwd  --precision fp32 --metrics latency,speedup --csv --cudagraph
  0%|                                                                                                                           | 0/4 [00:03<?, ?it/s]
Caught exception, terminating early with partial results
Traceback (most recent call last):
  File "/scratch/yhao/pta/tritonbench/tritonbench/utils/triton_op.py", line 782, in run
    y_vals: Dict[str, BenchmarkOperatorMetrics] = functools.reduce(
  File "/scratch/yhao/pta/tritonbench/tritonbench/utils/triton_op.py", line 770, in _reduce_benchmarks
    acc[bm_name] = self._do_bench(
  File "/scratch/yhao/pta/tritonbench/tritonbench/utils/triton_op.py", line 981, in _do_bench
    fn = self._get_bm_func(fn_name)
  File "/scratch/yhao/pta/tritonbench/tritonbench/utils/triton_op.py", line 667, in _get_bm_func
    fwd_fn = fwd_fn_lambda(*self.example_inputs)
  File "/scratch/yhao/pta/tritonbench/tritonbench/utils/triton_op.py", line 481, in _inner
    return function(self, *args, **kwargs)
  File "/scratch/yhao/pta/tritonbench/tritonbench/operators/geglu/operator.py", line 69, in inductor_geglu
    compiled = torch.compile(self.baseline_model)
UnboundLocalError: local variable 'torch' referenced before assignment
(B, T, H)
```
We should use `from torch._functorch import config` rather than `import torch._functorch.config`: inside the function, the bare import rebinds `torch` as a local name for the whole function body, so the earlier `torch.compile(...)` reference fails before the import runs (a minimal repro follows below).

Pull Request resolved: #113

Reviewed By: adamomainz

Differential Revision: D67110110

Pulled By: FindHao

fbshipit-source-id: e5143b06d0e62fb2a7b83464e23126e73a52ee10
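
To illustrate the fix described in the commit above: a bare `import torch._functorch.config` inside a function makes `torch` a local name for the entire function, so any earlier use of `torch` in that function raises UnboundLocalError. A minimal repro and the corrected pattern; the `nn.Linear` model is a stand-in, not the operator's real baseline model:

```python
import torch
import torch.nn as nn

def broken(model):
    # Raises UnboundLocalError: the import below makes `torch` a local
    # name for the entire function, so this earlier reference is invalid.
    compiled = torch.compile(model)
    import torch._functorch.config
    torch._functorch.config.donated_buffer = False
    return compiled

def fixed(model):
    compiled = torch.compile(model)  # `torch` is still the global module here
    # Binding only `config` leaves the global `torch` name untouched.
    from torch._functorch import config as functorch_config
    functorch_config.donated_buffer = False
    return compiled

fixed(nn.Linear(8, 8))
```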