"torch_compile" still depends on Triton functionality #41

@Vectorrent

Description

I attempted to use linear_cross_entropy with my GTX 1070 (old, I know), but it fails with the errors below. These errors were unexpected, because the description of the "torch_compile" mode suggests that Triton support is not required:

"torch_compile = A highly optimized torch.compile implementation. This is typically the fastest but uses the most amount of memory. Good as a reference and for systems that don't support Triton."

I can run the same code on the CPU without any issues; the errors appear only when I try to use my GPU.

I'm mostly reporting this issue for posterity and not really expecting a solution, but if there is an easy fix, I would greatly appreciate one!

  File "/home/crow/repos/praxis/praxis/losses/cut_cross_entropy.py", line 15, in forward
    loss = linear_cross_entropy(
        embeddings,
    ...<5 lines>...
        reduction="mean",
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/cut_cross_entropy/linear_cross_entropy.py", line 104, in linear_cross_entropy
    return torch_compile_linear_cross_entropy(
        e, c, targets, bias, ignore_index, softcap, reduction, shift
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/cut_cross_entropy/torch_compile.py", line 66, in torch_compile_linear_cross_entropy
    loss = torch_compile_linear_cross_entropy_apply(
        e,
    ...<5 lines>...
        reduction=reduction,
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
    return fn(*args, **kwargs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 1380, in __call__
    return self._torchdynamo_orig_callable(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        frame, cache_entry, self.hooks, frame_state, skip=1
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 547, in __call__
    return _compile(
        frame.f_code,
    ...<14 lines>...
        skip=skip + 1,
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 986, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 715, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_utils_internal.py", line 95, in wrapper_function
    return function(*args, **kwargs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 750, in _compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/bytecode_transformation.py", line 1361, in transform_code_object
    transformations(instructions, code_options)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 231, in _fn
    return fn(*args, **kwargs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/convert_frame.py", line 662, in transform
    tracer.run()
    ~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/symbolic_convert.py", line 2868, in run
    super().run()
    ~~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
    while self.step():
          ~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
    self.dispatch_table[inst.opcode](self, inst)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/symbolic_convert.py", line 3048, in RETURN_VALUE
    self._return(inst)
    ~~~~~~~~~~~~^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/symbolic_convert.py", line 3033, in _return
    self.output.compile_subgraph(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        self,
        ^^^^^
    ...<2 lines>...
        ),
        ^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/output_graph.py", line 1101, in compile_subgraph
    self.compile_and_call_fx_graph(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        tx, list(reversed(stack_values)), root, output_replacements
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/output_graph.py", line 1382, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/output_graph.py", line 1432, in call_user_compiler
    return self._call_user_compiler(gm)
           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/output_graph.py", line 1483, in _call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
        e.__traceback__
    ) from None
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/output_graph.py", line 1462, in _call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/repro/after_dynamo.py", line 130, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/repro/after_dynamo.py", line 130, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/__init__.py", line 2340, in __call__
    return compile_fx(model_, inputs_, config_patches=self.config)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 1863, in compile_fx
    return aot_autograd(
           ~~~~~~~~~~~~~
    ...<6 lines>...
        cudagraphs=cudagraphs,
        ~~~~~~~~~~~~~~~~~~~~~~
    )(model_, example_inputs_)
    ~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/backends/common.py", line 83, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/aot_autograd.py", line 1155, in aot_module_simplified
    compiled_fn = dispatch_and_compile()
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/aot_autograd.py", line 1131, in dispatch_and_compile
    compiled_fn, _ = create_aot_dispatcher_function(
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        functional_call,
        ^^^^^^^^^^^^^^^^
    ...<3 lines>...
        shape_env,
        ^^^^^^^^^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/aot_autograd.py", line 580, in create_aot_dispatcher_function
    return _create_aot_dispatcher_function(
        flat_fn, fake_flat_args, aot_config, fake_mode, shape_env
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/aot_autograd.py", line 830, in _create_aot_dispatcher_function
    compiled_fn, fw_metadata = compiler_fn(
                               ~~~~~~~~~~~^
        flat_fn,
        ^^^^^^^^
    ...<2 lines>...
        fw_metadata=fw_metadata,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 678, in aot_dispatch_autograd
    compiled_fw_func = aot_config.fw_compiler(fw_module, adjusted_flat_args)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_functorch/aot_autograd.py", line 489, in __call__
    return self.compiler_fn(gm, example_inputs)
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 1741, in fw_compiler_base
    return inner_compile(
        gm,
    ...<5 lines>...
        boxed_forward_device_index=forward_device,
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 569, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        gm,
        ^^^
        example_inputs,
        ^^^^^^^^^^^^^^^
        **kwargs,
        ^^^^^^^^^
    )
    ^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_dynamo/repro/after_aot.py", line 102, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 685, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
        gm, example_inputs, inputs_to_check, **graph_kwargs
    )
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 1129, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/compile_fx.py", line 1044, in codegen_and_compile
    compiled_fn = graph.compile_to_module().call
                  ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/graph.py", line 2027, in compile_to_module
    return self._compile_to_module()
           ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/graph.py", line 2033, in _compile_to_module
    self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
                                                             ~~~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/graph.py", line 1964, in codegen
    self.scheduler = Scheduler(self.operations)
                     ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 1798, in __init__
    self._init(nodes)
    ~~~~~~~~~~^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 1816, in _init
    self.nodes = [self.create_scheduler_node(n) for n in nodes]
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 1947, in create_scheduler_node
    return SchedulerNode(self, node)
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 893, in __init__
    self._compute_attrs()
    ~~~~~~~~~~~~~~~~~~~^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 907, in _compute_attrs
    group_fn = self.scheduler.get_backend(device).group_fn
               ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 3441, in get_backend
    self.backends[device] = self.create_backend(device)
                            ~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/home/crow/repos/praxis/.venv/lib/python3.13/site-packages/torch/_inductor/scheduler.py", line 3428, in create_backend
    raise RuntimeError(
        f"Found {device_props.name} which is too old to be supported by the triton GPU compiler, which is used as the backend. Triton only supports devices of CUDA Capability >= 7.0, but your device is of CUDA capability {device_props.major}.{device_props.minor}"  # noqa: B950
    )
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Found NVIDIA GeForce GTX 1070 which is too old to be supported by the triton GPU compiler, which is used as the backend. Triton only supports devices of CUDA Capability >= 7.0, but your device is of CUDA capability 6.1

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
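As the traceback shows, the "torch_compile" mode hands the graph to Inductor, whose CUDA codegen goes through Triton, and Triton requires CUDA Compute Capability >= 7.0 (Volta or newer), while a GTX 1070 is capability 6.1. A possible workaround, sketched below, is to check the device capability up front and fall back to an eager (uncompiled) loss on older GPUs. The helper name `supports_triton_codegen` is hypothetical and not part of cut_cross_entropy or PyTorch:

```python
# Hypothetical helper: decide whether torch.compile's Inductor backend can
# target a given GPU. Inductor's CUDA codegen uses Triton, which requires
# CUDA Compute Capability >= 7.0; a GTX 1070 reports (6, 1) and fails.

def supports_triton_codegen(capability: tuple[int, int]) -> bool:
    """capability is a (major, minor) pair, as returned by
    torch.cuda.get_device_capability()."""
    # Tuple comparison: (6, 1) >= (7, 0) is False, (8, 6) >= (7, 0) is True.
    return capability >= (7, 0)

# In real code the tuple would come from PyTorch, roughly:
#   import torch
#   cap = torch.cuda.get_device_capability()  # e.g. (6, 1) on a GTX 1070
# When the check fails, skip the "torch_compile" implementation and compute
# the loss with plain eager ops (e.g. F.cross_entropy on the projected
# logits) instead of a compiled path.
```

This only sidesteps the crash; on pre-Volta hardware you lose the memory savings that make cut_cross_entropy attractive in the first place.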
