CUDA_ARCHITECTURES is empty for target "cmTC_cbe81" (Ubuntu cloud server) #444

Open
wzds2015 opened this issue Jun 6, 2024 · 0 comments

Comments


wzds2015 commented Jun 6, 2024

I get the error in the title when I run: cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo

I searched online and also tried the following variants; all fail with the same error.

  1. cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_COMPILER=/usr/bin/nvcc
  2. cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_COMPILER=/usr/bin/nvcc -DCMAKE_CUDA_ARCHITECTURE="90;89;86;80;75;70;61;52"
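One thing worth double-checking: the CMake cache variable is spelled CMAKE_CUDA_ARCHITECTURES (plural), so the singular -DCMAKE_CUDA_ARCHITECTURE in variant 2 is likely being silently ignored. A sketch of the invocation with the plural name (the nvcc path here is an assumption, not verified on this machine):

```shell
# CMake only reads the plural spelling CMAKE_CUDA_ARCHITECTURES;
# a misspelled -D variable is accepted but has no effect.
ARCHS="90;89;86;80;75;70;61;52"
CMD="cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_ARCHITECTURES=\"${ARCHS}\""
echo "$CMD"   # shown here instead of executed
```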

The Python bindings also fail to install. The command pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch produces the following error:

########################################

            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(72): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [5], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "Grid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(73): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [9], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "HashGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(74): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [10], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "TiledGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(75): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [10], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "DenseGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(77): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [9], lambda [](uint32_t, const tcnn::json &)->tcnn::IdentityEncoding<__half> *)
     register_encoding<T>(factories, "Identity", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(81): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [8], lambda [](uint32_t, const tcnn::json &)->tcnn::OneBlobEncoding<__half> *)
     register_encoding<T>(factories, "OneBlob", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(85): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [19], lambda [](uint32_t, const tcnn::json &)->tcnn::SphericalHarmonicsEncoding<__half> *)
     register_encoding<T>(factories, "SphericalHarmonics", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(89): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [13], lambda [](uint32_t, const tcnn::json &)->tcnn::TriangleWaveEncoding<__half> *)
     register_encoding<T>(factories, "TriangleWave", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(94): error: no instance of constructor "tcnn::CompositeEncoding<T>::CompositeEncoding [with T=__half]" matches the argument list
              argument types are: ({...}, uint32_t)
      return new CompositeEncoding<T>{
                                     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(114): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [17], lambda [](uint32_t, const tcnn::json &)-><error-type>)
     register_encoding<T>(factories, "OneBlobFrequency", nrc_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(115): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [4], lambda [](uint32_t, const tcnn::json &)-><error-type>)
     register_encoding<T>(factories, "NRC", nrc_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  76 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/encoding.cu".
  [7/10] /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o
  /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                   ^
  
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                        ^
  
  /usr/include/c++/11/type_traits(1406): error: identifier "__is_same" is undefined
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                         ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                  ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                       ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                     ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9025): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_array(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                   ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9079): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_object(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
    #pragma GCC optimize("Og")
                ^
  
  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(2835): error: no instance of overloaded function "fmt::v9::detail::bigint::assign" matches the argument list
              argument types are: (uint64_t)
      explicit bigint(uint64_t n) { assign(n); }
                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3253): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint32_t)
          write<char>(buffer_appender<char>(buf), dec.significand);
          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3257): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint64_t)
        write<char>(buffer_appender<char>(buf), dec.significand);
        ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/cuda_graph.h(98): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
      return ScopeGuard{[this, stream]() {
                       ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/gpu_memory.h(474): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
       ScopeGuard revert_device = {[&]() { set_cuda_device(previous_device); }};
                                  ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/networks/cutlass_mlp.h(149): error: no instance of constructor "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]" matches the argument list
              argument types are: ({...}, {...}, {...}, {...}, {...})
      return {
             ^
  
  15 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu".
  [8/10] /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o
  /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                   ^
  
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                        ^
  
  /usr/include/c++/11/type_traits(1406): error: identifier "__is_same" is undefined
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                         ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                  ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                       ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                     ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9025): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_array(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                   ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9079): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_object(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
    #pragma GCC optimize("Og")
                ^
  
  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(2835): error: no instance of overloaded function "fmt::v9::detail::bigint::assign" matches the argument list
              argument types are: (uint64_t)
      explicit bigint(uint64_t n) { assign(n); }
                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3253): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint32_t)
          write<char>(buffer_appender<char>(buf), dec.significand);
          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3257): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint64_t)
        write<char>(buffer_appender<char>(buf), dec.significand);
        ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/cuda_graph.h(98): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
      return ScopeGuard{[this, stream]() {
                       ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/gpu_memory.h(474): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
       ScopeGuard revert_device = {[&]() { set_cuda_device(previous_device); }};
                                  ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/networks/fully_fused_mlp.h(138): error: no instance of constructor "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]" matches the argument list
              argument types are: ({...}, {...}, {...}, {...}, {...})
      return {
             ^
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_backward" repeats at fully_fused_mlp.cu lines 803-806 for the Activation::LeakyReLU, Squareplus, Softplus, and Tanh cases, each listing the identical candidate templates declared at lines 262 and 276]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_forward" repeats at fully_fused_mlp.cu lines 690-696 for the Activation::Exponential, Sigmoid, ReLU, LeakyReLU, Squareplus, Softplus, and Tanh inference cases, each listing the identical candidate templates declared at lines 560 and 573]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_forward" repeats at fully_fused_mlp.cu lines 716-719 for the Activation::Exponential, Sigmoid, ReLU, and LeakyReLU forward-pass cases, each listing the identical candidate templates declared at lines 560 and 573]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(803): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_backward<WIDTH, T, Activation::LeakyReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(804): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_backward<WIDTH, T, Activation::Squareplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(805): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_backward<WIDTH, T, Activation::Softplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(806): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_backward<WIDTH, T, Activation::Tanh>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(803): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_backward<WIDTH, T, Activation::LeakyReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(804): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_backward<WIDTH, T, Activation::Squareplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(805): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_backward<WIDTH, T, Activation::Softplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(806): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_backward<WIDTH, T, Activation::Tanh>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  Error limit reached.
  100 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu".
  Compilation terminated.
  [9/10] c++ -MMD -MF /tmp/pip-req-build-ehfu9zqt/bindings/torch/dependencies/fmt/src/format.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/src/format.cc -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/dependencies/fmt/src/format.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  [10/10] c++ -MMD -MF /tmp/pip-req-build-ehfu9zqt/bindings/torch/build/temp.linux-x86_64-cpython-310/tinycudann/bindings.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/build/temp.linux-x86_64-cpython-310/tinycudann/bindings.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  In file included from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/Exceptions.h:14,
                   from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
                   from /usr/local/lib/python3.10/dist-packages/torch/include/torch/extension.h:9,
                   from /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:34:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::LogSeverity>’:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7:   required from ‘class pybind11::enum_<tcnn::cpp::LogSeverity>’
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:283:52:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::LogSeverity>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
   1496 | class class_ : public detail::generic_type {
        |       ^~~~~~
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Precision>’:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7:   required from ‘class pybind11::enum_<tcnn::cpp::Precision>’
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:292:48:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Precision>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Context>’:
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:309:45:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Context>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<Module>’:
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:316:32:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<Module>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "/usr/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-req-build-ehfu9zqt/bindings/torch/setup.py", line 189, in <module>
      setup(
    File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 299, in run
      self.run_command('build')
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build.py", line 131, in run
      self.run_command(cmd_name)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 91, in run
      _build_ext.run(self)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
      self.build_extensions()
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 252, in build_extension
      _build_ext.build_extension(self, ext)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tinycudann
Running setup.py clean for tinycudann
Failed to build tinycudann
ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects
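
A possible workaround (not verified on this exact setup): pin the target compute capability explicitly so the build does not depend on auto-detecting the GPU architecture, which can fail on headless cloud servers. `TCNN_CUDA_ARCHITECTURES` is the variable the tiny-cuda-nn build scripts read; the value `86` below is only an example and must be replaced with the compute capability of the actual GPU. Note the CMake flag is `CMAKE_CUDA_ARCHITECTURES` (plural) — the singular spelling in attempt 2 above is silently ignored by CMake.

```shell
# Sketch of a workaround, assuming an Ampere-class GPU (compute capability 86).
# Replace 86 with your GPU's compute capability (see `nvidia-smi --query-gpu=compute_cap --format=csv`).
export TCNN_CUDA_ARCHITECTURES=86

# Native build, passing the architecture to CMake explicitly:
cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DTCNN_CUDA_ARCHITECTURES=86

# PyTorch bindings (setup.py picks up TCNN_CUDA_ARCHITECTURES from the environment):
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
```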
