CUDA_ARCHITECTURES is empty for target "cmTC_cbe81" (Ubuntu cloud server) #444

Open
wzds2015 opened this issue Jun 6, 2024 · 0 comments

Comments


wzds2015 commented Jun 6, 2024

I get the error in the title when I run: cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo

I searched online and also tried the following variants; all fail with the same error.

  1. cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_COMPILER=/usr/bin/nvcc
  2. cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_COMPILER=/usr/bin/nvcc -DCMAKE_CUDA_ARCHITECTURE="90;89;86;80;75;70;61;52"
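One thing worth double-checking: the CMake cache variable is spelled CMAKE_CUDA_ARCHITECTURES (plural), so the singular -DCMAKE_CUDA_ARCHITECTURE in variant 2 is likely being silently ignored. A sketch of the invocation with the plural name (the nvcc path here is an assumption, not verified on this machine):

```shell
# CMake only reads the plural spelling CMAKE_CUDA_ARCHITECTURES;
# a misspelled -D variable is accepted but has no effect.
ARCHS="90;89;86;80;75;70;61;52"
CMD="cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CUDA_ARCHITECTURES=\"${ARCHS}\""
echo "$CMD"   # shown here instead of executed
```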

The Python bindings also fail to install. The command pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch produces the following error:

########################################

            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(72): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [5], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "Grid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(73): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [9], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "HashGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(74): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [10], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "TiledGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(75): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [10], lambda [](uint32_t, const tcnn::json &)->tcnn::GridEncoding<__half> *)
     register_encoding<T>(factories, "DenseGrid", grid_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(77): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [9], lambda [](uint32_t, const tcnn::json &)->tcnn::IdentityEncoding<__half> *)
     register_encoding<T>(factories, "Identity", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(81): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [8], lambda [](uint32_t, const tcnn::json &)->tcnn::OneBlobEncoding<__half> *)
     register_encoding<T>(factories, "OneBlob", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(85): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [19], lambda [](uint32_t, const tcnn::json &)->tcnn::SphericalHarmonicsEncoding<__half> *)
     register_encoding<T>(factories, "SphericalHarmonics", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(89): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [13], lambda [](uint32_t, const tcnn::json &)->tcnn::TriangleWaveEncoding<__half> *)
     register_encoding<T>(factories, "TriangleWave", [](uint32_t n_dims_to_encode, const json& encoding) {
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(94): error: no instance of constructor "tcnn::CompositeEncoding<T>::CompositeEncoding [with T=__half]" matches the argument list
              argument types are: ({...}, uint32_t)
      return new CompositeEncoding<T>{
                                     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(114): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [17], lambda [](uint32_t, const tcnn::json &)-><error-type>)
     register_encoding<T>(factories, "OneBlobFrequency", nrc_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  /tmp/pip-req-build-ehfu9zqt/src/encoding.cu(115): error: no instance of overloaded function "tcnn::register_encoding" matches the argument list
              argument types are: (std::unordered_map<std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>, tcnn::CaseInsensitiveHash, tcnn::CaseInsensitiveEqual, std::allocator<std::pair<const std::string, std::function<tcnn::Encoding<__half> *(uint32_t, const tcnn::json &)>>>>, const char [4], lambda [](uint32_t, const tcnn::json &)-><error-type>)
     register_encoding<T>(factories, "NRC", nrc_factory);
     ^
            detected during:
              instantiation of "auto tcnn::register_builtin_encodings<T>() [with T=__half]" at line 122
              instantiation of "auto &tcnn::encoding_factories<T>() [with T=__half]" at line 135
              instantiation of "tcnn::Encoding<T> *tcnn::create_encoding<T>(uint32_t, const tcnn::json &, uint32_t) [with T=__half]" at line 148
  
  76 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/encoding.cu".
  [7/10] /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o
  /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                   ^
  
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                        ^
  
  /usr/include/c++/11/type_traits(1406): error: identifier "__is_same" is undefined
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                         ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                  ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                       ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                     ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9025): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_array(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                   ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9079): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_object(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
    #pragma GCC optimize("Og")
                ^
  
  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(2835): error: no instance of overloaded function "fmt::v9::detail::bigint::assign" matches the argument list
              argument types are: (uint64_t)
      explicit bigint(uint64_t n) { assign(n); }
                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3253): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint32_t)
          write<char>(buffer_appender<char>(buf), dec.significand);
          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3257): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint64_t)
        write<char>(buffer_appender<char>(buf), dec.significand);
        ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/cuda_graph.h(98): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
      return ScopeGuard{[this, stream]() {
                       ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/gpu_memory.h(474): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
       ScopeGuard revert_device = {[&]() { set_cuda_device(previous_device); }};
                                  ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/networks/cutlass_mlp.h(149): error: no instance of constructor "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]" matches the argument list
              argument types are: ({...}, {...}, {...}, {...}, {...})
      return {
             ^
  
  15 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/cutlass_mlp.cu".
  [8/10] /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o
  /usr/bin/nvcc  -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                   ^
  
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                                        ^
  
  /usr/include/c++/11/type_traits(1406): error: identifier "__is_same" is undefined
        : public integral_constant<bool, __is_same(_Tp, _Up)>
                                         ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                  ^
  
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
      inline constexpr bool is_same_v = __is_same(_Tp, _Up);
                                                       ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                     ^
  
  /usr/include/c++/11/tuple(1432): error: type name is not allowed
          constexpr bool __found[__sz] = { __is_same(_Tp, _Types) ... };
                                                          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9025): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_array(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                   ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/json/json.hpp(9079): error: no instance of overloaded function "nlohmann::detail::conditional_static_cast" matches the argument list
              argument types are: (uint64_t)
                    return get_number(input_format_t::cbor, len) && get_cbor_object(detail::conditional_static_cast<std::size_t>(len), tag_handler);
                                                                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
    #pragma GCC optimize("Og")
                ^
  
  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(2835): error: no instance of overloaded function "fmt::v9::detail::bigint::assign" matches the argument list
              argument types are: (uint64_t)
      explicit bigint(uint64_t n) { assign(n); }
                                    ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3253): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint32_t)
          write<char>(buffer_appender<char>(buf), dec.significand);
          ^
  
  /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include/fmt/format.h(3257): error: no instance of overloaded function "fmt::v9::detail::write" matches the argument list
              argument types are: (fmt::v9::appender, uint64_t)
        write<char>(buffer_appender<char>(buf), dec.significand);
        ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/cuda_graph.h(98): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
      return ScopeGuard{[this, stream]() {
                       ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/gpu_memory.h(474): error: no instance of constructor "tcnn::ScopeGuard::ScopeGuard" matches the argument list
              argument types are: (lambda []()->void)
       ScopeGuard revert_device = {[&]() { set_cuda_device(previous_device); }};
                                  ^
  
  /tmp/pip-req-build-ehfu9zqt/include/tiny-cuda-nn/networks/fully_fused_mlp.h(138): error: no instance of constructor "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]" matches the argument list
              argument types are: ({...}, {...}, {...}, {...}, {...})
      return {
             ^
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=128]" at line 893
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_backward" repeats at fully_fused_mlp.cu lines 803-806 for the Activation::LeakyReLU, Squareplus, Softplus, and Tanh cases, each listing the identical candidate templates declared at lines 262 and 276]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_forward" repeats at fully_fused_mlp.cu lines 690-696 for the Activation::Exponential, Sigmoid, ReLU, LeakyReLU, Squareplus, Softplus, and Tanh inference cases, each listing the identical candidate templates declared at lines 560 and 573]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  [the same ambiguous-overload error for "tcnn::mlp_fused_forward" repeats at fully_fused_mlp.cu lines 716-719 for the Activation::Exponential, Sigmoid, ReLU, and LeakyReLU forward-pass cases, each listing the identical candidate templates declared at lines 560 and 573]
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(803): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_backward<WIDTH, T, Activation::LeakyReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(804): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_backward<WIDTH, T, Activation::Squareplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(805): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_backward<WIDTH, T, Activation::Softplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(806): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_backward<WIDTH, T, Activation::Tanh>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=64]" at line 894
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(720): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(721): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(722): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(799): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_backward<WIDTH, T, Activation::None>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(800): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_backward<WIDTH, T, Activation::Exponential>(stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(801): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_backward<WIDTH, T, Activation::Sigmoid>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(802): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_backward<WIDTH, T, Activation::ReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(803): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_backward<WIDTH, T, Activation::LeakyReLU>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(804): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_backward<WIDTH, T, Activation::Squareplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(805): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_backward<WIDTH, T, Activation::Softplus>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(806): error: more than one instance of overloaded function "tcnn::mlp_fused_backward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 262)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_backward<WIDTH,T,ACTIVATION>(cudaStream_t, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 276)
              argument types are: (cudaStream_t, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, const tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_backward<WIDTH, T, Activation::Tanh>( stream, input_weight_matrix(use_inference_params), weight_matrix_at(use_inference_params, 0), tmp_dL_doutput, backward_tmp.at(backward_tmp_idx), forward.hidden.at(0), dL_dinput_fused, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::backward_impl(cudaStream_t, const tcnn::Context &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, tcnn::GradientMode) [with T=tcnn::network_precision_t, WIDTH=32]" at line 895
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(689): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(690): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, true>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(691): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(692): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(693): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(694): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Squareplus: mlp_fused_forward<WIDTH, T, Activation::Squareplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                   ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(695): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Softplus: mlp_fused_forward<WIDTH, T, Activation::Softplus, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                                 ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(696): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Tanh: mlp_fused_forward<WIDTH, T, Activation::Tanh, true>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, inference_tmp, &output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "void tcnn::FullyFusedMLP<T, WIDTH>::inference_mixed_precision_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> &, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(715): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::None: mlp_fused_forward<WIDTH, T, Activation::None, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(716): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Exponential: mlp_fused_forward<WIDTH, T, Activation::Exponential, false>(stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                    ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(717): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::Sigmoid: mlp_fused_forward<WIDTH, T, Activation::Sigmoid, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(718): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::ReLU: mlp_fused_forward<WIDTH, T, Activation::ReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                             ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  /tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu(719): error: more than one instance of overloaded function "tcnn::mlp_fused_forward" matches the argument list:
              function template "std::enable_if_t<<expression>, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 560)
              function template "std::enable_if_t<std::is_same<__half, T>::value, void> tcnn::mlp_fused_forward<WIDTH,T,ACTIVATION,INFERENCE>(cudaStream_t, tcnn::Activation, const tcnn::GPUMatrix<T, tcnn::MatrixLayout::RowMajor> &, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrix<T, tcnn::MatrixLayout::ColumnMajor> &, tcnn::GPUMatrixDynamic<T> *, uint32_t)" (declared at line 573)
              argument types are: (cudaStream_t, tcnn::Activation, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::RowMajor>, const tcnn::GPUMatrixDynamic<tcnn::network_precision_t>, tcnn::GPUMatrix<tcnn::network_precision_t, tcnn::MatrixLayout::ColumnMajor>, tcnn::GPUMatrixDynamic<tcnn::network_precision_t> *, uint32_t)
      case Activation::LeakyReLU: mlp_fused_forward<WIDTH, T, Activation::LeakyReLU, false>( stream, m_output_activation, input_weight_matrix(use_inference_params), input, forward->hidden.at(0), output, m_n_hidden_matmuls); break;
                                  ^
            detected during instantiation of "std::unique_ptr<tcnn::Context, std::default_delete<tcnn::Context>> tcnn::FullyFusedMLP<T, WIDTH>::forward_impl(cudaStream_t, const tcnn::GPUMatrixDynamic<T> &, tcnn::GPUMatrixDynamic<T> *, bool, bool) [with T=tcnn::network_precision_t, WIDTH=16]" at line 896
  
  Error limit reached.
  100 errors detected in the compilation of "/tmp/pip-req-build-ehfu9zqt/src/fully_fused_mlp.cu".
  Compilation terminated.
  [9/10] c++ -MMD -MF /tmp/pip-req-build-ehfu9zqt/bindings/torch/dependencies/fmt/src/format.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/dependencies/fmt/src/format.cc -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/dependencies/fmt/src/format.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  [10/10] c++ -MMD -MF /tmp/pip-req-build-ehfu9zqt/bindings/torch/build/temp.linux-x86_64-cpython-310/tinycudann/bindings.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/tmp/pip-req-build-ehfu9zqt/include -I/tmp/pip-req-build-ehfu9zqt/dependencies -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-ehfu9zqt/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp -o /tmp/pip-req-build-ehfu9zqt/bindings/torch/build/temp.linux-x86_64-cpython-310/tinycudann/bindings.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
  In file included from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/Exceptions.h:14,
                   from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
                   from /usr/local/lib/python3.10/dist-packages/torch/include/torch/extension.h:9,
                   from /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:34:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::LogSeverity>’:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7:   required from ‘class pybind11::enum_<tcnn::cpp::LogSeverity>’
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:283:52:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::LogSeverity>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
   1496 | class class_ : public detail::generic_type {
        |       ^~~~~~
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Precision>’:
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7:   required from ‘class pybind11::enum_<tcnn::cpp::Precision>’
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:292:48:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Precision>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Context>’:
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:309:45:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Context>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<Module>’:
  /tmp/pip-req-build-ehfu9zqt/bindings/torch/tinycudann/bindings.cpp:316:32:   required from here
  /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<Module>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "/usr/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-req-build-ehfu9zqt/bindings/torch/setup.py", line 189, in <module>
      setup(
    File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 299, in run
      self.run_command('build')
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build.py", line 131, in run
      self.run_command(cmd_name)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 91, in run
      _build_ext.run(self)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
      self.build_extensions()
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 252, in build_extension
      _build_ext.build_extension(self, ext)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tinycudann
Running setup.py clean for tinycudann
Failed to build tinycudann
ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects
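
A possible workaround (not verified on this exact setup): pin the target compute capability explicitly so the build does not depend on auto-detecting the GPU architecture, which can fail on headless cloud servers. `TCNN_CUDA_ARCHITECTURES` is the variable the tiny-cuda-nn build scripts read; the value `86` below is only an example and must be replaced with the compute capability of the actual GPU. Note the CMake flag is `CMAKE_CUDA_ARCHITECTURES` (plural) — the singular spelling in attempt 2 above is silently ignored by CMake.

```shell
# Sketch of a workaround, assuming an Ampere-class GPU (compute capability 86).
# Replace 86 with your GPU's compute capability (see `nvidia-smi --query-gpu=compute_cap --format=csv`).
export TCNN_CUDA_ARCHITECTURES=86

# Native build, passing the architecture to CMake explicitly:
cmake . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DTCNN_CUDA_ARCHITECTURES=86

# PyTorch bindings (setup.py picks up TCNN_CUDA_ARCHITECTURES from the environment):
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
```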
