Absolute fastest inference speed #574

Open · SinanAkkoyun opened this issue Aug 17, 2023 · 15 comments

@SinanAkkoyun
Hello! Thank you so much for all the great work.
I would really love to break free from the 11labs API; the only thing Tortoise lacks is sub-one-second inference speed.

With DeepSpeed, half precision, kv_cache, one candidate, and a one-sentence prompt, the best I could get out of it is 2.4 seconds (4 seconds without warmup). DeepSpeed promises a 10x speedup: is that relative to the base configuration, or does it just not apply to a single sentence? And is this result expected on a 3090?

I would love to know how to push inference below one second, thank you :)
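For anyone trying to reproduce these numbers, here is a minimal sketch of that configuration. The flag and function names follow tortoise-tts's api.py as used elsewhere in this thread; the voice name is just an example, and exact keyword availability depends on the installed version.

```python
# Sketch of the "fastest" configuration described above: DeepSpeed,
# half precision, kv_cache, a single candidate, and a short prompt.
import time

fast_settings = {
    "use_deepspeed": True,  # requires a working DeepSpeed extension build
    "kv_cache": True,       # reuse attention keys/values across tokens
    "half": True,           # fp16 autoregressive model
}

def main():
    # Imported lazily so the sketch can be read without tortoise installed.
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech(**fast_settings)
    voice_samples, conditioning_latents = load_voice("train_dotrice")

    start = time.time()
    gen = tts.tts_with_preset(
        "This is a one sentence latency test.",
        preset="ultra_fast",
        k=1,  # a single candidate skips most of the CLVP re-ranking work
        voice_samples=voice_samples,
        conditioning_latents=conditioning_latents,
    )
    print(f"inference took {time.time() - start:.2f}s, shape {gen.shape}")

# Running main() needs a GPU and an installed tortoise-tts:
# main()
```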

@SinanAkkoyun
Author

From tutorials I've seen that adding a new preset with a lower num_autoregressive_samples for finetuned voices yields much greater speed with acceptable quality. However, it then fails to generate any clip results:

model loaded
Generating autoregressive samples..
0it [00:00, ?it/s]
Computing best candidates using CLVP
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/do_tts.py", line 46, in <module>
    gen, dbg_state = tts.tts_with_preset(args.text, k=args.candidates, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/api.py", line 347, in tts_with_preset
    return self.tts(text, **settings)
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/api.py", line 490, in tts
    clip_results = torch.cat(clip_results, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors

If nothing else can speed it up, I would appreciate any help with lowering the sample count.
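A likely explanation for the "0it" progress bars and the empty torch.cat (this is an inference from the loop structure in api.py, not a confirmed diagnosis): the autoregressive sampling loop runs in chunks of autoregressive_batch_size, so a preset whose num_autoregressive_samples is smaller than the batch size runs zero iterations and leaves clip_results empty.

```python
# Hypothetical reconstruction of the batching arithmetic in tortoise's
# api.py: the autoregressive sampling loop iterates
# num_autoregressive_samples // autoregressive_batch_size times.
def num_sampling_iterations(num_autoregressive_samples: int,
                            autoregressive_batch_size: int = 16) -> int:
    return num_autoregressive_samples // autoregressive_batch_size

# A custom preset with e.g. 4 samples but the default batch size of 16
# yields zero iterations: "0it", then torch.cat() on an empty list.
print(num_sampling_iterations(4))    # 0: no samples generated at all
print(num_sampling_iterations(16))   # 1
```

Keeping num_autoregressive_samples at least as large as (ideally a multiple of) the batch size, or lowering autoregressive_batch_size to match, should avoid the empty-list crash.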

@ADD-eNavarro

What about this? Some Polish students have improved Tortoise's speed by distilling the models into one and then distilling that one even further, but I can't find anything related to this. It could be interesting to contact them and try their distilled model, maybe?

@SinanAkkoyun
Author

That's awesome, thank you so much! I will try to contact them and hope to achieve distillation myself, but I am not proficient enough to do so and would still love some easy hyperparameter tuning to speed Tortoise up even more :)

@SpaceCowboy850

Please update this thread if you find a way to improve inference speed. The quality is great, but speed is definitely a problem.

@MarkMLCode

If you want to improve inference speed a bit, I created a fork that allows a slight speedup in exchange for using more memory. It probably won't bring you under 1 second, but I've seen speeds of about 1.3 seconds on ultra_fast (on a 4090, though). You just need to pass 'device_only=True' when creating the TTS object.

#628

@manmay-nakhashi
Collaborator

Hey, did you check out the new api_fast?

@ekarmazin

@manmay-nakhashi I have tried it and got this error:

ValueError: The following `model_kwargs` are not used by the model: ['cond_free_k', 'diffusion_temperature', 'diffusion_iterations'] (note: typos in the generate arguments will also show up in this list)

Any suggestions?

@manmay-nakhashi
Collaborator

Don't pass diffusion-related args; api_fast doesn't use diffusion.

@ekarmazin

I am not passing those, but I am using tts_with_preset, which seems to do that: https://github.com/neonbjb/tortoise-tts/blob/80f89987a5abda5e2b082618cd74f9c7411141dc/tortoise/api_fast.py#L257C9-L257C24

So I am using it like:

import io
import logging
import wave

import numpy as np

from tortoise.api_fast import TextToSpeech

# Initialize the TextToSpeech object
tts = TextToSpeech(kv_cache=True, use_deepspeed=True, half=True)

# Create an in-memory buffer to hold the WAV file data
buffer = io.BytesIO()

# Initialize the WAV file
wf = wave.open(buffer, 'wb')
wf.setnchannels(1)      # Mono
wf.setsampwidth(2)      # 16-bit audio
wf.setframerate(24000)  # Sample rate

for audio_frame in tts.tts_with_preset(
        text_chunk,
        voice_samples=voice_samples,
        preset="ultra_fast",
):
    if audio_frame is not None:
        # Convert the float tensor to 16-bit PCM and append it to the WAV
        audio_np = audio_frame.cpu().detach().numpy()
        audio_int16 = (audio_np * 32767).astype(np.int16)
        wf.writeframes(audio_int16.tobytes())
    else:
        logging.warning("No audio generated for the text chunk.")

wf.close()

@manmay-nakhashi
Collaborator

No need to use presets here; api_fast has its own configurations for the speed vs. quality balance.

@ekarmazin

Got it. I switched to tts_stream and it is super fast. Thank you!
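For anyone else switching over, a rough sketch of the streaming path. The tts_stream method name comes from api_fast as mentioned above; the float-to-PCM helper is my own addition, not part of the library, and the exact chunk format may differ by version.

```python
import wave

import numpy as np

def float_to_pcm16(audio_np: np.ndarray) -> bytes:
    """Convert float audio in [-1, 1] to 16-bit little-endian PCM bytes."""
    return (np.clip(audio_np, -1.0, 1.0) * 32767).astype(np.int16).tobytes()

def stream_to_wav(path: str, text: str, voice_samples) -> None:
    # api_fast yields audio chunks as they are decoded, so file writing
    # (or live playback) can start before synthesis has finished.
    from tortoise.api_fast import TextToSpeech

    tts = TextToSpeech(kv_cache=True, use_deepspeed=True, half=True)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 16-bit audio
        wf.setframerate(24000)   # tortoise's output sample rate
        for chunk in tts.tts_stream(text, voice_samples=voice_samples):
            wf.writeframes(float_to_pcm16(chunk.cpu().detach().numpy()))
```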

@eschmidbauer

I'm still not able to get DeepSpeed to work. I get this error:

[8/9] c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/includes -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/THC -isystem /home/user/miniconda3/envs/tortoise/include -isystem /home/user/miniconda3/envs/tortoise/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++14 -g -Wno-reorder -c /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o
FAILED: pt_binding.o
c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/includes -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/THC -isystem /home/user/miniconda3/envs/tortoise/include -isystem /home/user/miniconda3/envs/tortoise/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++14 -g -Wno-reorder -c /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/string_view.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/StringUtil.h:6,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/Exception.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/Device.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/impl/InlineDeviceGuard.h:6,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/DeviceGuard.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAStream.h:8,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:5:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/C++17.h:27:2: error: #error You need C++17 to compile PyTorch
   27 | #error You need C++17 to compile PyTorch
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4:2: error: #error C++17 or later compatible compiler is required to use PyTorch.
    4 | #error C++17 or later compatible compiler is required to use PyTorch.
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:9,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
    4 | #error C++17 or later compatible compiler is required to use ATen.
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue.h:1499,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/List_inl.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/List.h:490,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/IListRef_inl.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/IListRef.h:632,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/WrapDimUtils.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/TensorNames.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/NamedTensorUtils.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/variable.h:11,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h: In lambda function:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:30: error: ‘is_convertible_v’ is not a member of ‘std’; did you mean ‘is_convertible’?
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                              ^~~~~~~~~~~~~~~~
      |                              is_convertible
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:91: error: expected ‘(’ before ‘,’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                           ^
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:111: error: expected primary-expression before ‘>’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                                               ^
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:112: error: expected primary-expression before ‘)’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                                                ^
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/KernelFunction_impl.h:1,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/KernelFunction.h:251,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/op_registration/op_registration.h:11,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/library.h:68,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/autograd_not_implemented_fallback.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h: In static member function ‘static Result c10::impl::BoxedKernelWrapper<Result(Args ...), typename std::enable_if<((c10::guts::conjunction<c10::guts::disjunction<std::is_constructible<c10::IValue, typename std::decay<Args>::type>, std::is_same<c10::TensorOptions, typename std::decay<Args>::type> >...>::value && c10::guts::conjunction<c10::guts::disjunction<c10::impl::has_ivalue_to<T, void>, std::is_same<void, ReturnType> >, c10::guts::negation<std::is_lvalue_reference<_Tp> > >::value) && (! c10::impl::is_tuple_of_mutable_tensor_refs<Result>::value)), void>::type>::call(const c10::BoxedKernel&, const c10::OperatorHandle&, c10::DispatchKeySet, Args ...)’:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:25: error: ‘is_same_v’ is not a member of ‘std’; did you mean ‘is_same’?
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                         ^~~~~~~~~
      |                         is_same
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected primary-expression before ‘void’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                                   ^~~~
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected ‘)’ before ‘void’
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:18: note: to match this ‘(’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                  ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/tortoise-tts/tortoise/do_tts.py", line 31, in <module>
    tts = TextToSpeech(models_dir=args.model_dir, use_deepspeed=args.use_deepspeed, kv_cache=args.kv_cache, half=args.half)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/tortoise-tts/tortoise/api.py", line 218, in __init__
    self.autoregressive.post_init_gpt2_config(use_deepspeed=use_deepspeed, kv_cache=kv_cache, half=self.half)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/tortoise_tts-3.0.0-py3.11.egg/tortoise/models/autoregressive.py", line 381, in post_init_gpt2_config
    self.ds_engine = deepspeed.init_inference(model=self.inference_model,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 136, in __init__
    self._apply_injection_policy(config)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 363, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 534, in replace_transformer_layer
    replaced_module = replace_module(model=model,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 799, in replace_module
    replaced_module, _ = _replace_module(model, policy)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 816, in _replace_module
    replaced_module = policies[child.__class__][0](child,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 524, in replace_fn
    new_module = replace_with_policy(child,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_with_policy
    _container.create_module()
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/containers/gpt2.py", line 16, in create_module
    self.module = DeepSpeedGPTInference(_config, mp_group=self.mp_group)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_gpt.py", line 18, in __init__
    super().__init__(config,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
                            ^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 485, in load
    return self.jit_load(verbose)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 520, in jit_load
    op_module = load(
                ^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

These are the exact steps I'm following:

conda create --name tortoise python=3.11
conda activate tortoise
	
conda install -c "nvidia/label/cuda-12.1.0" \
	cuda cuda-toolkit cuda-compiler

git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
pip install -r requirements.txt
python setup.py install

python tortoise/do_tts.py \
	--text "This is a test of the initial setup. This is only a test." \
	--use_deepspeed true \
	--voice random --preset fast
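One observation on the log above (a guess, not a verified fix): the failing compile line contains both -std=c++17 and, later, -std=c++14. With GCC the last -std flag wins, and the PyTorch headers then abort with exactly the C++17 errors shown. Older DeepSpeed op builders hardcoded -std=c++14, so one plausible workaround is:

```shell
# Hypothetical workaround, assuming the duplicated -std flag comes from an
# older DeepSpeed op builder: upgrade DeepSpeed, then clear the stale JIT
# build cache so the transformer_inference extension recompiles from scratch.
pip install --upgrade deepspeed
rm -rf ~/.cache/torch_extensions
```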

@UltramanKuz

I'm still not able to get deepspeed to work. I get this error

[quotes the same DeepSpeed build log shown in the previous comment]
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h: In static member function ‘static Result c10::impl::BoxedKernelWrapper<Result(Args ...), typename std::enable_if<((c10::guts::conjunction<c10::guts::disjunction<std::is_constructible<c10::IValue, typename std::decay<Args>::type>, std::is_same<c10::TensorOptions, typename std::decay<Args>::type> >...>::value && c10::guts::conjunction<c10::guts::disjunction<c10::impl::has_ivalue_to<T, void>, std::is_same<void, ReturnType> >, c10::guts::negation<std::is_lvalue_reference<_Tp> > >::value) && (! c10::impl::is_tuple_of_mutable_tensor_refs<Result>::value)), void>::type>::call(const c10::BoxedKernel&, const c10::OperatorHandle&, c10::DispatchKeySet, Args ...)’:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:25: error: ‘is_same_v’ is not a member of ‘std’; did you mean ‘is_same’?
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                         ^~~~~~~~~
      |                         is_same
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected primary-expression before ‘void’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                                   ^~~~
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected ‘)’ before ‘void’
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:18: note: to match this ‘(’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                  ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/tortoise-tts/tortoise/do_tts.py", line 31, in <module>
    tts = TextToSpeech(models_dir=args.model_dir, use_deepspeed=args.use_deepspeed, kv_cache=args.kv_cache, half=args.half)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/tortoise-tts/tortoise/api.py", line 218, in __init__
    self.autoregressive.post_init_gpt2_config(use_deepspeed=use_deepspeed, kv_cache=kv_cache, half=self.half)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/tortoise_tts-3.0.0-py3.11.egg/tortoise/models/autoregressive.py", line 381, in post_init_gpt2_config
    self.ds_engine = deepspeed.init_inference(model=self.inference_model,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 136, in __init__
    self._apply_injection_policy(config)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 363, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 534, in replace_transformer_layer
    replaced_module = replace_module(model=model,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 799, in replace_module
    replaced_module, _ = _replace_module(model, policy)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 816, in _replace_module
    replaced_module = policies[child.__class__][0](child,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 524, in replace_fn
    new_module = replace_with_policy(child,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_with_policy
    _container.create_module()
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/containers/gpt2.py", line 16, in create_module
    self.module = DeepSpeedGPTInference(_config, mp_group=self.mp_group)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_gpt.py", line 18, in __init__
    super().__init__(config,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
                            ^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 485, in load
    return self.jit_load(verbose)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 520, in jit_load
    op_module = load(
                ^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

These are the exact steps I'm following:

conda create --name tortoise python=3.11
conda activate tortoise
	
conda install -c "nvidia/label/cuda-12.1.0" \
	cuda cuda-toolkit cuda-compiler

git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
pip install -r requirements.txt
python setup.py install

python tortoise/do_tts.py \
	--text "This is a test of the initial setup. This is only a test." \
	--use_deepspeed true \
	--voice random --preset fast

I'm running into the same issue.

@manmay-nakhashi
Collaborator

manmay-nakhashi commented Jan 2, 2024

Run it without DeepSpeed for now. DeepSpeed works well with CUDA 11.8, and the CUDA ops need to be compiled with nvcc.
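A sketch of that suggestion applied to the repro steps above: the same `do_tts.py` invocation with the DeepSpeed flag turned off (flag names taken from the command earlier in this thread), which skips the failing JIT build entirely:

```shell
# Same repro command as above, but without DeepSpeed, so no
# transformer_inference extension gets JIT-compiled.
python tortoise/do_tts.py \
	--text "This is a test of the initial setup. This is only a test." \
	--use_deepspeed false \
	--voice random --preset fast
```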

@jason-shen

Interested to know: using tts_stream, what sort of results did you get? What's the processing ratio? Thanks in advance.


9 participants