The game failed to start... OOM #747

@cynodesmus

Something changed in the updates after commit b811368, which is the version I had installed and working.
I updated to commit 11cb44f, and after several attempts no generation succeeded.
The program doesn't crash. It reports an out-of-memory (OOM) condition (the VRAM pre-flight check in the log below fails) and stops the generation.
Ideally, it should free up VRAM by unloading unused models or switching to tiled VAE. This would reduce speed but ensure the task completes. Instead, it simply refuses to run.
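
A minimal sketch of the fallback I have in mind, not the project's actual code. `Plan`, `plan_generation`, and `toy_estimate_gb` are hypothetical names, and the cost model uses made-up coefficients chosen only so that batch=2 at 246s lands near the ~5.4 GB the log reports. The idea is to split the batch into sequential passes when the full batch does not fit, and to give up only when a single item does not fit either.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Plan:
    per_round: int   # items generated together in one GPU pass
    rounds: int      # sequential passes needed to cover the full batch
    degraded: bool   # True when the request was split to fit in VRAM


def plan_generation(
    free_gb: float,
    batch_size: int,
    duration_s: float,
    estimate_gb: Callable[[int, float], float],
) -> Optional[Plan]:
    """Pick the largest per-pass batch that fits; None if even one item does not."""
    for per_round in range(batch_size, 0, -1):
        if estimate_gb(per_round, duration_s) <= free_gb:
            rounds = -(-batch_size // per_round)  # ceiling division
            return Plan(per_round, rounds, degraded=per_round < batch_size)
    return None


def toy_estimate_gb(batch: int, duration_s: float) -> float:
    # Hypothetical cost model with made-up coefficients (not the real estimator).
    return 1.0 + 0.009 * batch * duration_s


# 3.37 GB free, batch of 2, 246 s: the values from the failing run.
print(plan_generation(3.37, 2, 246, toy_estimate_gb))
```

Under this toy model the request is split into two passes of one item each. Whether a single item really fits in 3.37 GB depends on the real estimator, which I have not checked.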
Excerpts from the log:

Log OOMs (Commit 11cb44f)

2026-03-01 20:09:55.405 | INFO | acestep.gpu_config:get_gpu_memory_gb:390 - CUDA GPU detected: NVIDIA GeForce RTX 3070 (8.0 GB)

============================================================
GPU Configuration Detected:

GPU Memory: 8.00 GB
Configuration Tier: tier3
Max Duration (with LM): 480s (8 min)
Max Duration (without LM): 600s (10 min)
Max Batch Size (with LM): 2
Max Batch Size (without LM): 2
Default LM Init: True
Available LM Models: ['acestep-5Hz-lm-0.6B']

Auto-enabling CPU offload (GPU 8.0GB < 20.0GB threshold)

...

2026-03-01 20:16:24.593 | INFO | acestep.core.generation.handler.init_service_loader:_load_main_model_from_checkpoint:55 - [initialize_service] Attempting to load model with attention implementation: flash_attention_2
torch_dtype is deprecated! Use dtype instead!
2026-03-01 20:17:31.558 | INFO | acestep.llm_inference:initialize:548 - loading 5Hz LM tokenizer... it may take 80~90s
2026-03-01 20:18:10.912 | INFO | acestep.llm_inference:initialize:552 - 5Hz LM tokenizer loaded successfully in 39.35 seconds
2026-03-01 20:18:10.913 | INFO | acestep.llm_inference:initialize:557 - Initializing constrained decoding processor...
2026-03-01 20:18:10.913 | INFO | acestep.llm_inference:initialize:563 - Setting constrained decoding max_duration to 480s based on GPU config (tier: tier3)
2026-03-01 20:18:16.158 | DEBUG | acestep.constrained_logits_processor:_precompute_audio_code_tokens:577 - Found 1535 audio code tokens with values outside valid range [0, 63999]
2026-03-01 20:18:24.064 | INFO | acestep.llm_inference:initialize:571 - Constrained processor initialized in 13.15 seconds
2026-03-01 20:18:24.065 | INFO | acestep.gpu_config:get_gpu_memory_gb:390 - CUDA GPU detected: NVIDIA GeForce RTX 3070 (8.0 GB)
2026-03-01 20:18:24.535 | INFO | acestep.gpu_config:get_lm_gpu_memory_ratio:769 - [get_lm_gpu_memory_ratio] model=0.6B, free=6.89GB, current_usage=1.11GB, lm_target=2.10GB, usable_for_lm=2.10GB, ratio=0.401
2026-03-01 20:18:24.535 | INFO | acestep.llm_inference:get_gpu_memory_utilization:172 - Adaptive LM memory allocation: model=С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B, target=1.2GB, ratio=0.401, total_gpu=8.0GB
2026-03-01 20:18:24.536 | INFO | acestep.llm_inference:_initialize_5hz_lm_vllm:707 - Initializing 5Hz LM with model: С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B, enforce_eager: False, tensor_parallel_size: 1, max_model_len: 2048, gpu_memory_utilization: 0.401
[nanovllm] KV cache allocated: 71 blocks × 256 tokens = 18176 tokens capacity, 1.94 GB (free: 5.56 GB, used: 1.26 GB, target: 3.21 GB, block: 28.00 MB, post_kv_free: 3.62 GB)
2026-03-01 20:20:17.223 | INFO | acestep.llm_inference:_initialize_5hz_lm_vllm:717 - 5Hz LM initialized successfully in 112.69 seconds
2026-03-01 20:20:17.224 | INFO | acestep.llm_inference:initialize:649 - 5Hz LM status message: ✅ 5Hz LM initialized successfully
Model: С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B
Device: NVIDIA GeForce RTX 3070
GPU Memory Utilization: 0.401
Low GPU Memory Mode: True
2026-03-01 20:32:31.474 | INFO | acestep.inference:generate_music:404 - [generate_music] LLM usage decision: thinking=True, use_cot_caption=False, use_cot_language=True, use_cot_metas=True, need_lm_for_cot=True, llm_initialized=True, use_lm=True
2026-03-01 20:32:31.475 | INFO | acestep.inference:generate_music:462 - LM chunk 1/1 (infer_type=llm_dit) (size: 2, seeds: [1245495242, 3051524477])
2026-03-01 20:32:31.475 | INFO | acestep.llm_inference:generate_with_stop_condition:1220 - Batch Phase 1: Generating CoT metadata (once for all items)...
2026-03-01 20:32:31.515 | INFO | acestep.llm_inference:generate_with_stop_condition:1228 - generate_with_stop_condition: formatted_prompt=<|im_start|>system

...

Generating: 100%|███████████████████████████████████| 1/1 [00:05<00:00, 5.61s/steps, Prefill=269tok/s, Decode=70tok/s]
2026-03-01 20:32:37.135 | DEBUG | acestep.llm_inference:parse_lm_output:2566 - Debug output text:
bpm: 70
duration: 246
keyscale: E minor
language: unknown
timesignature: 2
<|im_end|>
2026-03-01 20:32:37.136 | INFO | acestep.llm_inference:generate_with_stop_condition:1268 - Batch Phase 1 completed in 5.66s. Generated metadata: ['bpm', 'duration', 'keyscale', 'language', 'timesignature']
2026-03-01 20:32:37.137 | INFO | acestep.llm_inference:generate_with_stop_condition:1311 - Batch Phase 2: Generating audio codes for 2 items...
2026-03-01 20:32:37.143 | INFO | acestep.llm_inference:generate_with_stop_condition:1321 - generate_with_stop_condition: formatted_prompt_with_cot=<|im_start|>system

...

Generating: 100%|██████████████████████████████████| 2/2 [00:23<00:00, 11.52s/steps, Prefill=1253tok/s, Decode=70tok/s]
2026-03-01 20:33:00.185 | DEBUG | acestep.llm_inference:parse_lm_output:2566 - Debug output text: ...<|audio_code_50631|><|im_end|>
2026-03-01 20:33:00.193 | DEBUG | acestep.llm_inference:parse_lm_output:2566 - Debug output text: ...<|im_end|>
2026-03-01 20:33:00.200 | INFO | acestep.llm_inference:generate_with_stop_condition:1413 - Batch Phase 2 completed in 23.06s. Generated codes: [1224, 1225]
2026-03-01 20:33:00.200 | INFO | acestep.core.generation.handler.generate_music:generate_music:157 - [generate_music] Starting generation...
2026-03-01 20:33:00.201 | INFO | acestep.core.generation.handler.generate_music:generate_music:160 - [generate_music] Preparing inputs...
2026-03-01 20:33:00.202 | DEBUG | acestep.core.generation.handler.memory_utils:_vram_guard_reduce_batch:124 - [VRAM guard] offload_to_cpu=True, batch_size=2 <= tier limit 2 — skipping dynamic VRAM check
2026-03-01 20:33:00.252 | INFO | acestep.core.generation.handler.generate_music:_vram_preflight_check:63 - [generate_music] VRAM pre-flight: 3.37 GB free, ~5.42 GB needed (batch=2, duration=246s, mode=base).
2026-03-01 20:33:00.252 | WARNING | acestep.core.generation.handler.generate_music:_vram_preflight_check:78 - [generate_music] VRAM pre-flight failed: Insufficient free VRAM: need ~5.4 GB, only 3.4 GB available. Reduce batch size (currently 2) or audio duration (currently 246s).
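
The numbers in the log above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the logged values. The formula for `ratio` is my assumption (the log does not say how it is computed), although it does reproduce both the logged `ratio=0.401` and the `target: 3.21 GB` in the nanovllm line.

```python
# Values copied from the 11cb44f log above.
total_gpu_gb = 8.0
current_usage_gb = 1.11
lm_target_gb = 2.10
kv_blocks, tokens_per_block, block_mb = 71, 256, 28.0
free_gb, needed_gb = 3.37, 5.42

# Assumed relationship: ratio = (already used + LM budget) / total VRAM.
ratio = (current_usage_gb + lm_target_gb) / total_gpu_gb

# KV cache capacity and size as reported by nanovllm.
kv_tokens = kv_blocks * tokens_per_block
kv_gb = kv_blocks * block_mb / 1024

# Gap between what the pre-flight check wants and what is free.
shortfall_gb = needed_gb - free_gb

print(f"ratio        = {ratio:.5f}")
print(f"kv tokens    = {kv_tokens}")
print(f"kv cache     = {kv_gb:.2f} GB")
print(f"shortfall    = {shortfall_gb:.2f} GB")
```

The shortfall (about 2.05 GB) is close to the size of the LM's KV cache (about 1.94 GB). That may be a coincidence, but it suggests the memory held by the LM is a large part of what the pre-flight check counts as unavailable.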

I rolled back to commit b811368 and ran the generation again. It is not as fast as I would like, but at least it does not abort:

Log OK (Commit b811368)

2026-03-01 21:53:17.690 | INFO | acestep.gpu_config:get_gpu_memory_gb:390 - CUDA GPU detected: NVIDIA GeForce RTX 3070 (8.0 GB)

============================================================
GPU Configuration Detected:

GPU Memory: 8.00 GB
Configuration Tier: tier3
Max Duration (with LM): 480s (8 min)
Max Duration (without LM): 600s (10 min)
Max Batch Size (with LM): 2
Max Batch Size (without LM): 2
Default LM Init: True
Available LM Models: ['acestep-5Hz-lm-0.6B']

Auto-enabling CPU offload (GPU 8.0GB < 20.0GB threshold)

...

2026-03-01 21:53:46.705 | INFO | acestep.core.generation.handler.init_service_loader:_load_main_model_from_checkpoint:55 - [initialize_service] Attempting to load model with attention implementation: flash_attention_2
torch_dtype is deprecated! Use dtype instead!
2026-03-01 21:53:51.651 | INFO | acestep.llm_inference:initialize:548 - loading 5Hz LM tokenizer... it may take 80~90s
2026-03-01 21:54:30.346 | INFO | acestep.llm_inference:initialize:552 - 5Hz LM tokenizer loaded successfully in 38.70 seconds
2026-03-01 21:54:30.347 | INFO | acestep.llm_inference:initialize:557 - Initializing constrained decoding processor...
2026-03-01 21:54:30.353 | INFO | acestep.llm_inference:initialize:563 - Setting constrained decoding max_duration to 480s based on GPU config (tier: tier3)
2026-03-01 21:54:35.727 | DEBUG | acestep.constrained_logits_processor:_precompute_audio_code_tokens:577 - Found 1535 audio code tokens with values outside valid range [0, 63999]
2026-03-01 21:54:43.734 | INFO | acestep.llm_inference:initialize:571 - Constrained processor initialized in 13.38 seconds
2026-03-01 21:54:43.734 | INFO | acestep.gpu_config:get_gpu_memory_gb:390 - CUDA GPU detected: NVIDIA GeForce RTX 3070 (8.0 GB)
2026-03-01 21:54:43.783 | INFO | acestep.gpu_config:get_lm_gpu_memory_ratio:769 - [get_lm_gpu_memory_ratio] model=0.6B, free=6.89GB, current_usage=1.11GB, lm_target=2.10GB, usable_for_lm=2.10GB, ratio=0.401
2026-03-01 21:54:43.784 | INFO | acestep.llm_inference:get_gpu_memory_utilization:172 - Adaptive LM memory allocation: model=С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B, target=1.2GB, ratio=0.401, total_gpu=8.0GB
2026-03-01 21:54:43.785 | INFO | acestep.llm_inference:_initialize_5hz_lm_vllm:697 - Initializing 5Hz LM with model: С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B, enforce_eager: False, tensor_parallel_size: 1, max_model_len: 2048, gpu_memory_utilization: 0.401
[nanovllm] KV cache allocated: 71 blocks × 256 tokens = 18176 tokens capacity, 1.94 GB (free: 5.56 GB, used: 1.26 GB, target: 3.21 GB, block: 28.00 MB, post_kv_free: 3.62 GB)
2026-03-01 21:55:21.609 | INFO | acestep.llm_inference:_initialize_5hz_lm_vllm:707 - 5Hz LM initialized successfully in 37.82 seconds
2026-03-01 21:55:21.609 | INFO | acestep.llm_inference:initialize:639 - 5Hz LM status message: ✅ 5Hz LM initialized successfully
Model: С:\ACE-Step-1.5\checkpoints\acestep-5Hz-lm-0.6B
Device: NVIDIA GeForce RTX 3070
GPU Memory Utilization: 0.401
Low GPU Memory Mode: True
2026-03-01 21:56:55.471 | INFO | acestep.inference:generate_music:404 - [generate_music] LLM usage decision: thinking=True, use_cot_caption=False, use_cot_language=True, use_cot_metas=True, need_lm_for_cot=True, llm_initialized=True, use_lm=True
2026-03-01 21:56:55.471 | INFO | acestep.inference:generate_music:462 - LM chunk 1/1 (infer_type=llm_dit) (size: 2, seeds: [2122992989, 3194684511])
2026-03-01 21:56:55.473 | INFO | acestep.llm_inference:generate_with_stop_condition:1210 - Batch Phase 1: Generating CoT metadata (once for all items)...
2026-03-01 21:56:55.512 | INFO | acestep.llm_inference:generate_with_stop_condition:1218 - generate_with_stop_condition: formatted_prompt=<|im_start|>system

...

Generating: 100%|███████████████████████████████████| 1/1 [00:04<00:00, 4.05s/steps, Prefill=352tok/s, Decode=69tok/s]
2026-03-01 21:56:59.565 | DEBUG | acestep.llm_inference:parse_lm_output:2556 - Debug output text:
bpm: 30
duration: 306
keyscale: D major
language: unknown
timesignature: 4
<|im_end|>
2026-03-01 21:56:59.566 | INFO | acestep.llm_inference:generate_with_stop_condition:1258 - Batch Phase 1 completed in 4.09s. Generated metadata: ['bpm', 'duration', 'keyscale', 'language', 'timesignature']
2026-03-01 21:56:59.567 | INFO | acestep.llm_inference:generate_with_stop_condition:1301 - Batch Phase 2: Generating audio codes for 2 items...
2026-03-01 21:56:59.574 | INFO | acestep.llm_inference:generate_with_stop_condition:1311 - generate_with_stop_condition: formatted_prompt_with_cot=<|im_start|>system

...

Generating: 100%|██████████████████████████████████| 2/2 [00:38<00:00, 19.49s/steps, Prefill=2916tok/s, Decode=21tok/s]
2026-03-01 21:57:38.568 | DEBUG | acestep.llm_inference:parse_lm_output:2556 - Debug output text: ...<|im_end|>
2026-03-01 21:57:38.573 | DEBUG | acestep.llm_inference:parse_lm_output:2556 - Debug output text: ...<|im_end|>
2026-03-01 21:57:38.578 | INFO | acestep.llm_inference:generate_with_stop_condition:1403 - Batch Phase 2 completed in 39.01s. Generated codes: [1524, 1524]
2026-03-01 21:57:38.578 | INFO | acestep.core.generation.handler.generate_music:generate_music:92 - [generate_music] Starting generation...
2026-03-01 21:57:38.578 | INFO | acestep.core.generation.handler.generate_music:generate_music:95 - [generate_music] Preparing inputs...
2026-03-01 21:57:38.582 | DEBUG | acestep.core.generation.handler.memory_utils:_vram_guard_reduce_batch:124 - [VRAM guard] offload_to_cpu=True, batch_size=2 <= tier limit 2 — skipping dynamic VRAM check
2026-03-01 21:57:38.748 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading vae to cuda
2026-03-01 21:57:39.371 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded vae to cuda in 0.6225s
2026-03-01 21:57:39.371 | INFO | acestep.core.generation.handler.conditioning_target:_prepare_target_latents_and_wavs:41 - [generate_music] Decoding audio codes for item 0...
2026-03-01 21:57:39.376 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading model to cuda
loss_type=None was set in the config but it is unrecognized. Using the default loss: ForCausalLMLoss.
2026-03-01 21:57:45.510 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded model to cuda in 6.1342s
2026-03-01 21:59:30.237 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading model to CPU
2026-03-01 21:59:39.467 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded model to CPU in 9.2302s
2026-03-01 21:59:39.479 | INFO | acestep.core.generation.handler.conditioning_target:_prepare_target_latents_and_wavs:41 - [generate_music] Decoding audio codes for item 1...
2026-03-01 21:59:39.481 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading model to cuda
2026-03-01 21:59:42.719 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded model to cuda in 3.2373s
2026-03-01 21:59:42.824 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading model to CPU
2026-03-01 22:00:49.806 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded model to CPU in 66.9815s
2026-03-01 22:00:49.818 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading vae to CPU
2026-03-01 22:00:50.512 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded vae to CPU in 0.6944s
2026-03-01 22:00:50.585 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_precomputed_lm_hints:31 - [generate_music] Decoding audio codes for LM hints for item 0...
2026-03-01 22:00:50.587 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading model to cuda
2026-03-01 22:00:53.662 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded model to cuda in 3.0746s
2026-03-01 22:00:53.816 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading model to CPU
2026-03-01 22:01:05.405 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded model to CPU in 11.5886s
2026-03-01 22:01:05.406 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_precomputed_lm_hints:31 - [generate_music] Decoding audio codes for LM hints for item 1...
2026-03-01 22:01:05.410 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading model to cuda
2026-03-01 22:01:08.710 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded model to cuda in 3.3007s
2026-03-01 22:01:08.808 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading model to CPU
2026-03-01 22:01:20.226 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded model to CPU in 11.4177s
2026-03-01 22:01:20.227 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_text_conditioning_inputs:85 -

2026-03-01 22:01:20.229 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_text_conditioning_inputs:86 - 🔍 [DEBUG] DiT TEXT ENCODER INPUT (Inference)
2026-03-01 22:01:20.230 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_text_conditioning_inputs:87 - ======================================================================
2026-03-01 22:01:20.231 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_text_conditioning_inputs:88 - text_prompt:

...

2026-03-01 22:01:20.233 | INFO | acestep.core.generation.handler.conditioning_text:_prepare_text_conditioning_inputs:91 - ======================================================================

2026-03-01 22:01:20.470 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading vae to cuda
2026-03-01 22:01:21.254 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded vae to cuda in 0.7832s
2026-03-01 22:01:21.436 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading vae to CPU
2026-03-01 22:01:22.119 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded vae to CPU in 0.6824s
2026-03-01 22:01:22.120 | INFO | acestep.core.generation.handler.conditioning_embed:preprocess_batch:110 - [preprocess_batch] Inferring prompt embeddings...
2026-03-01 22:01:22.121 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading text_encoder to cuda
2026-03-01 22:01:23.082 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded text_encoder to cuda in 0.9606s
2026-03-01 22:01:23.631 | INFO | acestep.core.generation.handler.conditioning_embed:preprocess_batch:113 - [preprocess_batch] Inferring lyric embeddings...
2026-03-01 22:01:23.631 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading text_encoder to CPU
2026-03-01 22:01:24.329 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded text_encoder to CPU in 0.6966s
2026-03-01 22:01:24.330 | INFO | acestep.core.generation.handler.service_generate_execute:_execute_service_generate_diffusion:120 - [service_generate] Generating audio... (DiT backend: PyTorch (cuda))
2026-03-01 22:01:24.332 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading model to cuda
2026-03-01 22:01:27.523 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded model to cuda in 3.1904s
Using precomputed LM hints
2026-03-01 22:01:34.238 | INFO | acestep.core.generation.handler.service_generate_execute:_execute_service_generate_diffusion:200 - [service_generate] DiT diffusion via PyTorch (cuda)...
Using precomputed LM hints
2026-03-01 22:06:39.455 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading model to CPU
2026-03-01 22:07:36.767 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded model to CPU in 57.3122s
2026-03-01 22:07:36.768 | INFO | acestep.core.generation.handler.generate_music_decode:_prepare_generate_music_decode_state:41 - [generate_music] Model generation completed. Decoding latents...
2026-03-01 22:07:36.770 | DEBUG | acestep.core.generation.handler.generate_music_decode:_prepare_generate_music_decode_state:63 - [generate_music] pred_latents: torch.Size([2, 7620, 64]), dtype=torch.bfloat16
2026-03-01 22:07:36.770 | DEBUG | acestep.core.generation.handler.generate_music_decode:_prepare_generate_music_decode_state:64 - [generate_music] time_costs: {'encoder_time_cost': 6.773648500442505, 'diffusion_time_cost': 298.44265508651733, 'diffusion_per_step_time_cost': 37.30533188581467, 'total_time_cost': 305.21630358695984, 'offload_time_cost': 179.9070565700531}
2026-03-01 22:07:36.873 | INFO | acestep.core.generation.handler.generate_music_decode:_decode_generate_music_pred_latents:118 - [generate_music] Decoding latents with VAE...
2026-03-01 22:07:36.874 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:39 - [_load_model_context] Loading vae to cuda
2026-03-01 22:07:37.719 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:52 - [_load_model_context] Loaded vae to cuda in 0.8438s
2026-03-01 22:07:37.724 | DEBUG | acestep.core.generation.handler.generate_music_decode:_decode_generate_music_pred_latents:127 - [generate_music] Before VAE decode: allocated=3.55GB, max=8.61GB
2026-03-01 22:07:37.725 | INFO | acestep.core.generation.handler.generate_music_decode:_decode_generate_music_pred_latents:145 - [generate_music] Effective free VRAM before VAE decode: 2.93 GB
2026-03-01 22:07:37.725 | INFO | acestep.core.generation.handler.generate_music_decode:_decode_generate_music_pred_latents:163 - [generate_music] Using tiled VAE decode to reduce VRAM usage...
2026-03-01 22:07:37.726 | DEBUG | acestep.core.generation.handler.memory_utils:_get_auto_decode_chunk_size:75 - [_get_auto_decode_chunk_size] Effective free VRAM: 2.93 GB
2026-03-01 22:07:37.726 | DEBUG | acestep.core.generation.handler.memory_utils:_should_offload_wav_to_cpu:98 - [_should_offload_wav_to_cpu] Effective free VRAM: 2.93 GB
2026-03-01 22:07:37.727 | INFO | acestep.core.generation.handler.vae_decode:tiled_decode:56 - [tiled_decode] chunk_size=128, offload_wav_to_cpu=True, latents_shape=torch.Size([2, 64, 7620])
2026-03-01 22:07:37.727 | INFO | acestep.core.generation.handler.vae_decode_chunks:_tiled_decode_inner:19 - [tiled_decode] Batch size 2 > 1; decoding samples sequentially to save VRAM
2026-03-01 22:07:37.728 | WARNING | acestep.core.generation.handler.vae_decode_chunks:_tiled_decode_inner:35 - [tiled_decode] Reduced overlap from 64 to 32 for chunk_size=128
Decoding audio chunks: 100%|██████████████████████████████████████████████████████| 119/119 [00:09<00:00, 13.07steps/s]
2026-03-01 22:07:47.814 | WARNING | acestep.core.generation.handler.vae_decode_chunks:_tiled_decode_inner:35 - [tiled_decode] Reduced overlap from 64 to 32 for chunk_size=128
Decoding audio chunks: 100%|██████████████████████████████████████████████████████| 119/119 [00:08<00:00, 13.95steps/s]
2026-03-01 22:07:56.461 | DEBUG | acestep.core.generation.handler.generate_music_decode:_decode_generate_music_pred_latents:185 - [generate_music] After VAE decode: allocated=3.70GB, max=8.61GB
2026-03-01 22:07:56.749 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:57 - [_load_model_context] Offloading vae to CPU
2026-03-01 22:07:57.301 | INFO | acestep.core.generation.handler.init_service_offload_context:_load_model_context:67 - [_load_model_context] Offloaded vae to CPU in 0.5521s
2026-03-01 22:07:57.302 | INFO | acestep.core.generation.handler.generate_music_payload:_build_generate_music_success_payload:35 - [generate_music] VAE decode completed. Preparing audio tensors...
2026-03-01 22:07:57.303 | INFO | acestep.core.generation.handler.generate_music_payload:_build_generate_music_success_payload:45 - [generate_music] Done! Generated 2 audio tensors.
2026-03-01 22:07:57.409 | INFO | acestep.inference:generate_music:677 - [Normalization] Audio 0 BEFORE: Peak=1.0000, Target=-1dB
2026-03-01 22:07:57.514 | INFO | acestep.inference:generate_music:682 - [Normalization] Audio 0 AFTER: Peak=0.8913
2026-03-01 22:07:57.559 | INFO | acestep.inference:generate_music:677 - [Normalization] Audio 1 BEFORE: Peak=1.0000, Target=-1dB
2026-03-01 22:07:57.655 | INFO | acestep.inference:generate_music:682 - [Normalization] Audio 1 AFTER: Peak=0.8913
2026-03-01 22:08:00.300 | DEBUG | acestep.audio_utils:save_audio:190 - [AudioSaver] Saved audio to С:\ACE-Step-1.5\gradio_outputs*.flac (flac, 48000Hz)
2026-03-01 22:08:03.289 | DEBUG | acestep.audio_utils:save_audio:190 - [AudioSaver] Saved audio to С:\ACE-Step-1.5\gradio_outputs*.flac (flac, 48000Hz)
2026-03-01 22:08:03.392 | INFO | acestep.ui.gradio.events.results.generation_progress:generate_with_progress:337 - [generate_with_progress] Audio 1 path: С:/ACE-Step-1.5/gradio_outputs/.flac
2026-03-01 22:08:03.393 | INFO | acestep.ui.gradio.events.results.generation_progress:generate_with_progress:337 - [generate_with_progress] Audio 2 path: С:/ACE-Step-1.5/gradio_outputs/.flac
2026-03-01 22:08:03.459 | INFO | acestep.ui.gradio.events.results.batch_management_wrapper:generate_with_batch_management:157 - [generate_with_batch_management] Final yield: 46 core + 9 state
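
To show where the time goes on the working commit, the load/offload durations in the log above can be summed and compared with the reported `offload_time_cost`. This is a rough check on the logged values only. The grouping into lists is mine, and the final VAE load/offload is excluded on the assumption that it is not counted, since it happens after `time_costs` is printed.

```python
# Durations (seconds) copied from the "_load_model_context" lines in the
# b811368 log, up to the point where time_costs is reported.
model_loads = [6.1342, 3.2373, 3.0746, 3.3007, 3.1904]
model_offloads = [9.2302, 66.9815, 11.5886, 11.4177, 57.3122]
vae_moves = [0.6225, 0.6944, 0.7832, 0.6824]
text_encoder_moves = [0.9606, 0.6966]

moved = (
    sum(model_loads) + sum(model_offloads)
    + sum(vae_moves) + sum(text_encoder_moves)
)
reported_offload = 179.9070565700531  # offload_time_cost from the log

# Diffusion timing from the same time_costs entry.
diffusion = 298.44265508651733
per_step = 37.30533188581467
steps = round(diffusion / per_step)

print(f"summed transfers:     {moved:.2f} s (reported: {reported_offload:.2f} s)")
print(f"model offloads alone: {sum(model_offloads):.2f} s")
print(f"diffusion steps:      {steps} at {per_step:.1f} s each")
```

The sum matches the reported value. Most of it is the main model being moved back to CPU, which takes roughly 157 s across five offloads, on top of about 37 s per diffusion step.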
