fix: can use other size #26

steven850 · 2024-06-05T11:08:08Z

You added this fix to allow other sizes. Which sizes will the system accept?
I can't seem to go above 640; anything larger fails. For example, 768 gives this error:
AssertionError: There are 0 faces in the 0-th frame. Only one face is supported.

tiankuan93 · 2024-06-05T12:12:45Z

You can try the following script. This is just a simple example; you should find a clearer picture and video yourself.

python scripts/extract_kps_sequence_and_audio.py \
    --video_path "./test_samples/short_case/AOC/gt.mp4" \
    --kps_sequence_save_path "./test_samples/short_case/AOC/kps_768.pth" \
    --audio_save_path "./test_samples/short_case/AOC/aud.mp3" \
    --height 768 \
    --width 768

python inference.py \
    --reference_image_path "./test_samples/short_case/AOC/ref_768.png" \
    --audio_path "./test_samples/short_case/AOC/aud.mp3" \
    --kps_path "./test_samples/short_case/AOC/kps_768.pth" \
    --output_path "./output/short_case/talk_AOC_no_retarget_768.mp4" \
    --retarget_strategy "no_retarget" \
    --num_inference_steps 20 \
    --image_width 768 \
    --image_height 768

ref_768.png

steven850 · 2024-06-05T13:03:56Z

That is what I ran, and that is what gives me the error if I go past 640 on the resolution.
Traceback (most recent call last):
File "Z:\vex\scripts\extract_kps_sequence_and_audio.py", line 38, in
assert len(faces) == 1, f'There are {len(faces)} faces in the {frame_idx}-th frame. Only one face is supported.'
AssertionError: There are 0 faces in the 0-th frame. Only one face is supported.

FurkanGozukara · 2024-06-06T00:14:10Z

I generated a video at 768x768 and it worked for me.

But my video has only one face.

#27

steven850 · 2024-06-06T07:23:52Z

I have tried hundreds of combinations now. 640 is the maximum that works; above that, it won't detect faces anymore. I'm using the included video, short_case/TYS, which has a single face. I made upscaled versions of the video at 1024, 1000, 768, 640, etc.

python scripts/extract_kps_sequence_and_audio.py --video_path "./test_samples/short_case/tys/gt768.mp4" --kps_sequence_save_path "./test_samples/short_case/tys/kps768.pth" --audio_save_path "./test_samples/short_case/tys/gt.mp3" --height 768 --width 768

I also tried selecting some of the other video sizes, such as 512, while specifying 768 to see what would happen. Same error.

If I select the 1024 video but don't request a higher resolution, it works. So the problem is tied explicitly to requesting a specific resolution.

FurkanGozukara · 2024-06-06T10:48:56Z

Here is a 768x768 video I made, not upscaled:

Biden_Photo_Big_result_0003.mp4

KMiNT21 · 2024-06-11T12:53:12Z

> You added this fix to be able to use other sizes, what sizes will the system accept? I cant seem to go above 640, anything above 640 gives this error 768 give this AssertionError: There are 0 faces in the 0-th frame. Only one face is supported.

The same issue happened to me. I just debugged this for hours. :)
What I found is that when we use a 1024x1024 image inside retinaface.py, the face detection model gets BAD RESULTS, with confidence less than the threshold of 0.5. So, some inputs can work, but some do not.

Why? I haven't had time to figure it out yet. But it looks like the problem is with incorrectly using the model_ckpts\insightface_models\models\buffalo_l\det_10g.onnx model, which is 640x640. And if we pass an image size larger than that, some incorrect resizing happens.
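To make the resizing question concrete, here is a minimal sketch of how a frame is commonly scaled to fit a fixed detector input while keeping its aspect ratio. This is a generic illustration, not code taken from insightface or retinaface.py; the function name `fit_to_detector` and its parameters are hypothetical.

```python
def fit_to_detector(frame_height, frame_width, det_height=640, det_width=640):
    """Scale a frame to fit inside the detector input, preserving aspect ratio.

    Returns (new_height, new_width, scale). Hypothetical helper, shown only
    to illustrate aspect-preserving resizing toward a 640x640 input.
    """
    # Use the smaller ratio so both dimensions fit inside the detector input.
    scale = min(det_height / frame_height, det_width / frame_width)
    new_height = round(frame_height * scale)
    new_width = round(frame_width * scale)
    return new_height, new_width, scale


if __name__ == "__main__":
    for h, w in [(512, 512), (768, 768), (1024, 1024)]:
        print((h, w), "->", fit_to_detector(h, w))
```

Detected boxes and keypoints would then need to be divided by `scale` to map them back to the original frame.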

FurkanGozukara · 2024-06-11T13:07:47Z

> You added this fix to be able to use other sizes, what sizes will the system accept? I cant seem to go above 640, anything above 640 gives this error 768 give this AssertionError: There are 0 faces in the 0-th frame. Only one face is supported.
>
> The same issue happened to me. I just debugged this for hours. :) What I found is that when we use a 1024x1024 image inside retinaface.py, the face detection model gets BAD RESULTS, with confidence less than the threshold of 0.5. So, some inputs can work, but some do not.
>
> Why? I haven't had time to figure it out yet. But it looks like the problem is with incorrectly using the model_ckpts\insightface_models\models\buffalo_l\det_10g.onnx model, which is 640x640. And if we pass an image size larger than that, some incorrect resizing happens.

Ah, that makes sense.

My input video was also 512x512.

KMiNT21 · 2024-06-12T14:04:55Z

So....

If we don't want to face "zero face" problems, there's an easy fix:

Change

app.prepare(ctx_id=0, det_size=(args.image_height, args.image_width))

to

app.prepare(ctx_id=0, det_size=(min(args.image_height, 640), min(args.image_width, 640)))

(and make the same change inside extract_kps...py).
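The same clamp can be pulled into a small helper so both scripts share it. This is only a sketch: `clamp_det_size` and `max_side` are hypothetical names, and the 640 limit is the value observed in this thread rather than a documented constant.

```python
def clamp_det_size(image_height, image_width, max_side=640):
    """Return a det_size tuple with each dimension capped at max_side.

    Hypothetical helper; the 640 default mirrors the limit reported above.
    """
    return (min(image_height, max_side), min(image_width, max_side))


# Assumed usage, where `app` is the already constructed face analysis object
# and `args` comes from the script's argument parser:
# app.prepare(ctx_id=0, det_size=clamp_det_size(args.image_height, args.image_width))

if __name__ == "__main__":
    print(clamp_det_size(768, 768))
    print(clamp_det_size(512, 512))
```

Sizes at or below 640 pass through unchanged, so existing 512x512 runs are not affected.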

However, there's not much use for it. Image sizes larger than 768 just produce garbage results. According to the V-Express paper at https://arxiv.org/pdf/2406.02511, the model was trained at 512x512 resolution. And, as far as I can understand, it uses the Stable Diffusion 1.5 model, which is also 512x512 (or 768?).

BUT! We have an extremely time-consuming way to increase quality by processing Video to Video with SUPIR-V0Q. It looks interesting.

And... there might be another way to test – use SUPIR for every Nth frame (dropping some frames) and perform frame interpolation. This way, the animation could be more accurate. But it's just an idea.
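A rough sketch of the bookkeeping for that idea: choose which frames would go through the expensive upscaler and which would be rebuilt from their neighbours. `plan_interpolation` is a hypothetical helper, and the linear weight is a stand-in for whatever frame interpolation method is actually used; nothing here calls SUPIR.

```python
def plan_interpolation(num_frames, step):
    """For each frame index, return (left_key, right_key, weight).

    Key frames are every `step`-th frame plus the last frame. A weight of
    0.0 means "use left_key as is", 1.0 means "use right_key as is", and
    anything in between marks a frame to be interpolated from the two keys.
    """
    if num_frames < 1 or step < 1:
        raise ValueError("num_frames and step must be positive")
    last = num_frames - 1
    plan = []
    for i in range(num_frames):
        left = (i // step) * step          # nearest key frame at or before i
        right = min(left + step, last)     # next key frame, capped at the end
        weight = 0.0 if right == left else (i - left) / (right - left)
        plan.append((left, right, weight))
    return plan


if __name__ == "__main__":
    for index, entry in enumerate(plan_interpolation(5, 2)):
        print(index, entry)
```

With `step=2`, half of the frames would need the slow pass; larger steps save more time but leave more motion for the interpolator to guess.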
