perf: preallocate tensor in semantic text generation #366

no2chem · 2023-06-21T17:10:46Z

This PR modifies generate_text_semantic so that it preallocates a tensor and fills it in place instead of calling torch.cat on every step, which causes extra allocations and overhead. It also removes the del line and leaves cleanup to the garbage collector, hopefully asynchronously.
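To illustrate the idea (this is a minimal sketch, not the actual diff from this PR), the two strategies can be compared side by side. The function names and `toy_next_token` are hypothetical stand-ins; the real loop calls the semantic model and can stop early on an end-of-sequence token.

```python
import torch


def generate_with_cat(prompt, next_token, max_new_tokens):
    """Baseline: grow the sequence with torch.cat, which allocates a new tensor every step."""
    x = prompt
    for _ in range(max_new_tokens):
        tok = next_token(x)  # shape (batch, 1)
        x = torch.cat((x, tok), dim=1)
    return x


def generate_preallocated(prompt, next_token, max_new_tokens):
    """Allocate the full-length buffer once, then write each new token into its slot."""
    batch, prompt_len = prompt.shape
    buf = torch.empty(
        (batch, prompt_len + max_new_tokens),
        dtype=prompt.dtype,
        device=prompt.device,
    )
    buf[:, :prompt_len] = prompt
    length = prompt_len
    for _ in range(max_new_tokens):
        tok = next_token(buf[:, :length])  # slicing returns a view, not a copy
        buf[:, length:length + 1] = tok
        length += 1
    # Return only the filled part, so an early exit from the loop would also work.
    return buf[:, :length]


def toy_next_token(x):
    # Hypothetical stand-in for the model: next token is (last token + 1) mod 100.
    return (x[:, -1:] + 1) % 100


prompt = torch.tensor([[1, 2, 3]])
out_cat = generate_with_cat(prompt, toy_next_token, 5)
out_pre = generate_preallocated(prompt, toy_next_token, 5)
```

Both functions produce the same tokens; the difference is that the preallocated version performs one allocation up front instead of one per generated token.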

Overall, on an H100, with some of the other patches, this lets me generate the example prompt slightly faster than realtime:

"Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe. "

On average, this prompt produces a 12-13s audio clip, and I can generate the clip in around 8-12s. On a good run that's approximately 130% of realtime.

I'd note that the performance of the semantic model seems to be especially bimodal: sometimes I get lucky and get > 270it/s, which takes 2s, and other times it's slow, does ~150it/s, and takes 4s. It'd be nice to eliminate this variance, though I wonder whether it has to do with the model.
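As a rough check on the figures above, the arithmetic can be written out. The helper names are made up for illustration, and the inputs are the approximate numbers quoted in this description, not new measurements.

```python
def realtime_factor(audio_seconds, generation_seconds):
    """Seconds of audio produced per second of wall-clock generation time."""
    return audio_seconds / generation_seconds


def total_steps(iterations_per_second, seconds):
    """Approximate number of semantic-model iterations in a run."""
    return iterations_per_second * seconds


# A 13s clip generated in 10s is 1.3x (130%) realtime.
factor = realtime_factor(13, 10)

# The fast and slow semantic runs imply a similar amount of work.
fast_steps = total_steps(270, 2)  # 540
slow_steps = total_steps(150, 4)  # 600
```

The fast and slow runs work out to a similar step count (about 540 vs. 600), which suggests the variance is in throughput rather than in how many tokens get generated.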


Ph0rk0z · 2023-07-05T11:58:52Z

I merged all these speedups and they seem to have helped. Bark still only uses 30% of my GPU, so there has to be some kind of bottleneck. I get 60-70 it/s on the first part with a 3090, but the second part, where I think it runs coarse and fine, still only does 1.xx it/s.

Mradr · 2023-07-10T15:07:13Z

Using the small model I can reach up to 140-150 it/s, but 270 on the large model seems crazy o.o I don't get anywhere near that even with this PR, though I'm using a 3090 as well rather than an H100.

no2chem force-pushed the preallocateTensor branch from e523c2e to 5456c7f · June 21, 2023 18:01

perf: preallocate tensor in semantic text generation to reduce allocations (4f36747)

no2chem force-pushed the preallocateTensor branch from 5456c7f to 4f36747 · June 21, 2023 18:32

no2chem mentioned this pull request Jun 22, 2023

perf: disable kvcache for semantic by default #368

Closed
