The FTB layer have an ability that denoising audio? #34

AnkakeYakisobaTsuyudaku · 2024-12-24T02:55:45Z

I have a question about the encoder layer.
I think the FTB block in the encoder layer denoises the audio signal, judging from the encoder-decoder layers and the references. Is this correct?

pokepress · 2024-12-27T04:53:51Z

If you're asking whether the algorithm can be used to remove noise from a signal, then based on my experience in my fork, where I've been using it to process broadcast radio audio, yes, it can do that. That said, there are a lot of different kinds of noise (in terms of sources and characteristics), so could you be more specific as to which one(s) you're talking about?

AnkakeYakisobaTsuyudaku · 2025-01-10T14:45:39Z

Thanks for your reply, and sorry for my late response. I think the degradation pattern of historical recordings is partly similar to that of AM radio sound (including the noise). I don't know how radio sound is degraded, but historical audio is noisy (in particular, it contains click noise) and has a narrow bandwidth (about 100 Hz to 3 kHz). By the way, I have a question about the FTB block in this program. In the original network (not your modified version), the default value of the boolean `freq_attn` (in aero.py) is false, so judging from the code (lines 33 and 93) the FTB is probably not active. Is this correct?
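For readers following the question, here is a minimal sketch of the general pattern being asked about: a boolean constructor flag that decides whether an optional block is built and applied. It is not taken from aero.py. The class names `FTB` and `EncoderLayer`, the `forward` method, and the placeholder transform are all hypothetical, and only the flag name `freq_attn` and its default of false come from the comment above.

```python
class FTB:
    """Hypothetical stand-in for a frequency-transformation block."""

    def __call__(self, x):
        # Placeholder transform for illustration only, not the real FTB.
        return [0.5 * v for v in x]


class EncoderLayer:
    """Hypothetical layer whose optional block is gated by a flag."""

    def __init__(self, freq_attn=False):
        # The block is only constructed when the flag is switched on.
        self.ftb = FTB() if freq_attn else None

    def forward(self, x):
        # With the default flag, the input passes through unchanged.
        if self.ftb is not None:
            x = self.ftb(x)
        return x


default_out = EncoderLayer().forward([1.0, 2.0])
enabled_out = EncoderLayer(freq_attn=True).forward([1.0, 2.0])
```

If the real code follows this pattern, a default of false would mean the block never runs unless the flag is overridden in the configuration.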

AnkakeYakisobaTsuyudaku · 2025-01-15T14:35:24Z

I read the code and the README in your fork of this repository. As I explained in my previous comment, I think the degradation characteristics of historical audio are similar to those of AM radio sound. According to the README, you succeeded in denoising, especially the distorted noise in the low-frequency band, and in extending the bandwidth up to 16 kHz. How did you preprocess your training data?
Thanks for reading this comment.

pokepress · 2025-01-16T03:53:20Z

If you're asking what I did to generate the test data for radio, I bought some personal AM/FM transmitters, then used them to transmit audio to a variety of radios. I captured both the raw and radio audio to the same recording device simultaneously (this keeps them closely synchronized so you can align them to the sample later and not worry about them drifting apart). I got the actual source audio from the Free Music Archive, Project Gutenberg, and some self-produced audio, where I added a set of tones to the start of each track:

which I burned to a CD, and played the CD through a mixer. Here's a quick & dirty diagram:
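As an aside on the sample-level alignment mentioned in this comment (this is not the diagram referred to above), one common way to find the offset between two simultaneous captures is to cross-correlate them around the reference tones. The sketch below is an assumption about how that could be done, not the method actually used here. The function name `best_lag`, the 8 kHz sample rate, the 440 Hz burst, and the 37-sample delay are all made up for the demonstration.

```python
import math


def best_lag(reference, captured, max_lag):
    """Return the delay of `captured` relative to `reference`, in samples,
    that maximises their cross-correlation over lags 0..max_lag."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(captured) - lag)
        score = sum(reference[i] * captured[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best


sample_rate = 8000  # assumed value for the demonstration
# A short tone burst followed by silence stands in for the reference tones.
tone = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(400)]
reference = tone + [0.0] * 200
# Simulated radio capture: the same signal, delayed and quieter.
captured = [0.0] * 37 + [0.8 * s for s in reference]

lag = best_lag(reference, captured, max_lag=100)
aligned = captured[lag:lag + len(reference)]
```

For real recordings, a vectorised or FFT-based correlation would be far faster than this pure-Python loop.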

AnkakeYakisobaTsuyudaku · 2025-01-16T05:19:44Z

Thank you for explaining the preprocessing. I understand the method.
But I have another question about restoring AM-like audio.
Audio of this type must have had its dynamic range compressed, because high-pitched sounds (e.g., clarinet) are audible.
I think a non-linear effect (in this case, compression) is impossible, or at least difficult, to undo completely (in this case, to restore the dynamics).
How did you train (e.g., training parameters, inserting different layers or processing steps)?
I'm going to vary the intensity of the compression at different stages of training: in the first phase the audio is compressed weakly, and after that the compression is made stronger. (There is no theoretical reason for this, though.)
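The staged-compression idea above could be sketched as follows. This is only an illustration of the plan as described, under several assumptions: the compressor is a static hard-knee model without attack or release, the ratio is increased linearly over training, and the names `compress` and `ratio_for_epoch` as well as the threshold and ratio values are hypothetical.

```python
import math


def compress(samples, threshold_db, ratio):
    """Static hard-knee compressor applied per sample (no attack/release)."""
    out = []
    for s in samples:
        level_db = 20 * math.log10(max(abs(s), 1e-9))
        if level_db > threshold_db:
            # Above the threshold, output rises 1/ratio dB per input dB.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            s = math.copysign(10 ** (target_db / 20), s)
        out.append(s)
    return out


def ratio_for_epoch(epoch, total_epochs, start=1.5, end=6.0):
    """Linearly increase the compression ratio from `start` to `end`."""
    t = epoch / max(total_epochs - 1, 1)
    return start + t * (end - start)


# Weak compression in the first epoch, strong compression in the last.
clean = [0.01, 0.5, -1.0]
early = compress(clean, threshold_db=-20.0, ratio=ratio_for_epoch(0, 10))
late = compress(clean, threshold_db=-20.0, ratio=ratio_for_epoch(9, 10))
```

In a training loop, the degraded input for each epoch would be produced with that epoch's ratio while the clean signal stays as the target.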

pokepress · 2025-01-17T02:07:39Z

My project wasn't really designed to tackle dynamic range compression. The goal was really to undo just the fidelity reduction of broadcasting itself, rather than the extensive processing (EQ, compression, etc.) radio stations typically apply before transmitting the audio. That said, the FM model does seem to expand the dynamic range somewhat, so I'm guessing the FM transmitters I used do apply some dynamic range changes. As far as matching the volume levels, I used the second set of tones (880 Hz) in the waveform shown above to align the volumes of the radio and GT versions of the audio.
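A minimal sketch of tone-based level matching, assuming the two recordings are already sample-aligned and the position of the 880 Hz tone is known. The RMS-ratio gain, the function names `rms` and `match_level`, the 8 kHz sample rate, and the simulated 0.25 attenuation are assumptions for illustration, not details taken from the fork.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_level(radio, ground_truth, tone_start, tone_end):
    """Scale `radio` so that its reference-tone RMS equals the RMS of the
    same region in `ground_truth`. Returns the scaled signal and the gain."""
    gain = rms(ground_truth[tone_start:tone_end]) / rms(radio[tone_start:tone_end])
    return [gain * s for s in radio], gain


sample_rate = 8000  # assumed value for the demonstration
tone = [math.sin(2 * math.pi * 880 * i / sample_rate) for i in range(800)]
ground_truth = tone
radio = [0.25 * s for s in tone]  # simulated quieter radio capture

matched, gain = match_level(radio, ground_truth, 0, 800)
```

Measuring the gain on a steady tone rather than on programme material keeps the estimate independent of whatever the broadcast chain does to the music or speech itself.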

AnkakeYakisobaTsuyudaku · 2025-01-17T14:05:13Z

Thank you for replying. I understand. Anyway, I'm going to use your checkpoint to restore AM radio sound. Thank you!

AnkakeYakisobaTsuyudaku changed the title ~~The length of output data is different from input data using predict.py~~ The FTB layer have an ability that denoising audio? Dec 24, 2024
