# RGT2

Gerard Roma, Owen Green, Pierre Alexandre Tremblay
University of Huddersfield
[email protected]

## Additional Info

- is_blind: no
- additional_training_data: no

## Supplementary Material

## Method

This system employs a Convolutional Neural Network with fully connected output layers. The input to the network is a slice of 11 STFT frames (about 200 ms). The output is a binary mask corresponding to one spectral frame. We trained the network by optimizing the negative log-likelihood loss from a 2D softmax output layer. The target vector was encoded with class labels corresponding to the source with the highest magnitude for each time-frequency bin.
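The sketch below illustrates this training scheme in PyTorch. The layer sizes, filter counts, FFT size, and four-source setup are assumptions for illustration, not the published configuration; it only shows the general shape of a CNN with fully connected output layers trained with a negative log-likelihood loss against per-bin labels given by the loudest source.

```python
# Minimal sketch, assuming an STFT magnitude input of 11 frames x 513 bins
# and 4 target sources. All hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_BINS = 513      # STFT bins (assumed, e.g. 1024-point FFT)
N_FRAMES = 11     # input slice of 11 STFT frames, as in the description
N_SOURCES = 4     # assumed number of sources (e.g. vocals/drums/bass/other)

class MaskCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional front end over the (frames x bins) magnitude patch.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((1, 2)),   # pool along the frequency axis only
        )
        # Fully connected output layers: one score per (source, bin) pair
        # for the centre frame of the input slice.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * N_FRAMES * (N_BINS // 2), 512), nn.ReLU(),
            nn.Linear(512, N_SOURCES * N_BINS),
        )

    def forward(self, x):
        # x: (batch, 1, N_FRAMES, N_BINS) magnitude slice
        h = self.fc(self.conv(x))
        # dim 1 indexes the competing sources for each frequency bin
        return h.view(-1, N_SOURCES, N_BINS)

def training_step(model, x, source_mags):
    """One step of the training scheme described above.

    source_mags: (batch, N_SOURCES, N_BINS) magnitudes of the isolated
    sources at the centre frame; the class label of each bin is the
    source with the highest magnitude there.
    """
    target = source_mags.argmax(dim=1)          # (batch, N_BINS)
    logits = model(x)                           # (batch, N_SOURCES, N_BINS)
    log_probs = F.log_softmax(logits, dim=1)    # softmax over sources per bin
    return F.nll_loss(log_probs, target)        # negative log-likelihood
```

At inference time, taking the per-bin argmax over the source dimension yields the binary mask for the centre frame, which can then be applied to the mixture spectrogram.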

## References

- G. Roma, O. Green, and P. A. Tremblay, "Improving single-network single-channel separation of musical audio with convolutional layers," Proceedings of LVA/ICA, 2018.