Pre-Trained Models and PaSST ensemble predictions
In this release, we upload pre-trained models as well as the ensembled PaSST logits we used for Knowledge Distillation.
- passt_enemble_logits_mAP_495.npy: ensembled logits of 9 different PaSST models on AudioSet; the ensemble achieves a mAP of .495 (see the loading sketch after this list)
- mn<width_mult>_<dataset>: denotes the width_mult used to scale the width of MobileNetV3 and the dataset the model was trained on ('as' stands for AudioSet); see the README for further details
- dymn<width_mult>_<dataset>: denotes the width_mult used to scale the width of a dynamic MobileNetV3 and the dataset the model was trained on ('as' stands for AudioSet); see the README for further details
- fc denotes that the model is trained with a fully-convolutional head
- s<num,num,num,num> denotes models trained with reduced strides; default: 2222
- no_im_pre: no ImageNet pre-training before training on AudioSet
- hop denotes the time resolution of spectrograms that the model is trained on (hop size in milliseconds)
- mels denotes the number of mel bins (frequency resolution of spectrograms) that the model is trained on
- Default: hop=10ms, mels=128 bands
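The logits file can serve directly as soft targets for Knowledge Distillation. Below is a minimal sketch of loading and inspecting it with NumPy; the exact array layout (one logit vector per AudioSet clip over the 527 classes) is an assumption and should be verified against the file itself:

```python
import numpy as np

# Load the ensembled PaSST logits shipped with this release.
# Assumed layout: one logit vector per AudioSet clip, with the
# last dimension covering the 527 AudioSet classes.
logits = np.load("passt_enemble_logits_mAP_495.npy")
print(logits.shape, logits.dtype)

# For multi-label AudioSet, soft teacher targets are typically
# obtained by applying a sigmoid to the logits.
teacher_probs = 1.0 / (1.0 + np.exp(-logits))
```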
Models are downloaded automatically when the argument pretrained_name is set to one of the model names above, as in the sketch below.
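A minimal loading sketch; the module path and the exact get_model signature are assumptions based on the repository layout and should be checked against the README:

```python
import torch
from models.mn.model import get_model  # assumed module path; see the README

# "mn10_as": MobileNetV3 with width_mult=1.0 trained on AudioSet ('as'),
# following the naming scheme above. Setting pretrained_name triggers
# the automatic checkpoint download.
model = get_model(width_mult=1.0, pretrained_name="mn10_as")
model.eval()

# Dummy forward pass on a log-mel spectrogram batch
# (assumed input layout: batch x 1 x mel_bins x time_frames,
#  matching the defaults hop=10ms and mels=128).
with torch.no_grad():
    spectrogram = torch.randn(1, 1, 128, 1000)
    output = model(spectrogram)
```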