Audio normalization in train/val/test set #78

Hi @hkchengrex,

Congratulations on your great work! I really enjoyed reading your paper and running your well-organized codebase. One thing I noticed: in the config below, `normalize_audio` is `True` for the training set but `False` for the validation and test sets, so it seems that you're **normalizing the audio from the training set but not from the validation set**. I understand why this might be the case for the test set, since it's not used for generation/evaluation anyway, but could you clarify why the validation set is treated differently?

https://github.com/hkchengrex/MMAudio/blob/ec6ab44928dcca2df51e5894140ad140149c3aa3/training/extract_video_training_latents.py#L54-L75

	data_cfg = {
	    'example': {
	        'root': './training/example_videos',
	        'subset_name': './training/example_video.tsv',
	        'normalize_audio': True,
	    },
	    # 'train': {
	    #     'root': '../data/video',
	    #     'subset_name': './sets/vgg3-train.tsv',
	    #     'normalize_audio': True,
	    # },
	    # 'test': {
	    #     'root': '../data/video',
	    #     'subset_name': './sets/vgg3-test.tsv',
	    #     'normalize_audio': False,
	    # },
	    # 'val': {
	    #     'root': '../data/video',
	    #     'subset_name': './sets/vgg3-val.tsv',
	    #     'normalize_audio': False,
	    # },
	}
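
To make the concern concrete, here is a minimal sketch of the kind of mismatch I have in mind. It assumes `normalize_audio` means peak normalization; I have not confirmed that this is what the flag does in the repo. The helper name `peak_normalize` and the `0.95` target peak are hypothetical, chosen only for illustration.

```python
def peak_normalize(samples, target_peak=0.95, eps=1e-8):
    """Scale a waveform so its largest absolute sample equals target_peak.

    Hypothetical helper: an assumption about what a `normalize_audio`
    flag might do, not code taken from the MMAudio repository.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak < eps:
        # Silent (or empty) clip: return it unchanged to avoid dividing by ~0.
        return list(samples)
    scale = target_peak / peak
    return [s * scale for s in samples]


# Two toy clips with very different loudness.
quiet_clip = [0.01, -0.02, 0.005]
loud_clip = [0.5, -1.0, 0.25]

# With normalization (as for 'train'), both clips end up with the same peak.
train_like = [peak_normalize(c) for c in (quiet_clip, loud_clip)]

# Without normalization (as for 'val'), the original loudness gap remains.
val_like = [list(c) for c in (quiet_clip, loud_clip)]

train_peaks = [max(abs(s) for s in c) for c in train_like]
val_peaks = [max(abs(s) for s in c) for c in val_like]
print("train-like peaks:", train_peaks)
print("val-like peaks:", val_peaks)
```

If the flag works roughly like this, the training latents would come from clips that all share one peak level, while the validation latents would keep each clip's original level, which is the difference I'm asking about.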
