This repository contains the PyTorch code for our paper "Rethinking CNN Models for Audio Classification". The experiments are conducted on three datasets: ESC-50, UrbanSound8K, and GTZAN.
Preprocessing is run as a separate step, before training, so that the spectrograms are computed once and training time is not spent on it.
For ESC-50:
python preprocessing/preprocessingESC.py --csv_file ~/Downloads/ESC-50-master/meta/esc50.csv --data_dir ~/Downloads/ESC-50-master/audio --store_dir ~/Downloads/ESC-50-master/spectrograms
For UrbanSound8K:
python preprocessing/preprocessingUSC.py --csv_file ~/Downloads/ESC-50-master/meta/esc50.csv --data_dir ~/Downloads/ESC-50-master/data --store_dir ~/Downloads/ESC-50-master/spectrograms
For GTZAN:
python preprocessing/preprocessingGTZAN.py --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/ --sampling_rate 22050
The configurations for training the models are provided in the config folder. The sample_config.json file explains each variable used in the configurations. The command for training is:
python train.py --config_path config/esc_resnet.json
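To illustrate how a `--config_path` argument and a JSON configuration typically fit together, here is a minimal sketch using only the Python standard library. It is not the repository's train.py: the helper names `parse_args` and `load_config` and the key `hypothetical_key` are assumptions made for illustration, and the real variable names are the ones documented in sample_config.json.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv=None):
    """Parse the command line; only --config_path is taken from the README."""
    parser = argparse.ArgumentParser(description="Load a JSON training config.")
    parser.add_argument(
        "--config_path",
        type=str,
        required=True,
        help="Path to a JSON configuration file.",
    )
    return parser.parse_args(argv)


def load_config(config_path):
    """Read the JSON file at config_path and return its contents as a dict."""
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Demo with a throwaway file so the sketch runs without the repository.
    # "hypothetical_key" is a placeholder, not a real configuration variable.
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_path = os.path.join(tmp_dir, "demo_config.json")
        with open(demo_path, "w", encoding="utf-8") as f:
            json.dump({"hypothetical_key": 1}, f)

        args = parse_args(["--config_path", demo_path])
        config = load_config(args.config_path)
        print(config)
```

Keeping every setting in one JSON file means an experiment can be repeated by passing the same file again, rather than by re-typing a long list of command-line flags.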