Light-Weight Deep Learning Models for Acoustic Scene Classification Using Teacher-Student Scheme and Multiple Spectrograms

Lam Pham (Speaker), Tin Nguyen, Phat Lam, Dat Ngo, Anahid Naghibzadeh-Jalali, Alexander Schindler

Research output: Chapter in Book or Conference Proceedings › Conference Proceedings with Oral Presentation › peer-review

Abstract

In this paper, we present a light-weight deep-learning-based system for acoustic scene classification (ASC) that is intended to be integrated into an Internet of Sound (IoS) system with limited hardware resources. To obtain a light-weight ASC model, we develop a teacher-student deep learning scheme with a two-phase training strategy. In the first phase (Phase I), we propose a Teacher network architecture with a large model footprint. After the Teacher is trained, its embeddings, which are feature maps of the Teacher, are extracted. In the second phase (Phase II), we propose Students, which are light-weight network architectures, and train them by leveraging the embeddings extracted from the Teacher. To further improve accuracy, we apply an ensemble of multiple spectrograms to both the Teacher and the Students. In our experiments on the DCASE 2023 Task 1 dataset with ten target classes (‘Airport’, ‘Bus’, ‘Metro’, ‘Metro station’, ‘Park’, ‘Public square’, ‘Shopping mall’, ‘Street pedestrian’, ‘Street traffic’, ‘Tram’), the best Student achieves an accuracy of 57.4% on the Development set and 55.6% on the blind Evaluation set, improving on the DCASE baseline by 14.5% and 10.8%, respectively. The best Student also achieves 82.3% on the Development set with three target classes (‘Indoor’, ‘Outdoor’, and ‘Transportation’). It is a light-weight model, requiring 88.7 KB of memory and 29.27 M MACs, which makes it potentially applicable to a wide range of edge devices.
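
The two ideas in the abstract, training a Student against Teacher embeddings and combining several spectrogram inputs, can be illustrated with a minimal pure-Python sketch. The abstract does not state the loss function or the fusion rule, so everything here is an assumption made for illustration: the Student objective is taken to be a weighted sum of cross-entropy and the mean squared error between Student and Teacher embeddings, the ensemble is taken to be a plain average of class probabilities, and the names `student_loss`, `fuse_predictions`, and `alpha` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index `label`."""
    return -math.log(max(probs[label], 1e-12))


def embedding_mse(student_emb, teacher_emb):
    """Mean squared error between Student and Teacher embeddings."""
    squared = [(s - t) ** 2 for s, t in zip(student_emb, teacher_emb)]
    return sum(squared) / len(squared)


def student_loss(logits, label, student_emb, teacher_emb, alpha=0.5):
    """Assumed Phase II objective: classification loss plus an
    embedding-matching term. `alpha` is a hypothetical weight."""
    ce = cross_entropy(softmax(logits), label)
    mse = embedding_mse(student_emb, teacher_emb)
    return (1.0 - alpha) * ce + alpha * mse


def fuse_predictions(per_spectrogram_probs):
    """Assumed ensemble rule: average the class probabilities produced
    by the models fed with different spectrogram types."""
    count = len(per_spectrogram_probs)
    n_classes = len(per_spectrogram_probs[0])
    return [
        sum(probs[c] for probs in per_spectrogram_probs) / count
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Toy example with two classes and two spectrogram-specific models.
    loss = student_loss([1.0, 2.0], 1, [0.4, 0.6], [0.5, 0.5])
    fused = fuse_predictions([[0.2, 0.8], [0.6, 0.4]])
    print("student loss:", loss)
    print("fused probabilities:", fused)
```

In a real system the embeddings would be feature maps extracted from the trained Teacher in Phase I, and the per-spectrogram probabilities would come from models trained on each spectrogram type; the sketch only shows how such quantities could be combined.
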
Original language: English
Title of host publication: International Symposium on the Internet of Sounds
Pages: 1-8
ISBN (Electronic): 979-8-3503-8254-9
Publication status: Published - 2023
Event: 2023 4th International Symposium on the Internet of Sounds - Pisa, Pisa, Italy
Duration: 26 Oct 2023 - 27 Oct 2023

Conference

Conference: 2023 4th International Symposium on the Internet of Sounds
Country/Territory: Italy
City: Pisa
Period: 26/10/23 - 27/10/23

Research Field

  • Former Research Field - Data Science
