Light-Weight Deep Learning Models for Acoustic Scene Classification Using Teacher-Student Scheme and Multiple Spectrograms

Lam Pham (Speaker), Tin Nguyen, Phat Lam, Dat Ngo, Anahid Naghibzadeh-Jalali, Alexander Schindler

Publication: Contribution to book or conference proceedings, Conference talk with contribution to proceedings, Peer-reviewed

Abstract

In this paper, we present a light-weight deep-learning-based system for acoustic scene classification (ASC), which is aimed at integration into an Internet of Sound (IoS) system with limited hardware resources. To obtain a light-weight ASC model, we develop a teacher-student deep learning scheme with a two-phase training strategy. In the first phase (Phase I), we propose a Teacher network architecture with a large model footprint. After training the Teacher, we extract the embeddings, which are the feature map of the Teacher. In the second phase (Phase II), we propose Students, which are light-weight network architectures, and train them by leveraging the embeddings extracted from the Teacher. To further improve accuracy, we apply an ensemble of multiple spectrograms to both the Teacher and the Students. Our experiments on the DCASE 2023 Task 1 dataset with ten target classes (‘Airport’, ‘Bus’, ‘Metro’, ‘Metro station’, ‘Park’, ‘Public square’, ‘Shopping mall’, ‘Street pedestrian’, ‘Street traffic’, ‘Tram’) yield a best Student with an accuracy of 57.4% on the Development set and 55.6% on the blind Evaluation set, improving on the DCASE baseline by 14.5% and 10.8%, respectively. The best Student also achieves 82.3% with three target classes (‘Indoor’, ‘Outdoor’, and ‘Transportation’) on the Development set and has a light-weight footprint of 88.7 KB of memory and 29.27 M MACs, which makes it potentially applicable to a wide range of edge devices.
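The two ideas in the abstract, training a Student against Teacher embeddings and ensembling over multiple spectrograms, can be illustrated with a minimal NumPy sketch. The abstract does not state the loss function or the fusion rule, so everything here is an assumption made for illustration: the Student objective is taken to be cross-entropy plus a mean-squared error to the frozen Teacher embeddings, the ensemble is taken to be a plain average of class probabilities, and the names `student_loss`, `ensemble_predict`, and `alpha` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def student_loss(student_logits, student_emb, teacher_emb, labels, alpha=0.5):
    """Hypothetical Phase II objective (not taken from the paper).

    Combines cross-entropy on the class labels with the mean-squared
    error between Student embeddings and frozen Teacher embeddings.
    `alpha` weights the classification term against the embedding term.
    """
    probs = softmax(student_logits)
    n = labels.shape[0]
    cross_entropy = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    embedding_mse = np.mean((student_emb - teacher_emb) ** 2)
    return alpha * cross_entropy + (1.0 - alpha) * embedding_mse


def ensemble_predict(prob_list):
    """Assumed late fusion: average the class probabilities produced by
    models fed with different spectrograms, then take the arg max."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=-1)
```

In this sketch, each entry of `prob_list` would come from one model trained on one spectrogram type, and `teacher_emb` would be computed once after Phase I and kept fixed while the Student is trained.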
Original language: English
Title: International Symposium on the Internet of Sounds
Pages: 1-8
ISBN (electronic): 979-8-3503-8254-9
DOIs
Publication status: Published - 2023
Event: 2023 4th International Symposium on the Internet of Sounds - Pisa, Pisa, Italy
Duration: 26 Oct 2023 to 27 Oct 2023

Conference

Conference: 2023 4th International Symposium on the Internet of Sounds
Country/Territory: Italy
City: Pisa
Period: 26/10/23 to 27/10/23

Research Field

  • Former Research Field - Data Science
