Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach: Use Case of Riot or Violent Context Detection

Lam Pham (author and presenter), Tin Nguyen, Phat Lam, Hieu Tang, Alexander Schindler

Publication: Contribution to book or conference proceedings › Contribution to proceedings with poster presentation › Peer-reviewed

Abstract

In this paper, we present a toolchain for comprehensive audio/video analysis that leverages a deep learning-based multimodal approach. To this end, the specific tasks of Speech-to-Text (S2T), Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), Visual Object Detection (VOD), Image Captioning (IC), and Video Captioning (VC) are conducted and integrated into the toolchain. By combining the individual tasks and analyzing both the audio and visual data extracted from an input video, the toolchain offers several audio/video-based applications: two general applications, audio/video clustering and comprehensive audio/video summarization, and a specific application of riot or violent context detection. Furthermore, the toolchain presents a flexible and adaptable architecture into which new models can be integrated for further audio/video-based applications.
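The record does not include implementation details of the toolchain; the following Python sketch only illustrates the kind of plug-in architecture the abstract describes, where task-specific models are registered per modality and their outputs are fused for downstream applications. All names (ToolchainPipeline, register_task, analyze) and the dummy task functions are assumptions for illustration, not the authors' code.

# Minimal sketch of a modular audio/video analysis toolchain, assuming a simple
# plug-in design; identifiers here are illustrative and not taken from the paper.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToolchainPipeline:
    """Registers task-specific models and fuses their outputs per input video."""
    audio_tasks: Dict[str, Callable[[Any], Any]] = field(default_factory=dict)
    visual_tasks: Dict[str, Callable[[Any], Any]] = field(default_factory=dict)

    def register_task(self, name: str, fn: Callable[[Any], Any], modality: str) -> None:
        # New models can be plugged in without changing the pipeline itself.
        tasks = self.audio_tasks if modality == "audio" else self.visual_tasks
        tasks[name] = fn

    def analyze(self, audio_stream: Any, frames: List[Any]) -> Dict[str, Any]:
        # Run every registered task on its modality and collect the results.
        results = {name: fn(audio_stream) for name, fn in self.audio_tasks.items()}
        results.update({name: fn(frames) for name, fn in self.visual_tasks.items()})
        return results


if __name__ == "__main__":
    pipeline = ToolchainPipeline()
    # Dummy stand-ins for the deep learning models (S2T, ASC, AED, VOD, IC, VC).
    pipeline.register_task("S2T", lambda a: "transcribed speech", modality="audio")
    pipeline.register_task("AED", lambda a: ["siren", "shouting"], modality="audio")
    pipeline.register_task("VOD", lambda f: ["crowd", "fire"], modality="visual")
    outputs = pipeline.analyze(audio_stream=b"dummy audio", frames=["frame_0", "frame_1"])
    # Downstream applications (clustering, summarization, riot/violence detection)
    # would consume the fused `outputs` dictionary.
    print(outputs)

Keeping task registration separate from the fusion step is what would make such a pipeline extensible in the way the abstract claims, since adding a new model only requires registering one more callable.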
Original language: English
Title: 21st International Conference on Content-based Multimedia Indexing (CBMI)
Pages: 349-352
Number of pages: 4
ISBN (electronic): 979-8-3503-7844-3
DOIs
Publication status: Published - Feb. 2025
Event: 21st International Conference on Content-based Multimedia Indexing - Reykjavik University (RU), Reykjavik, Iceland
Duration: 18 Sept. 2024 - 20 Sept. 2024
https://cbmi2024.org/

Conference

Conference: 21st International Conference on Content-based Multimedia Indexing
Short title: CBMI 2024
Country/Territory: Iceland
City: Reykjavik
Period: 18/09/24 - 20/09/24
Internet address

Research Field

  • Multimodal Analytics
