Automatic Segmentation and Tagging of YouTube Sponsor Segments from Subtitles based on Natural Language Processing (NLP)

Philipp Jonas

Publication: Thesis › Master's thesis

Abstract

The goal of this thesis is the recognition and localization of sponsor segments in videos using textual subtitles. Identifying these segments is useful for viewers who want to skip them, and labelling them as advertisements can also be relevant where this is legally required. This topic has so far not been covered in the scientific literature, and at the time of this thesis there was no automated way to recognize sponsor segments in videos. A combination of supervised learning [1] and neural networks [2] was proposed as the solution, which requires training data of appropriate quality and quantity. Suitable initial data were identified in community-reported sponsor segments in YouTube videos; these data were structured and prepared for further processing. The machine learning model was built on a dual-LSTM [3], [4] network and structurally adapted to the training data and the desired output. After training on these data, the model produced statistically significant classification results, which were further optimized through hyperparameter tuning [2], [5]. Applied in practice, the optimized model delivered excellent results with a high hit rate.
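As a rough illustration of the approach summarized above, the following is a minimal, hypothetical sketch in Python/Keras (not the thesis's actual code), assuming that "dual-LSTM" refers to two stacked LSTM layers classifying tokenized subtitle windows as sponsor or regular content; the vocabulary size, window length, and layer sizes are placeholder assumptions.

    # Hypothetical sketch, not the thesis implementation: a stacked ("dual")
    # LSTM that labels a window of subtitle tokens as sponsor (1) or content (0).
    import numpy as np
    from tensorflow.keras import layers, models

    VOCAB_SIZE = 20000  # assumed subtitle vocabulary size
    SEQ_LEN = 100       # assumed window length in tokens

    model = models.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.LSTM(64, return_sequences=True),  # first LSTM passes the full sequence on
        layers.LSTM(64),                         # second LSTM condenses the window
        layers.Dense(1, activation="sigmoid"),   # probability of "sponsor segment"
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

    # Toy fit on random data; the thesis trains on labeled subtitle windows
    # derived from community-reported sponsor segments.
    x = np.random.randint(0, VOCAB_SIZE, size=(32, SEQ_LEN))
    y = np.random.randint(0, 2, size=(32, 1))
    model.fit(x, y, epochs=1, verbose=0)

One plausible way to obtain the localization described in the abstract is to merge runs of windows predicted as sponsor content into time-stamped segments using the subtitle timing information.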
Original language: English
Qualification: Master of Science
Awarding institution
  • University of Applied Sciences Technikum Wien
Supervisor / Advisor
  • Schütz, Mina, Supervisor
  • Knapp, Bernhard, Supervisor, External person
Date of award: 5 June 2023
Publication status: Published - June 2023

Research Field

  • Former Research Field - Data Science
