AIT_FHSTP at EXIST 2023 Benchmark: Sexism detection by transfer learning, sentiment and toxicity embeddings and hand-crafted features

Jaqueline Böck (Speaker, invited), Mina Schütz (Speaker, invited), Daria Liakhovets, Nathanya Queby Satriani, Andreas Babic, Djordje Slijepcevic, Matthias Zeppelzauer, Alexander Schindler

Publication: Contribution to book or conference proceedings › Presentation with paper in conference proceedings › Peer-reviewed

Abstract

Sexism has become a widespread problem on social media and in online conversations. Therefore, the sEXism Identification in Social neTworks (EXIST) challenge addresses this issue at CLEF in 2023. In this year's version of this international benchmark, the goal is to automatically identify sexism in texts with the help of Natural Language Processing (NLP). The tasks are to determine whether a text is sexist, what the source intention behind it is, and which category of sexism it belongs to. This paper presents the contribution of our team, AIT_FHSTP, to the EXIST challenge held at CLEF in 2023. We present three approaches to solve the classification tasks of this year's shared task. The baseline for all three approaches is an XLM-RoBERTa model pre-trained with additional datasets and fine-tuned on the EXIST2023 data. For our second and third approaches, we extracted the fine-tuned embeddings of the model and concatenated them with additional features. On the one hand, we added sentiment and toxicity model embeddings; on the other hand, we added multiple hand-crafted features and reduced the dimensionality with PCA. Afterwards, we used these embeddings as input to a Random Forest classifier that generated the final predictions. Our approach combining XLM-RoBERTa embeddings with additional hand-crafted features and PCA achieved the 1st rank in the soft-soft evaluation of task 2 (source intention) for Spanish content and the 2nd rank for English content. For task 3 (sexism multilabel categorization), we achieved the 3rd rank in the hard-hard evaluation.
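The feature-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the arrays below are random placeholders standing in for the fine-tuned XLM-RoBERTa embeddings and the hand-crafted features, and the dimensions, the number of PCA components, the forest size, and the choice to apply PCA to the concatenated vector are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic placeholders (assumed shapes, not real data):
text_emb = rng.normal(size=(n_samples, 768))    # stands in for fine-tuned transformer embeddings
extra_feats = rng.normal(size=(n_samples, 20))  # stands in for hand-crafted features
labels = rng.integers(0, 2, size=n_samples)     # stands in for binary sexist / not-sexist labels


def build_features(embeddings, extras):
    """Concatenate per-text embeddings with additional features along the feature axis."""
    return np.concatenate([embeddings, extras], axis=1)


X = build_features(text_emb, extra_feats)

# Assumption: PCA is applied to the concatenated vector before the Random Forest.
model = make_pipeline(
    PCA(n_components=50, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# Simple hold-out split for illustration only.
model.fit(X[:150], labels[:150])
proba = model.predict_proba(X[150:])  # per-class probabilities for the held-out texts
```

With real embeddings in place of the random arrays, `predict_proba` yields per-class probabilities, while `predict` yields hard labels; how the authors derived their soft and hard submissions is not stated in the abstract.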
Original language: English
Title: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023)
Editors: Mohammed Aliannejadi, Guglielmo Faggioli, Nicola Ferro, Michalis Vlachos
Place of publication: Thessaloniki, Greece
Publisher: CEUR-WS
Pages: 878-890
Number of pages: 13
Volume: 3497
ISBN (electronic): ISSN 1613-0073
Publication status: Published - 5 Oct. 2023
Event: CLEF 2023 Conference and Labs of the Evaluation Forum: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization - Thessaloniki, Greece
Duration: 18 Sept. 2023 to 21 Sept. 2023

Conference

Conference: CLEF 2023 Conference and Labs of the Evaluation Forum
Short title: CLEF 2023
Country/Territory: Greece
City: Thessaloniki
Period: 18/09/23 to 21/09/23

Research Field

  • Former Research Field - Data Science
