Text-based entity matching for entity resolution and data fusion applied to person descriptions

Publication: Contribution to book or conference proceedings; Talk with contribution in conference proceedings; Peer-reviewed

Abstract

Text-based entity matching facilitates interoperability between heterogeneous systems by aligning textual person descriptions. We propose an entity matching methodology that integrates rule-based feature extraction, similarity measures, and supervised machine learning classifiers, rigorously evaluated on a person matching problem. We constructed a feature space from domain-specific person attributes extracted from text, combining string similarity scores with similarities of term frequency-inverse document frequency (TF-IDF) embeddings. Next, we evaluated multiple supervised classification models, including Multi-Layer Perceptron, Random Forest, and XGBoost, to determine their effectiveness. For evaluation, we created a new domain-specific entity matching dataset named Real Scenario Text-based Person Matching (RSTPM), and assessed the person matching performance of all models in terms of classification metrics and computational cost. In addition, we studied the classification impact of the various features. The proposed approach achieved an increase of 27.47 percentage points in F1-Score over the baseline (from 55.41% to 82.88%) and a total Accuracy of 92.14%, demonstrating significant improvements in textual person matching at the cost of a moderate increase in computational demand.
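The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `pair_features`) are hypothetical, `difflib.SequenceMatcher` stands in for whichever string similarity measures the paper uses, the TF-IDF weighting is one common smoothed variant, and the example descriptions are invented rather than taken from RSTPM. The resulting two-element feature vector is the kind of input that a supervised classifier could then be trained on.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, so terms present in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(desc_a, desc_b, corpus):
    """Hypothetical feature vector for a pair of person descriptions:
    [character-level string similarity, TF-IDF cosine similarity]."""
    vecs = tfidf_vectors(corpus + [desc_a, desc_b])
    string_sim = SequenceMatcher(None, desc_a.lower(), desc_b.lower()).ratio()
    tfidf_sim = cosine(vecs[-2], vecs[-1])
    return [string_sim, tfidf_sim]


# Invented example descriptions, for illustration only.
features = pair_features(
    "male, about 30 years, blue jacket",
    "man in a blue jacket, around 30",
    corpus=["female, red coat, 45 years"],
)
```

Both features lie in the range 0 to 1, so they can be concatenated with other per-attribute similarities without further scaling before being passed to a classifier.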
Original language: English
Title: 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Pages: 7080-7085
ISBN (electronic): 979-8-3315-3358-8
Publication status: Published - 28 Jan. 2026
Event: 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC) - Austria Center Vienna, Vienna, Austria
Duration: 5 Oct. 2025 to 8 Oct. 2025
https://www.ieeesmc2025.org/

Conference

Conference: 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Short title: IEEE SMC 2025
Country/Territory: Austria
City: Vienna
Period: 5 Oct. 2025 to 8 Oct. 2025

Research Field

  • Responsive Sensing & Analytics
