by Konstantin TODOROV, LIRMM
Abstract: Entity Alignment (EA) through representation learning aims to match entities from two different Knowledge Graphs (KGs) that refer to the same real-world object. The matching is performed by embedding the entities in a shared space, where the distance between embeddings serves as a proxy for the similarity between the entities. While many embedding-based models perform well on synthetic benchmark datasets, they often struggle in real-world scenarios because real-world data is heterogeneous, incomplete and domain-specific. Despite efforts to create realistic benchmarks, there has been little comparison of model performance across datasets of different natures. In this talk, we analyse semantic similarity and KG profiles to explain the performance drop on real-world data. We take models that excel on synthetic datasets and test them on real-world data, showing that many current methods fail to find correct entity mappings in such scenarios. Furthermore, most models do not consider entities from outside the validation data, which limits their ability to discover new alignments in large-scale KGs. We demonstrate how restricting the search space negatively impacts generalization, underlining the need for more robust EA solutions for real-world scenarios.
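
To make the mechanism concrete, here is a minimal sketch of distance-based alignment in a shared embedding space. It is an illustration only, not the method or the experiments presented in the talk: the entity names, the two-dimensional vectors, the choice of cosine distance and the helper names `cosine_distance` and `align` are all assumptions made for the example.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def align(source, target, candidates=None):
    """Map each source entity to its nearest target entity.

    `candidates` optionally restricts the search to a subset of target
    entities, mimicking an evaluation limited to known test entities.
    """
    pool = target if candidates is None else {k: target[k] for k in candidates}
    return {
        s: min(pool, key=lambda t: cosine_distance(vec, pool[t]))
        for s, vec in source.items()
    }


# Hypothetical embeddings, invented for illustration.
source = {"kg1:Paris": [1.0, 0.1]}
target = {
    "kg2:Paris": [0.9, 0.2],
    "kg2:Lyon": [0.1, 1.0],
    "kg2:Paris_Texas": [1.0, 0.1],  # distractor absent from the test pairs
}

# Search restricted to entities that appear in the evaluation pairs.
restricted = align(source, target, candidates=["kg2:Paris", "kg2:Lyon"])

# Search over every entity of the target KG.
unrestricted = align(source, target)

print(restricted)
print(unrestricted)
```

In this toy setting the restricted search picks `kg2:Paris`, while the search over the full target KG picks the distractor `kg2:Paris_Texas`. This is one way a restricted candidate set can hide errors that surface once the model has to search the whole graph.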