MWPD2020: Semantic Web Challenge on Mining the Web of HTML-embedded Product Data

2020 
This paper gives an overview of the Semantic Web Challenge on Mining the Web of HTML-embedded Product Data (MWPD2020), which was conducted as part of the International Semantic Web Conference (ISWC2020). The challenge consists of two tasks: product matching and product classification. In the first task, participants must identify offers for the same product that originate from different websites. The goal of the second task is to categorize offers from different websites into the GS1 GPC product hierarchy. Six teams from the USA, China, Japan, and Germany participated in the challenge. The winning system in Task 1, PMap, achieved an F1 score of 86.05 using an ensemble of transformer-based language models. Task 2 was won by team Rhinobird, which achieved a weighted average F1 score of 88.62 using a BERT-based ensemble that takes into account the dependencies among different category levels.