Semantic Based Highly Accurate Autonomous Decentralized URL Classification System for Web Filtering

2015 
Cyberspace currently contains about one billion registered websites, so it is imperative to accurately categorize this large volume of websites and URLs for URL filtering and marketing segmentation. This paper presents an autonomous, decentralized, semantic-based, large-scale URL/web classification system for web filtering that uses the Yago2s and DS-onto knowledge bases. Because many of the predefined categories overlap heavily or are semantically similar, the proposed word sense disambiguation algorithm, together with the inference engine design, yields high accuracy when classifying URLs into 120 different categories. Evaluation results show that the system achieves 90-93% accuracy, which is much higher than that of currently used URL classification systems.