Phishing website detection using URL-assisted brand name weighting system

2014 
In this paper, we propose an anti-phishing technique to safeguard users against phishing attacks on the Internet. Our study focuses primarily on detecting phishing websites with English content. To convince users that a website is who it claims to be, phishers normally place brand names in different parts of the URL. We exploit this pattern by assigning weights to words extracted from the HTML content, based on their co-appearance in the hostname, path and filename of the URL. These weights are then added to the words' corresponding TF-IDF weights. The highest-weighted words are selected and submitted to Yahoo Search, and the domain name that occurs most frequently among the top 30 search results is retrieved. A WHOIS lookup is then conducted to reveal the owner of the selected domain name. A website is identified as phishing if the owner of the queried domain name differs from the owner of the domain name returned by the search engine. Experiments conducted on a dataset of phishing and legitimate websites achieve a true positive rate of 98.2% while maintaining a false positive rate of 5.9%. Our findings show that brand names in HTML content are very effective in detecting phishing websites.
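The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bonus values in `URL_BONUS`, the tokenizer, the smoothed TF-IDF formula, and all function names are assumptions made for illustration, since the abstract does not specify them. The search and WHOIS steps are represented only by their inputs (a list of result URLs and two owner strings), and stripping a leading `www.` is a simplification of real domain name extraction.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical bonus per URL part; the abstract does not state the actual weights.
URL_BONUS = {"hostname": 0.3, "path": 0.2, "filename": 0.1}


def tokenize(text):
    """Return lowercase alphabetic tokens of at least three letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def url_parts(url):
    """Split a URL into token sets for its hostname, directory path and filename."""
    parsed = urlparse(url)
    directory, _, filename = parsed.path.rpartition("/")
    return {
        "hostname": set(tokenize(parsed.hostname or "")),
        "path": set(tokenize(directory)),
        "filename": set(tokenize(filename)),
    }


def tf_idf(tokens, corpus):
    """Smoothed TF-IDF of each token against a background corpus of token lists."""
    counts = Counter(tokens)
    total = len(tokens)
    n_docs = len(corpus) + 1  # count the page itself as one document
    scores = {}
    for word, count in counts.items():
        doc_freq = 1 + sum(1 for doc in corpus if word in doc)
        scores[word] = (count / total) * math.log(1 + n_docs / doc_freq)
    return scores


def weighted_keywords(page_text, url, corpus, top_k=5):
    """Add URL co-appearance bonuses to TF-IDF scores and return the top words."""
    scores = tf_idf(tokenize(page_text), corpus)
    parts = url_parts(url)
    for word in scores:
        for part, bonus in URL_BONUS.items():
            if word in parts[part]:
                scores[word] += bonus
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


def most_frequent_domain(result_urls):
    """Return the host that occurs most often among search-result URLs."""
    hosts = []
    for result in result_urls:
        host = urlparse(result).hostname
        if host:
            # Simplification: drop a leading "www." instead of using a suffix list.
            hosts.append(host[4:] if host.startswith("www.") else host)
    return Counter(hosts).most_common(1)[0][0] if hosts else None


def is_phishing(query_owner, result_owner):
    """Flag the page when the two WHOIS owners differ."""
    return query_owner != result_owner
```

In this sketch a word that appears in both the hostname and the path receives both bonuses, which is one possible reading of "co-appearance"; a real system would also need to normalise WHOIS owner strings before comparing them.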