It's All In The Name: Why Some URLs Are More Vulnerable To Typosquatting

Rashid Tahir University of Illinois at Urbana-Champaign, USA
Ali Raza LUMS, Pakistan
Faizan Ahmad FAST National University Lahore, Pakistan
Jehangir Kazi Lahore University of Management Sciences, Pakistan
Fareed Zaffar LUMS, Pakistan
Chris Kanich University of Illinois at Chicago, USA
Matthew Caesar University of Illinois at Urbana-Champaign, USA


Typosquatting is a black-hat practice that relies on human error and low-cost domain registrations to hijack legitimate traffic from well-established websites. The technique is typically used for phishing, driving traffic towards competitors, or disseminating indecent or malicious content, and as such remains a concern for businesses. We take a fresh look at this well-studied phenomenon to explore why some URLs are more vulnerable to typing mistakes than others. Using various URL datasets, we examine the relationship between human hand anatomy, keyboard layouts, and typing mistakes. We create an extensive user-centric typographical model and compute a Hardness Quotient (the likelihood of mistyping) for each URL using a quantitative measure of finger and hand effort. Furthermore, our model predicts the most likely typos for each URL, which can then be defensively registered. Cross-validation against actual URL and DNS datasets suggests that this is a meaningful and effective defense mechanism.
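
To make the idea of enumerating candidate typos concrete, the following is a minimal sketch of a generic single-edit typo generator. It is not the typographical model described above and does not compute the Hardness Quotient. It assumes a QWERTY layout, considers only same-row key neighbors, and applies four common error classes (omission, duplication, adjacent-key substitution, transposition). The names `QWERTY_ROWS`, `neighbors`, and `candidate_typos` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: a generic typo generator under assumed error
# classes, not the user-centric model or Hardness Quotient from the paper.

# Assumption: a QWERTY layout, letters only, same-row adjacency only.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def neighbors(ch):
    """Return the keys immediately left and right of ch on its QWERTY row."""
    for row in QWERTY_ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []  # characters outside the letter rows have no neighbors here


def candidate_typos(name):
    """Generate single-edit typos of a domain label (without the TLD)."""
    typos = set()
    for i, ch in enumerate(name):
        # Omission: the character at position i is skipped.
        typos.add(name[:i] + name[i + 1:])
        # Duplication: the character at position i is typed twice.
        typos.add(name[:i] + ch + name[i:])
        # Adjacent-key substitution: a neighboring key is hit instead.
        for n in neighbors(ch):
            typos.add(name[:i] + n + name[i + 1:])
        # Transposition: characters at positions i and i + 1 are swapped.
        if i + 1 < len(name):
            typos.add(name[:i] + name[i + 1] + ch + name[i + 2:])
    # Drop the original label and the empty string if they were produced.
    typos.discard(name)
    typos.discard("")
    return sorted(typos)


if __name__ == "__main__":
    for typo in candidate_typos("google"):
        print(typo + ".com")
```

A generator like this treats every candidate as equally likely. Ranking the candidates by how probable each mistake is, as the model above aims to do, would require weights that this sketch does not attempt to supply.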

