Organizing the Web's Information Explosion to Discover Unknown Unknowns

2010 
This paper introduces the TORISHIKI-KAI project, which aims to construct a million-word-scale semantic network from the Web using state of the art knowledge acquisition methods. The resulting network can be browsed as a Web search directory, and we show that the directory is useful for finding “unknown unknowns” — in the infamous words of D.H. Rumsfeld: things “we don't know we don't know.” Because typically we have no way to look for information we don't even know is missing, a crucial characteristic of unknown unknowns is that they are very difficult to discover through keyword-based Web search. Some examples of the unknown unknowns we have found include unexpected troubles associated with commercial products, surprising new combinations of ingredients in new recipes, unexpected tools or methods for commiting suicide, and so on. We expect such information to be useful for risk management, innovation support, and the detection of harmful information on the Web.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    37
    References
    13
    Citations
    NaN
    KQI
    []