Enhancing Information Retrieval using Concept-Based Mining Model with Feature Extraction and Clustering

2013 
Most common text mining techniques are based on statistical analysis of a term, either a word or a phrase. Term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one contributes more to the meaning of its sentences than the other. The underlying text mining model should therefore identify terms that capture the semantics of the text. Such a model can capture the terms that represent the concepts of a sentence, which leads to discovery of the topic of the document. Nowadays, most information is accompanied by explanatory diagrams or related images. An image analysis method can automate the recognition of landmarks and events in large image collections, significantly improving the content consumption experience. The wide adoption of photo sharing applications and the enormous amount of user-generated content uploaded to them raise an information overload issue for users. The concept-based mining model can effectively discriminate between terms that are unimportant with respect to sentence semantics and terms that hold the concepts representing the sentence meaning. The proposed mining model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed at the sentence, document, and corpus levels rather than at the document level only, as in traditional analysis. One automated content organization technique for overcoming this overload is to group images by similarity and then use the derived clusters to support navigation and browsing of the collection. In this paper, we present a community detection (i.e., graph-based clustering) approach that uses both visual and tagging features of images to efficiently extract groups of related images from large image collections. We perform clustering on image similarity graphs by means of community detection, a process that identifies groups of nodes in the graph that are more closely associated with each other than with the rest of the graph. We classify the resulting image clusters as landmarks or events using features related to the temporal, community, and label characteristics of the clusters.
Keywords: landmarks and events, Content Organization technique, similarity, clusters.
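
The graph-based clustering step described above can be illustrated with a minimal sketch. The abstract does not specify the similarity formula or the community detection algorithm, so everything concrete here is an assumption made for illustration: similarity is an equal-weight mix of cosine similarity on visual feature vectors and Jaccard similarity on tag sets, edges are kept above a threshold of 0.5, and communities are found with a simple deterministic variant of label propagation. The function names, the `alpha` and `threshold` parameters, and the sample images are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two visual feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(s, t):
    """Jaccard similarity between two tag sets."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def build_similarity_graph(images, alpha=0.5, threshold=0.5):
    """Build a weighted graph whose edges join sufficiently similar images.

    `images` maps an image name to (visual_vector, tag_set). The mixing
    weight `alpha` and the edge `threshold` are illustrative assumptions.
    """
    graph = {name: {} for name in images}
    names = sorted(images)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            vec_a, tags_a = images[a]
            vec_b, tags_b = images[b]
            sim = alpha * cosine(vec_a, vec_b) + (1 - alpha) * jaccard(tags_a, tags_b)
            if sim >= threshold:
                graph[a][b] = sim
                graph[b][a] = sim
    return graph


def label_propagation(graph, max_rounds=100):
    """Detect communities by weighted label propagation.

    Nodes are visited in sorted order and ties go to the current label,
    then to the smallest label, so the result is deterministic.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated image keeps its own label
            scores = defaultdict(float)
            for neighbor, weight in graph[node].items():
                scores[labels[neighbor]] += weight
            top = max(scores.values())
            candidates = [lab for lab, s in scores.items() if s == top]
            new = labels[node] if labels[node] in candidates else min(candidates)
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    clusters = defaultdict(set)
    for node, label in labels.items():
        clusters[label].add(node)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy collection: three photos of one landmark, two of one event.
images = {
    "img_a": ([1.0, 0.0], {"eiffel", "paris"}),
    "img_b": ([0.9, 0.1], {"eiffel", "tower", "paris"}),
    "img_c": ([1.0, 0.2], {"paris", "eiffel"}),
    "img_d": ([0.0, 1.0], {"concert", "rock"}),
    "img_e": ([0.1, 1.0], {"concert", "festival", "rock"}),
}
clusters = label_propagation(build_similarity_graph(images))
print(clusters)
```

On this toy input the three landmark photos end up in one cluster and the two concert photos in another. Classifying each cluster as a landmark or an event, which the abstract bases on temporal, community, and label features, is not shown here.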