SSIE: An Automatic Data Extractor for Sports Management in Athletics Modality

2015 
Sports management involves the organization and statistical analysis of sports results and modality information by professionals. However, this information is scattered across the web or organized by sporting event, which makes scouting for sports talent difficult, and the textual information is unstructured or semi-structured. This work proposes a Summary Sport Information Extraction System (SSIE) that generates a statistical summary for the athletics modality by automatically extracting information from documents retrieved from the web. These documents are converted to text and classified by sport type using the Naive Bayes learning method. After document retrieval and classification, text segmentation/tokenization, corpus annotation, and entity/subset recognition by chunking are used to generate data frames structured as parse trees. The parse tree information is stored in a database, which makes summary projection and big data analysis over the web possible. The main contribution of this work is the clustering of a large amount of data spread across the web, which is useful for sports management.
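The classification step can be illustrated with a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. This is not the SSIE implementation: the `tokenize` helper, the `NaiveBayes` class, the label names, and the training snippets are all hypothetical and chosen only to show how a retrieved document might be assigned to a sport type.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()                       # all tokens seen in training

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training snippets, invented for illustration only.
TRAINING = [
    ("100 metres final sprint heat wind time seconds", "athletics"),
    ("long jump high jump javelin throw marathon relay", "athletics"),
    ("goal penalty striker midfielder league match", "football"),
    ("corner kick goalkeeper offside referee match", "football"),
]

model = NaiveBayes()
model.train(TRAINING)
label = model.classify("Results of the 200 metres sprint and the relay heat")
print(label)
```

In a pipeline like the one described, only documents labeled as athletics would move on to tokenization, annotation, and chunking; a real system would need a far larger labeled corpus than this toy set.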