Maximum Entropy Approach based Named Entity Recognition in Punjabi Language

2013 
Named Entity Recognition (NER) is the task of identifying named entities and classifying them into predefined categories such as person, location and organization. NER is used in many applications, including text summarization, text classification, question answering and machine translation. A lot of NER work has already been done for English, where capitalization is a major cue for rules; Indian languages have no such feature, which makes the task harder for them. This work reports the evaluation of an NER system for the Punjabi language based on the Maximum Entropy approach (MAXENT). The evaluation uses a manually tagged Punjabi news corpus developed from Punjabi newspaper text available online. The training set is annotated with an NE tagset of 12 tags. The MAXENT-based NER system for Punjabi achieves overall Precision, Recall and F-score values of 90.92%, 72.30% and 80.55% respectively, using a feature set consisting of context words, Part of Speech (POS) information, the NE tag of the previous word and a first-name gazetteer list.
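The three reported scores can be checked against one another. The sketch below assumes the F-score is the balanced F1 measure, that is, the harmonic mean of precision and recall; the abstract does not state which variant was used, and the function name `f_score` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Values reported in the abstract, in percent.
reported_precision = 90.92
reported_recall = 72.30

# Assumption: balanced F1 (beta = 1), which the abstract does not confirm.
f1 = f_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 80.55
```

Under that assumption the computed value agrees with the reported 80.55%, so the three figures are mutually consistent. The gap between precision and recall shows that the system labels entities accurately when it does tag them, but misses a sizeable share of the entities in the corpus.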