Automatically Classify Chinese Judgment Documents Utilizing Machine Learning Algorithms

2017 
In law, a judgment is a decision by a court that resolves a controversy and determines the rights and liabilities of parties in a legal action or proceeding. In 2013, China Judgments Online system was launched officially for record keeping and notification, up to now, over 23 million electronic judgment documents are recorded. The huge amount of judgment documents has witnessed the improvement of judicial justice and openness. Document categorization becomes increasingly important for judgments indexing and further analysis. However, it is almost impossible to categorize them manually due to their large volume and rapid growth. In this paper, we propose a machine learning approach to automatically classify Chinese judgment documents using machine learning algorithms including Naive Bayes (NB), Decision Tree (DT), Random Forest (RF) and Support Vector Machine (SVM). A judgment document is represented as vector space model (VSM) using TF-IDF after words segmentation. To improve performance, we construct a set of judicial stop words. Besides, as TF-IDF generates a high dimensional feature vector, which leads to an extremely high time complexity, we utilize three dimensional reduction methods. Based on 6735 pieces of judgment documents, extensive experiments demonstrate the effectiveness and high classification performance of our proposed method.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    19
    References
    2
    Citations
    NaN
    KQI
    []