A Fault Analysis Method Based on Text Clustering

2020 
The informatization work of State Grid Corporation of China has accumulated a large number of typical fault cases. Most of them are descriptive text, which is difficult to understand and analyze automatically. To address this problem, text mining is used to extract fault problems and their causes from the fault cases and to form causal relationships between them, providing the groundwork for further fault text mining. The system uses text clustering for fault location and auxiliary research. First, the fault information and handling schemes are segmented into words with the Jieba Chinese word segmentation tool. Second, the segmentation results are cleaned and a corpus is built. Third, so that similarity between documents can be computed, the corpus is transformed into a frequency matrix. Then, rather than running the traditional k-means clustering algorithm with a preset K, we use the calinski_harabaz score to select the best value of K. Finally, we apply the model in actual production and build a mapping table between fault information and solutions.
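The steps described in the abstract (frequency matrix, k-means, and choosing K by the Calinski-Harabasz score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sample documents are hypothetical and already segmented into tokens, the k-means routine is a simplified standard-library version with deterministic farthest-point seeding, and all function names are invented for this example. In a real pipeline the token lists would more likely come from Jieba and the clustering from a library such as scikit-learn, which exposes the metric as `calinski_harabasz_score` in current releases.

```python
from collections import Counter

def build_frequency_matrix(token_lists):
    """Map tokenized documents to rows of term counts over a sorted vocabulary."""
    vocab = sorted({tok for doc in token_lists for tok in doc})
    col = {tok: i for i, tok in enumerate(vocab)}
    matrix = []
    for doc in token_lists:
        row = [0.0] * len(vocab)
        for tok, count in Counter(doc).items():
            row[col[tok]] = float(count)
        matrix.append(row)
    return vocab, matrix

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]

def kmeans(X, k, max_iter=100):
    """Simplified k-means; seeds are chosen by farthest-point selection (an assumption)."""
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda x: min(sq_dist(x, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(x, centers[j])) for x in X]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = centroid(members)
    return labels

def calinski_harabasz(X, labels):
    """Ratio of between-cluster to within-cluster dispersion; higher is better."""
    clusters = {}
    for x, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(x)
    n, k = len(X), len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of samples")
    overall = centroid(X)
    between = within = 0.0
    for members in clusters.values():
        center = centroid(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(x, center) for x in members)
    if within == 0.0:
        return 1.0
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical, pre-segmented fault descriptions (illustrative data only).
docs = [
    ["switch", "port", "down", "network"],
    ["switch", "port", "down"],
    ["network", "switch", "port"],
    ["database", "connection", "timeout"],
    ["database", "connection", "timeout", "pool"],
    ["database", "timeout", "connection"],
    ["disk", "full", "log"],
    ["disk", "full", "log", "partition"],
    ["disk", "full"],
]

vocab, X = build_frequency_matrix(docs)
# Score each candidate K and keep the one with the highest score.
scores = {k: calinski_harabasz(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
labels = kmeans(X, best_k)
print("best K:", best_k)
print("cluster labels:", labels)
```

With Jieba installed, each token list could instead be produced by calling `jieba.lcut` on the raw Chinese fault description and then filtering out stop words and punctuation, which corresponds to the cleaning step in the abstract. The resulting cluster labels are what a fault-to-solution mapping table would be keyed on.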