Towards an NLP-based log template generation algorithm for system log analysis

2014 
System logs from network equipment are among the most important sources of information for network management. Sophisticated log message mining can help in investigating huge numbers of log messages for troubleshooting, especially in today's complicated network structures (e.g., virtualized networks). However, accurately generating log templates (i.e., meta formats) from real log messages (instances) remains a difficult problem. In this paper, we propose a Natural Language Processing (NLP) approach to generating log templates from log messages produced by network equipment in order to overcome this problem. The key idea of the work is to leverage Conditional Random Fields (CRF), a well-studied supervised natural language processing technique. As a preliminary evaluation, using one month of network equipment logs from a Japanese academic network, we show that our CRF-based algorithm improves the accuracy of generated log templates within reasonable processing time, compared with a traditional method.
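To make the notion of a log template concrete, the following is a minimal sketch of template generation framed as sequence labeling: each word in a message receives a label, and words labeled as variables are replaced by a wildcard. The label names ("D" for description, "V" for variable), the function names, the wildcard symbol, and the sample messages are all assumptions made for illustration and are not taken from the paper. The `toy_labeler` function is a hypothetical rule-based stand-in for a trained CRF, not the CRF itself.

```python
import re
from typing import Callable, List

# A labeler maps a whole token sequence to one label per token,
# which is the same interface a trained sequence model would expose.
Labeler = Callable[[List[str]], List[str]]


def toy_labeler(tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for a trained CRF.

    Marks any token containing a digit as a variable ("V") and
    everything else as template description ("D").
    """
    return ["V" if re.search(r"\d", tok) else "D" for tok in tokens]


def to_template(message: str, labeler: Labeler = toy_labeler,
                wildcard: str = "*") -> str:
    """Replace every token labeled "V" with a wildcard."""
    tokens = message.split()
    labels = labeler(tokens)
    return " ".join(wildcard if lab == "V" else tok
                    for tok, lab in zip(tokens, labels))


# Invented sample messages, used only to exercise the functions.
messages = [
    "Interface ge-0/0/1 changed state to down",
    "Interface ge-0/0/2 changed state to up",
]

for msg in messages:
    print(to_template(msg))
```

In this sketch the digit rule can only catch variables that contain digits. Swapping `toy_labeler` for a model trained on labeled messages would leave `to_template` unchanged, since it depends only on the labeler interface.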