It Can Understand the Logs, Literally

2019 
Workflow reconstruction through logs is crucial for troubleshooting targeted distributed systems. It is also challenging to extract enough information from logs and keep a concise view, which makes manual log analysis hard to practice. However, currently popular tools rely on identifier-based log parsing, leaving a large amount of workflow information unexploited. In this paper, we propose a log extraction approach NLog, which utilizes a natural language processing based approach to obtain the key information from log messages and identify the same object in logs generated by different statements without any domain knowledge. We propose to use keyed message, a new log storage structure to store the parsed logs. We implement NLog and apply it to distributed data analytics frameworks Spark and MapReduce. Evaluation results show that NLog can accurately identify the objects in log messages even without explicit identifiers. By using keyed messages, users can have a concise as well as flexible view of the workflows.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    21
    References
    4
    Citations
    NaN
    KQI
    []