PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation
Software log analysis helps to maintain the health of software solutions and
ensure compliance and security. Existing software systems consist of
heterogeneous components emitting logs in various formats. A typical solution
is to unify the logs using manually built parsers, which is laborious.
Instead, we explore the possibility of automating the parsing task by
employing machine translation (MT). We create a tool that generates synthetic
Apache log records which we used to train recurrent-neural-network-based MT
models. Models' evaluation on real-world logs shows that the models can learn
Apache log format and parse individual log records. The median relative edit
distance between an actual real-world log record and the MT prediction is less
than or equal to 28%. Thus, we show that log parsing using an MT approach is
promising.