Learning Desired Behavior for Causality Analysis

Hendrik Leidinger

Failure analysis is of high economic interest, as any kind of failure may result in incalculable risks and costs. It is concerned with the collection and analysis of data for detecting the causes of failures within a system. A widespread technique is fault tree analysis, which enables a graphical representation of possible failures and their causes. However, fault trees do not spare the system designer the tedious work of searching for the actual cause. Most approaches require a fully manual analysis of a misbehavior based on logs and system designs. Previous approaches to automating such failure analyses have not been widely adopted in industry, as they typically focus on very specific problems. We present an approach that uses machine learning to support the user in the search for the cause of failure of a system run. The analysis is based on the log files generated by the monitored system. Our approach is structured in three steps. First, the log is scanned for segments where abnormal behavior occurs. A machine-learned classifier then determines the type of each anomaly. Finally, we generate a regressor for each anomaly type, which estimates the severity of the respective anomaly. If a system run fails, we search for combinations of anomalies whose summed severity exceeds a certain threshold. These combinations are presented to the user as possible causes of failure. We validate our approach on a quadcopter, an unmanned aerial vehicle that has attracted increasing attention in recent years. We utilize a state-of-the-art simulation tool to generate data for training and evaluation.
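The final search step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Anomaly` record, the anomaly names, the severity values, the `max_size` bound, and the choice to report only minimal combinations are all assumptions made for illustration. The severities stand in for the outputs of the per-type regressors.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Anomaly(NamedTuple):
    """Hypothetical record for one detected anomaly."""
    segment: int     # index of the log segment where it was detected
    kind: str        # anomaly type, as assigned by the classifier
    severity: float  # severity estimate, as produced by the regressor


def candidate_causes(
    anomalies: List[Anomaly], threshold: float, max_size: int = 3
) -> List[Tuple[Anomaly, ...]]:
    """Return combinations whose summed severity exceeds the threshold.

    Assumption: only minimal combinations are reported, so a combination
    is skipped when it contains an already reported smaller one.
    """
    found: List[Tuple[Anomaly, ...]] = []
    for size in range(1, max_size + 1):
        for combo in combinations(anomalies, size):
            # Skip supersets of combinations that already qualify.
            if any(set(prev) <= set(combo) for prev in found):
                continue
            if sum(a.severity for a in combo) > threshold:
                found.append(combo)
    return found


# Made-up anomalies for a failed quadcopter run (illustrative values only).
example = [
    Anomaly(segment=0, kind="gps_drift", severity=0.7),
    Anomaly(segment=1, kind="motor_lag", severity=0.5),
    Anomaly(segment=2, kind="wind_gust", severity=0.2),
]
causes = candidate_causes(example, threshold=1.0)
```

With these made-up values, no single anomaly exceeds the threshold, so only a combination of anomalies can be reported as a possible cause. Exhaustive enumeration grows quickly with the number of anomalies, which is why the sketch bounds the combination size.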

Master's Thesis.

(pdf)