Abstract

Traffic encryption technology enables Internet users to protect the secrecy of their data, but it also makes malicious packet detection more challenging. To tackle this issue, researchers have investigated encrypted traffic analysis (ETA) in recent years. Existing works, however, focus only on the accuracy of malicious flow identification. Treating ETA as a technical black box, they pay little attention to the internal details and explanation of the models. In this paper, we introduce, for the first time, interpretable machine learning into ETA. We aim to provide a reasonable explanation for detection results, so that network security analysts can understand and further trust them. We develop a complete analysis framework, named DEV-ETA (detection, explanation and verification of ETA). DEV-ETA applies post hoc interpretation methods to explain the detection results and verifies the explanation using the joint distribution of support features on the dataset. We run thorough experiments to explain the detection results using three popular explanation approaches, namely SHAP, LIME and MSS, and we verify the explanations via feature distribution plots. The experimental results show that our design can interpret the detection results of an ETA model instead of simply treating the model as a black box.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)