An Explainable Outlier Detection-based Data Cleaning Approach for Intrusion Detection

Theodore Ha, Sicong Shao, Salim Hariri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The effectiveness of machine learning (ML)-based intrusion detection systems (IDSs) for detecting widespread cyberattacks on critical infrastructure and government systems has been demonstrated in recent years. Nevertheless, with ML models becoming more complex, people can hardly understand their decisions. Further, most works on model explanations focus on analyzing the ML model itself. However, data cleaning is also vital in influencing the model's detection behavior. On the other hand, data cleaning for ML-based IDSs is challenging because modern IDS datasets may contain outliers that affect the training stage. In this work, we propose an explainable data cleaning approach for intrusion detection, which can effectively perform explainable isolation forest-based outlier detection in the data preprocessing stage for intrusion detection. Through experiments on real-world network intrusion datasets, we evaluate the effectiveness of our approach. Experiment results demonstrate that eliminating outliers improves intrusion detection and that data cleaning using outlier detection is explainable.

Original language: English (US)
Title of host publication: 2023 20th ACS/IEEE International Conference on Computer Systems and Applications, AICCSA 2023 - Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350319439
State: Published - 2023
Externally published: Yes
Event: 20th ACS/IEEE International Conference on Computer Systems and Applications, AICCSA 2023 - Giza, Egypt
Duration: Dec 4 2023 → Dec 7 2023

Publication series

Name: Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, AICCSA
ISSN (Print): 2161-5322
ISSN (Electronic): 2161-5330

Conference

Conference: 20th ACS/IEEE International Conference on Computer Systems and Applications, AICCSA 2023
Country/Territory: Egypt
City: Giza
Period: 12/4/23 → 12/7/23

Keywords

  • data cleaning
  • explainable
  • intrusion detection
  • outlier removal

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Signal Processing
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
