Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring

Jose Camacho, Katarzyna Wasielewska, Rasmus Bro, David Kotz

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

There is increasing interest in developing new data-driven models for assessing the performance of communication networks. For many applications, such as network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution for the automatic derivation of features, a cornerstone step in applying MBDA when the amount of data is massive. The resulting network monitoring approach allows us to detect and diagnose disparate network anomalies, with a data-analysis workflow that combines the advantages of interpretable and interactive models with the power of parallel processing. We apply the extended MBDA to two case studies: UGR’16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth’18, the longest and largest Wi-Fi trace known to date.

Original language: English
Journal: IEEE Transactions on Network and Service Management
Volume: 21
Issue number: 3
Pages (from-to): 2926-2943
ISSN: 1932-4537
Publication status: Published - 2024

Bibliographical note

Publisher Copyright: Authors

Keywords

  • Analytical models
  • Anomaly Detection
  • Big Data
  • Dartmouth Campus Wi-Fi
  • Data models
  • Data visualization
  • Interpretable Machine Learning
  • Monitoring
  • Multivariate Big Data Analysis
  • Network Monitoring
  • Principal component analysis
  • Representation learning
  • UGR’16
