Abstract
Today's distributed information systems collect large amounts of monitoring data, such as log files. Gathered at various points of a distributed information system, these data provide unparalleled opportunities to characterize and track the system by correlating monitoring data across it. Jiang [1] proposed a concept named flow intensity to measure the intensity with which the monitoring data react to the volume of different user requests. The autoregressive model with exogenous inputs (ARX) was used to quantify the relationship between each pair of flow intensities measured at various points across a distributed system. If such a relationship holds all the time, it is considered an invariant of the underlying system. Such invariants have been used successfully to characterize complex systems and to support various system management tasks, such as fault detection and localization. However, searching for the complete set of invariants of a large-scale system is very time-consuming, and existing algorithms do not scale to thousands of flow intensity measurements. To this end, in this paper we develop effective pruning techniques based on the identified upper bounds, and we propose two efficient algorithms that use these pruning techniques to search for the complete set of invariants. Finally, we demonstrate the efficiency and effectiveness of our algorithms on both real-world and synthetic data sets.
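To make the notion of an invariant between two flow intensities concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's search or pruning algorithms. It assumes the simplest ARX form, y[t] = a*y[t-1] + b*x[t], fitted by ordinary least squares, and it assumes a normalized fitness score of 1 - sqrt(squared error / variance) checked against a hypothetical threshold of 0.9 on consecutive windows. The function names `fit_arx`, `fitness`, and `is_invariant`, the window size, and the threshold are all hypothetical choices made for this example.

```python
import math


def fit_arx(x, y):
    """Least-squares fit of y[t] = a*y[t-1] + b*x[t] (assumed orders n=1, m=0)."""
    # Accumulate the entries of the 2x2 normal equations.
    s_pp = s_px = s_xx = s_py = s_xy = 0.0
    for t in range(1, len(y)):
        prev, inp, target = y[t - 1], x[t], y[t]
        s_pp += prev * prev
        s_px += prev * inp
        s_xx += inp * inp
        s_py += prev * target
        s_xy += inp * target
    det = s_pp * s_xx - s_px * s_px
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; cannot fit the model")
    # Solve the 2x2 system with Cramer's rule.
    a = (s_py * s_xx - s_px * s_xy) / det
    b = (s_pp * s_xy - s_px * s_py) / det
    return a, b


def fitness(x, y, a, b):
    """Assumed score: 1 - sqrt(squared one-step error / variance of y)."""
    actual = y[1:]
    predicted = [a * y[t - 1] + b * x[t] for t in range(1, len(y))]
    mean = sum(actual) / len(actual)
    err = sum((o - p) ** 2 for o, p in zip(actual, predicted))
    var = sum((o - mean) ** 2 for o in actual)
    return 1.0 - math.sqrt(err / var) if var > 0 else 0.0


def is_invariant(x, y, window=50, threshold=0.9):
    """Fit on the first window, then require a good fit on every later window."""
    a, b = fit_arx(x[:window], y[:window])
    for start in range(window, len(y) - window + 1, window):
        xs, ys = x[start:start + window], y[start:start + window]
        if fitness(xs, ys, a, b) < threshold:
            return False
    return True


# Synthetic pair: y follows x through a fixed ARX relationship.
x = [10 + 5 * math.sin(0.3 * t) for t in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(0.5 * y[t - 1] + 2.0 * x[t])

# Same pair, but the relationship breaks halfway through.
y_broken = y[:100] + [float((t * 37) % 23) for t in range(100, 200)]

print(is_invariant(x, y))
print(is_invariant(x, y_broken))
```

Running this check over every pair of measurements is the kind of exhaustive search the abstract describes as too slow for thousands of flow intensities, which is the motivation for pruning.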
Original language | English (US) |
---|---|
Article number | 6729596 |
Pages (from-to) | 1049-1054 |
Number of pages | 6 |
Journal | Proceedings - IEEE International Conference on Data Mining, ICDM |
State | Published - 2013 |
Externally published | Yes |
Event | 13th IEEE International Conference on Data Mining, ICDM 2013 - Dallas, TX, United States; Duration: Dec 7 2013 → Dec 10 2013 |
Keywords
- ARX Model
- AutoRegressive Model
- Efficient Search
- Invariant
ASJC Scopus subject areas
- General Engineering