Harnessing Big Data and Machine Learning for Event Detection and Localization
Loading...
Authors
Yazdi, Amirhessam
Issue Date
2022
Type
Dissertation
Language
Keywords
Anomalous event prediction , Computer Vision , Deep learning , Storage , Tensor decomposition , Wildfire
Alternative Title
Abstract
Anomalous events are rare and significantly deviate from expected pattern and other data instances, making them hard to predict. Correctly and timely detecting anomalous severe events can help reduce risks and losses. Many anomalous event detection techniques are studied in the literature. Recently, big data and machine learning based techniques have shown a remarkable success in a wide range of fields. It is important to tailor big data and machine learning based techniques for each application; otherwise it may result in expensive computation, slow prediction, false alarms, and improper prediction granularity.First, we aim to address the above challenges by harnessing big data and machine learning techniques for fast and reliable prediction and localization of severe events. Firstly, to improve storage failure prediction, we develop a new lightweight and high performing tensor decomposition-based method, named SEFEE, for storage error forecasting in large-scale enterprise storage systems. SEFEE employs tensor decomposition technique to capture latent spatio-temporal information embedded in storage event logs. By utilizing the latent spatio-temporal information, we can make accurate storage error forecasting without training requirements of typical machine learning techniques. The training-free method allows for live prediction of storage errors and their locations in the storage system based on previous observations that had been used in tensor decomposition pipeline to extract meaningful latent correlations. Moreover, we propose an extension to include severity of the errors as contextual information to improve the accuracy of tensor decomposition which in turn improves the prediction accuracy. We further provide detailed characterization of NetApp dataset to provide additional insight into the dynamics of typical large-scale enterprise storage systems for the community.Next, we focus on another application -- AI-driven Wildfire prediction. Wildfires cause billions of dollars in property damages and loss of lives, with harmful health threats. We aim to correctly detect and localize wildfire events in the early stage and also classify wildfire smoke based on perceived pixel density of camera images. Due to the lack of publicly available dataset for early wildfire smoke detection, we first collect and process images from the AlertWildfire camera network. The images are annotated with bounding boxes and densities for deep learning methods to use. We then adapt a transformer-based end-to-end object detection model for wildfire detection using our dataset. The dataset and detection model together form as a benchmark named the Nevada smoke detection benchmark, or Nemo for short. Nemo is the first open-source benchmark for wildfire smoke detection with the focus of the early incipient stage. We further provide a weakly supervised Nemo version to enable wider support as a benchmark.
Description
Citation
Publisher
License
Creative Commons Attribution 4.0 United States