Deployed AI Data Monitoring Systems in Critical Settings

A. Joshi
Johns Hopkins University, Maryland, United States

Keywords: Surveillance, Anomaly, Big Data, AI System

This research builds systems that identify unknown threats in real time from large volumes of streaming data with real-world statistical properties. As the volume of real-time data streams increases across critical domains such as public health, infrastructure, and cybersecurity, the complexity of finding unexpected data, which often corresponds to system failures, increases exponentially while human expert and intelligence resources remain static. This work proposes scalable approaches for monitoring real-time data that can be high-variance, drifting, or context-dependent. These approaches leverage human experts' expectations of the data by moving away from an alerting paradigm, which can generate hundreds of thousands of incomparable alerts at scale, to a ranking paradigm, in which threats are prioritized by their statistical properties using extreme value theory. The system has already been deployed in a national public health surveillance setting, where it processes over 5,000,000 data points daily as part of a multi-year interdisciplinary collaboration. It has increased human expert efficiency in analyzing unexpected data by over 53x compared with state-of-the-art manual human review. Beyond human-AI data review at scale, these methods are currently being investigated for their ability to identify unexpected hallucinations in foundation models and for use in security applications.
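
The ranking paradigm described above can be illustrated with a minimal sketch. It is not the deployed system, whose implementation the abstract does not specify. It assumes a peaks-over-threshold approach: each stream's upper tail is fitted with a generalized Pareto distribution (here by the method of moments), and the latest observation of each stream is scored by its estimated tail probability, so that streams with very different scales can be ranked on one axis. All names (`fit_gpd_moments`, `tail_probability`, `rank_streams`) and the 0.9 threshold quantile are hypothetical choices made for this example.

```python
import math
import random
import statistics


def fit_gpd_moments(exceedances):
    """Method-of-moments fit of a generalized Pareto distribution.

    Returns (shape, scale) for positive exceedances over a threshold.
    """
    mean = statistics.fmean(exceedances)
    var = statistics.variance(exceedances)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale


def tail_probability(history, value, quantile=0.9):
    """Estimate P(X >= value) for one stream from its own history."""
    ordered = sorted(history)
    threshold = ordered[int(quantile * len(ordered))]
    exceedances = [x - threshold for x in ordered if x > threshold]
    # Below the threshold, or with too little tail data, fall back to a
    # smoothed empirical estimate instead of the fitted tail.
    too_few = len(exceedances) < 2 or statistics.variance(exceedances) == 0.0
    if value <= threshold or too_few:
        count = sum(x >= value for x in history)
        return (count + 1) / (len(history) + 1)
    shape, scale = fit_gpd_moments(exceedances)
    excess = value - threshold
    rate = len(exceedances) / len(history)  # P(X > threshold)
    if abs(shape) < 1e-9:
        return rate * math.exp(-excess / scale)  # exponential tail limit
    base = 1.0 + shape * excess / scale
    if base <= 0.0:
        return 0.0  # beyond the fitted upper endpoint (negative shape)
    return rate * base ** (-1.0 / shape)


def rank_streams(streams):
    """Rank streams from most to least unexpected.

    `streams` maps a stream name to (history, latest_value).
    """
    scored = [
        (name, tail_probability(history, latest))
        for name, (history, latest) in streams.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Two synthetic streams on very different scales (illustrative data only).
rng = random.Random(0)
streams = {
    "A": ([rng.gauss(100.0, 10.0) for _ in range(2000)], 105.0),
    "B": ([rng.gauss(1.0, 0.1) for _ in range(2000)], 2.0),
}
ranking = rank_streams(streams)
```

In this synthetic example, stream A has the larger raw value but an ordinary observation, while stream B's latest value lies far in its own tail, so ranking by tail probability places B first. A single ordered list of this kind is what lets a reviewer work from the top down rather than triage incomparable per-stream alerts.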