AI-driven data pipelines for real-time offloading and analytics applications

A. Verma, S. Kumara
The Pennsylvania State University, Pennsylvania, United States

Poster stand number: T108

Keywords: Data pipeline, real-time, offloading, edge-cloud, machine learning, big data

Sensor-based big data increasingly offers real-time visibility into a variety of industrial machines and embedded systems. However, the volume, velocity, and variety of this data will soon overwhelm existing approaches to building the engineering pipelines that collect, transmit, store, and compute on it. In this work, we present a novel data pipeline for inferring machine operational status and performing a variety of downstream tasks using far less data, in near real time. Most deployments of such data pipelines involve remote use cases, which makes them compute- and power-constrained and requires the edge tasks to be as lightweight as possible. These environments also have poor connectivity or limited bandwidth, which motivates data-efficient digital twins for assets such as offshore wind turbines or gas turbines operating at 35,000 ft. Some implementations of our technology allow massive compute offloading to the cloud, which is critical for real-time edge applications such as drones and UAVs. Another use case is the ‘DARPA Ocean of Things’ vision, in which distributed sensors improve local information gathering, with application areas of interest to commercial, military, and academic users. Other applications include ocean awareness, remote environmental monitoring, and space-based monitoring.