Acceleration of Hyper-Scale Deep Neural Network Inference by Massive Synaptic Networks Available in 3D NAND Flash Memory Chips

S. Yi
Texas A&M University, Texas, United States

Keywords: Deep Learning, Artificial Intelligence, In-Memory-Computing, Vector-Matrix-Multiplication, 3D NAND flash memory

Although resistive random access memory (RRAM)-based hardware has emerged as an in-memory-computing accelerator, many challenges remain, such as immature processes for scaling RRAM arrays up to trillions of synapses. Here, we propose to use NAND flash memory chips, with hundreds of billions of synapses readily available on the market, so that full hardware demonstrations become possible for hyper-scale AI architectures such as VGG-16 (about 15 billion operations per inference) or GPT-3 (175 billion parameters). Based on experimental evaluation and benchmarking of the first flash-based computing prototype, further modifications to the core and peripheral circuitry will follow to push the figure of merit far beyond that of digital accelerators. The Intelligent Surveillance Augmentation System (ISAS), the final deliverable, will incorporate vision sensors such as night vision goggles to enable infrared image inference (object detection/alarm) within a given energy budget (e.g., four AA batteries of 1.5 V/1600 mAh). If successful, ISAS will support individual Marines after nighttime amphibious or submarine operations. The proposed technology will better protect our warfighters from enemies and other dangers by maximizing alertness while minimizing cognitive burden during special missions.
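The core operation the proposed accelerator performs in the analog domain is vector-matrix multiplication: word-line voltages encode input activations, each flash cell's conductance stores one synaptic weight, and the current summed on each bit line yields a dot product via Kirchhoff's current law. The sketch below is an illustrative numerical model of that idea (not taken from the abstract); the array sizes, the 16-level conductance quantization, and all variable names are hypothetical assumptions.

```python
import numpy as np

# Illustrative model of in-memory vector-matrix multiplication (VMM):
# word-line voltages = input activations, cell conductances = synaptic
# weights, and each bit-line current = one weighted sum (dot product).
rng = np.random.default_rng(0)

inputs = rng.random(64)          # 64 word-line voltages (activations)
weights = rng.random((64, 32))   # 64x32 cell conductances (synapses)

# Ideal analog multiply-accumulate: I_j = sum_i V_i * G_ij
bitline_currents = inputs @ weights

# A real flash array stores only a few conductance levels per cell;
# 16 levels (4 bits/cell) is an assumed example, not a measured value.
levels = 16
quantized = np.round(weights * (levels - 1)) / (levels - 1)
approx_currents = inputs @ quantized

# The quantized array approximates the ideal dot products closely.
error = np.max(np.abs(bitline_currents - approx_currents))
print(bitline_currents.shape, error)
```

Under this toy model, the per-weight quantization error is bounded by half a conductance level, so the bit-line error grows only with the sum of the input activations, which is why multi-level cells can still support useful inference accuracy.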