CarML: Reproducible Deep Learning Model Evaluation and Management

A. Dakkak
University of Illinois - Urbana-Champaign, United States

Keywords: AI, Understanding

The current landscape of Machine Learning (ML) and Deep Learning (DL) is rife with non-uniform models, frameworks, and system stacks, and it lacks standard tools and methodologies to evaluate and profile models or systems. In the absence of such tools, the state of the practice for evaluating and comparing the benefits of proposed AI innovations (be they hardware or software) on end-to-end AI pipelines is both arduous and error-prone, stifling the adoption of these innovations in a rapidly moving field. The goal of this presentation is to discuss these challenges and the solutions that help address issues arising from evaluating ML models. The presentation will educate the audience on both evaluation scenarios and hardware metrics (such as different evaluation load behaviors, power efficiency, and utilization) that should be captured by benchmarking. It will also introduce attendees to state-of-the-art tools and best practices developed at the IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR), some of which have won Best Research Paper Awards at well-known international conferences. Finally, the presentation will draw on ideas and tools used by experts from both industry and academia to discuss how these tools and methodologies can be leveraged for: