Tuesday, November 15, 2022
09:00 AM - 10:00 AM
CNLS Conference Room (TA-3, Bldg 1690) & Webex

Seminar

Hardware-aware Deep Neural Network Inference Partitioning in Embedded Systems

Fabian Kress
Karlsruhe Institute of Technology

For many years, Deep Neural Networks (DNNs) have been a major research focus in several areas, owing to the high accuracy they achieve and the wide range of use cases in which they can be deployed. For instance, DNNs are used in the Belle II experiment to classify particle tracks in its track-trigger system. Deploying DNNs in embedded systems imposes new challenges due to additional performance and energy constraints in near-sensor compute platforms. In such systems, processing data only in the central compute node is disadvantageous: transmitting raw data from sensors such as cameras requires a large bandwidth, and running DNN inference for multiple tasks demands substantial compute performance. Offloading DNN workload to the near-sensor nodes can therefore reduce traffic on the interconnect and improve overall system performance.

In this talk, I will present a simulation toolchain, developed at KIT, for evaluating hardware-aware DNN inference partitioning in embedded AI applications. The framework explores efficient workload distribution of DNNs between near-sensor nodes and a central compute node. For each reasonable partitioning point of a DNN, the toolchain evaluates energy and performance metrics, taking specialized hardware accelerators in the near-sensor node into account.
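The basic idea of evaluating partitioning points can be illustrated with a small sketch. The Python snippet below is not the KIT toolchain; it merely enumerates candidate cut points of a hypothetical layer-sequential DNN and estimates, for each cut, the data that must cross the interconnect and the compute load on each side. All layer shapes, throughputs, and bandwidth figures are invented for illustration.

```python
# Minimal illustrative sketch (not the KIT toolchain): enumerate candidate
# partition points of a layer-sequential DNN and estimate, for each cut,
# the data crossing the interconnect and the work on each side.
# All layer shapes, costs, and link parameters below are hypothetical.

from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    macs: float      # multiply-accumulate operations in this layer
    out_elems: int   # elements in the layer's output feature map

# Hypothetical small CNN: spatial size shrinks while channel count grows.
layers = [
    Layer("conv1", macs=2.0e8, out_elems=64 * 112 * 112),
    Layer("conv2", macs=4.0e8, out_elems=128 * 56 * 56),
    Layer("conv3", macs=4.0e8, out_elems=256 * 28 * 28),
    Layer("fc",    macs=5.0e7, out_elems=1000),
]

LINK_BYTES_PER_S   = 100e6   # assumed interconnect bandwidth (bytes/s)
NEAR_MACS_PER_S    = 1.0e11  # assumed near-sensor accelerator throughput
CENTRAL_MACS_PER_S = 1.0e12  # assumed central compute node throughput
BYTES_PER_ELEM     = 1       # assume 8-bit quantized activations

def estimate_latency(cut: int) -> float:
    """Latency estimate when layers[:cut] run near the sensor and
    layers[cut:] run on the central node (cut == 0 sends raw input)."""
    near_macs = sum(l.macs for l in layers[:cut])
    central_macs = sum(l.macs for l in layers[cut:])
    # Data crossing the link: output of the last near-sensor layer,
    # or the raw sensor frame if nothing is offloaded.
    tx_elems = layers[cut - 1].out_elems if cut > 0 else 3 * 224 * 224
    return (near_macs / NEAR_MACS_PER_S
            + tx_elems * BYTES_PER_ELEM / LINK_BYTES_PER_S
            + central_macs / CENTRAL_MACS_PER_S)

for cut in range(len(layers) + 1):
    label = "raw input" if cut == 0 else f"after {layers[cut - 1].name}"
    print(f"cut {label:>12}: est. latency {estimate_latency(cut) * 1e3:.2f} ms")
```

In a real toolchain the per-cut cost model would be replaced by hardware-aware energy and performance figures, including the behavior of specialized accelerators in the near-sensor node, but the enumeration of partitioning points follows the same pattern.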