Center for Nonlinear Studies

Tuesday, November 15, 2022
09:00 AM - 10:00 AM
CNLS Conference Room (TA-3, Bldg 1690) & Webex

Seminar

Hardware-aware Deep Neural Network Inference Partitioning in Embedded Systems

Fabian Kress
Karlsruhe Institute of Technology

For many years, Deep Neural Networks (DNNs) have been a major research focus across several fields, owing to their high accuracy and their applicability to many different use cases. For instance, DNNs are used in the Belle II experiment to classify particle tracks in its track-trigger system. Deploying DNNs in embedded systems poses new challenges due to additional performance and energy constraints on near-sensor compute platforms. In such systems, processing all data in the central compute node is disadvantageous: transmitting raw data from sensors such as cameras requires large bandwidth, and DNN inference for multiple tasks demands substantial compute performance. Offloading DNN workload to the near-sensor nodes can therefore reduce traffic on the interconnect and improve overall system performance.

In this talk, I will present a simulation toolchain developed at KIT for evaluating hardware-aware DNN inference partitioning in embedded AI applications. The framework explores efficient workload distribution between near-sensor nodes and a central compute node. For each reasonable partitioning point of a DNN, the toolchain evaluates energy and performance metrics, taking specialized hardware accelerators in the near-sensor node into account.
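The core idea of partition-point evaluation can be illustrated with a minimal sketch: enumerate every split point in a layer sequence, and for each split estimate the near-sensor compute time, the data transferred over the interconnect, and the central-node compute time. All layer sizes, MAC counts, and hardware parameters below are assumed for illustration and are not taken from the KIT toolchain.

```python
# Minimal sketch of hardware-aware DNN partitioning evaluation.
# All layer shapes, per-layer compute costs, and platform parameters
# are illustrative assumptions, not values from the actual toolchain.

from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    macs: float        # multiply-accumulate operations in this layer
    out_bytes: float   # size of the layer's output feature map

# Toy DNN: per-layer MAC counts and output sizes (assumed values).
layers = [
    Layer("conv1", 50e6, 400_000),
    Layer("conv2", 80e6, 200_000),
    Layer("conv3", 60e6, 100_000),
    Layer("fc",    10e6,   4_000),
]

# Assumed platform parameters.
NEAR_SENSOR_MACS_PER_S = 10e9       # near-sensor accelerator throughput
CENTRAL_MACS_PER_S     = 100e9      # central compute node throughput
LINK_BYTES_PER_S       = 50e6       # interconnect bandwidth
RAW_INPUT_BYTES        = 1_000_000  # raw sensor frame size

def latency(split: int) -> float:
    """Latency if layers[:split] run near the sensor and the rest centrally.

    split = 0 means raw sensor data is sent to the central node.
    """
    near = sum(l.macs for l in layers[:split]) / NEAR_SENSOR_MACS_PER_S
    central = sum(l.macs for l in layers[split:]) / CENTRAL_MACS_PER_S
    tx_bytes = layers[split - 1].out_bytes if split > 0 else RAW_INPUT_BYTES
    return near + tx_bytes / LINK_BYTES_PER_S + central

# Evaluate every partitioning point and pick the fastest one.
for s in range(len(layers) + 1):
    print(f"split after {s} layers: {latency(s) * 1e3:.2f} ms")
best = min(range(len(layers) + 1), key=latency)
print("best split:", best)
```

With these assumed numbers, splitting after the first convolution wins: its output is already much smaller than the raw frame, so shipping it over the slow interconnect costs less than sending raw data, while most of the compute still lands on the faster central node. A full toolchain would additionally model energy and accelerator-specific costs per layer.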