Theses and Dissertations
Date of Award
5-2024
Document Type
Dissertation
Degree Name
Ph.D.
Department
Systems Engineering
Committee Chair
Dr. Na Gong
Advisor(s)
Dr. Mohamed Shaban, Dr. Silas Leavesley, Dr. Kari Lippert
Abstract
Deep Neural Networks (DNNs) have become a core component in many state-of-the-art computer systems. It was only in 2005 that Intel released its first commercial CPU to offer dual-core processing, whereas, in 2020, Nvidia released a single GPU with 6912 cores to meet the demand for DNN applications. As demand for computational performance increases, the underlying computational architecture needs to evolve to keep pace. At the same time, DNN models grow in computational complexity as new algorithms are discovered to expand deep learning capabilities. DNN algorithms use large datasets, which place a heavy demand on hardware memory management. Reducing the power consumed by DNN applications therefore has a broad impact on the many systems that run DNN algorithms, both in the cloud and on the edge.
This research presents two approaches to optimizing the power efficiency of DNNs. From the software design perspective, Aim #1 demonstrates a scalable DNN framework for classifying hyperspectral imaging (HSI) data of lesional tissue. Grad-CAM heatmaps are also studied to provide interpretability into the DNN decision-making process. These heatmaps visually highlight the specific regions of an image that indicate signs of a lesion. Additionally, the DNN architecture scales with image complexity, using Principal Component Analysis (PCA) to reduce the number of dimensions in both the data and the DNN architecture. The framework provides a range of possible configurations for trading off hardware requirements against DNN accuracy.
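As an illustration of the PCA step described above, the following is a minimal sketch of reducing the spectral dimension of a hyperspectral cube before it is passed to a classifier. It is not the dissertation's implementation: the function name `pca_reduce_bands`, the cube shape, the number of retained components, and the synthetic random data are all assumptions chosen for the example.

```python
import numpy as np


def pca_reduce_bands(cube, n_components):
    """Project the spectral axis of an (H, W, bands) cube onto its top principal components.

    Hypothetical helper for illustration; returns the reduced cube and the
    fraction of total variance retained by the kept components.
    """
    h, w, bands = cube.shape
    # Treat every pixel as one sample whose features are its spectral bands.
    pixels = cube.reshape(-1, bands).astype(np.float64)
    pixels -= pixels.mean(axis=0)  # center each band

    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(pixels, full_matrices=False)
    components = vt[:n_components]

    reduced = pixels @ components.T
    explained = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    return reduced.reshape(h, w, n_components), explained


# Synthetic stand-in for an HSI cube (assumed size: 16 x 16 pixels, 30 bands).
rng = np.random.default_rng(0)
cube = rng.random((16, 16, 30))

reduced, explained = pca_reduce_bands(cube, n_components=5)
print(reduced.shape, round(explained, 3))
```

Sweeping `n_components` in a sketch like this gives one axis of the configuration range mentioned above: fewer components mean a smaller network input and lower memory demand, at the cost of discarded spectral variance.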
From the hardware design perspective, Aim #2 presents a bit-truncation memory based on static random-access memory (SRAM) to support DNN processing. It exploits the over-precise data structures that hold DNN parameters, trading precision for power efficiency. The developed memory can adapt the number of truncated bits and set the optimal truncation values to meet the quality requirements of different DNN applications while enabling significant power savings. The memory structure was validated using standard AlexNet and VGG-16 models, as well as a pruned lightweight VGG-16 model. The architecture also supports truncation for video streaming applications, making this SRAM architecture useful for both DNN processing and video streaming.
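The idea of truncating bits and choosing a truncation value can be sketched in software. The example below is only a behavioral illustration, not the SRAM design itself: it assumes unsigned 8-bit words, replaces the truncated least significant bits with a fixed fill pattern, and searches for the fill pattern that minimizes mean absolute error over a set of words. The names `truncate_bits` and `mean_abs_error`, the word width, and the uniform test data are assumptions.

```python
def truncate_bits(value, n_trunc, fill=0):
    """Replace the n_trunc least significant bits of a non-negative integer with `fill`."""
    low_mask = (1 << n_trunc) - 1
    if value < 0 or not 0 <= fill <= low_mask:
        raise ValueError("value must be non-negative and fill must fit in n_trunc bits")
    # Clear the low bits, then substitute the fixed fill pattern.
    return (value & ~low_mask) | fill


def mean_abs_error(words, n_trunc, fill):
    """Average absolute difference between each word and its truncated version."""
    return sum(abs(w - truncate_bits(w, n_trunc, fill)) for w in words) / len(words)


# Assumed data: every 8-bit value once, i.e. uniformly distributed low bits.
words = list(range(256))
n_trunc = 3

# Exhaustively pick the fill pattern with the lowest mean absolute error.
best_fill = min(range(1 << n_trunc), key=lambda f: mean_abs_error(words, n_trunc, f))

print(bin(truncate_bits(0b10110111, n_trunc)))             # low bits zeroed
print(bin(truncate_bits(0b10110111, n_trunc, best_fill)))  # low bits set to best fill
print(mean_abs_error(words, n_trunc, 0), mean_abs_error(words, n_trunc, best_fill))
```

For uniformly distributed data, a mid-range fill pattern gives a lower average error than zeroing the truncated bits, which suggests why selecting the truncation value per application can matter; real DNN parameters are not uniformly distributed, so the best value would depend on the model.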
Finally, Aim #3 merges the scalable software framework from Aim #1 with the scalable hardware architecture of Aim #2. The two systems are integrated to achieve power optimization at both the hardware and software levels. The interactions between the two systems are studied to identify optimal software and hardware configurations for the specific HSI classification task.
Recommended Citation
Oswald, William D., "Software and Hardware Co-Design for Deep Learning Power Optimization" (2024). Theses and Dissertations. 183.
https://jagworks.southalabama.edu/theses_diss/183
Included in
Bioimaging and Biomedical Optics Commons, Computer and Systems Architecture Commons, Hardware Systems Commons, Other Biomedical Engineering and Bioengineering Commons, Other Computer Engineering Commons, Other Electrical and Computer Engineering Commons