Date of Award

5-2024

Document Type

Dissertation

Degree Name

Ph.D.

Department

Systems Engineering

Committee Chair

Dr. Na Gong

Advisor(s)

Dr. Mohamed Shaban, Dr. Silas Leavesley, Dr. Kari Lippert

Abstract

Deep neural networks (DNNs) have become a core component of many state-of-the-art computer systems. Intel released its first commercial dual-core CPU only in 2005, whereas in 2020 Nvidia released a single GPU with 6912 cores to meet the demand for DNN applications. As demand for computational performance increases, the underlying computational architecture must evolve to keep pace. At the same time, DNN models grow in computational complexity as new algorithms are discovered to expand SL capabilities. DNN algorithms use large datasets, which place heavy demands on hardware memory management. Reducing the power consumed by DNN applications therefore has a broad impact on the many systems that run DNN algorithms, both in the cloud and on the edge.

This research presents two approaches to optimizing the power efficiency of DNNs. From the software design perspective, Aim #1 demonstrates a scalable DNN framework for classifying hyperspectral imaging (HSI) images of lesional tissue. Grad-CAM heatmaps are also studied to provide interpretability into the DNN decision-making process. These heatmaps visually highlight the specific regions of an image that indicate signs of a lesion. Additionally, the DNN architecture scales with image complexity, using Principal Component Analysis (PCA) to reduce the dimensionality of both the data and the DNN architecture. The framework provides a range of possible configurations for trading off hardware requirements against DNN accuracy.
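The PCA step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `pca_reduce_bands`, the cube shape, and the number of retained components are assumptions chosen only for illustration, and the input is random data rather than real HSI imagery.

```python
import numpy as np


def pca_reduce_bands(cube, n_components):
    """Project an HSI cube (height, width, bands) onto its top principal components.

    Hypothetical helper for illustration; every pixel is treated as one
    sample whose features are its spectral bands.
    """
    height, width, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)

    # PCA requires zero-mean features, so center each band.
    centered = pixels - pixels.mean(axis=0)

    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

    projected = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected.reshape(height, width, n_components), explained[:n_components]


# Synthetic stand-in for an HSI image: 32 x 32 pixels, 100 spectral bands.
rng = np.random.default_rng(0)
cube = rng.random((32, 32, 100))
reduced, explained = pca_reduce_bands(cube, n_components=8)
print(reduced.shape)  # (32, 32, 8)
```

Fewer retained components mean a smaller network input and lower memory traffic, at the cost of discarded spectral variance, which is the kind of hardware/accuracy trade-off the framework is meant to expose.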

From the hardware design perspective, Aim #2 presents a bit-truncation memory based on static random-access memory (SRAM) to support DNN processing. It exploits the over-precise data structures that hold DNN parameters, trading precision for power efficiency. The developed memory can adapt the number of truncated bits and set the optimal truncation values to meet the quality requirements of different DNN applications while enabling significant power savings. The memory structure was validated using standard AlexNet and VGG-16 models, as well as a pruned lightweight VGG-16 model. The architecture also supports truncation for video streaming applications, so the SRAM architecture is useful for both DNN processing and video streaming.
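To make the idea of bit truncation concrete, the sketch below emulates it in software on 8-bit values. It is only an assumption-laden illustration, not the SRAM design itself: the function `truncate_lsbs`, the 8-bit word width, and the choice of fill value are hypothetical.

```python
import numpy as np


def truncate_lsbs(values, n_bits, fill=0):
    """Emulate dropping the n_bits least significant bits of 8-bit words.

    The dropped bits are replaced by a fixed `fill` pattern, standing in
    for a configurable truncation value. Hypothetical helper, not the
    hardware mechanism described in the abstract.
    """
    if not 0 <= n_bits <= 8:
        raise ValueError("n_bits must be between 0 and 8")
    if not 0 <= fill < (1 << n_bits):
        raise ValueError("fill must fit in the truncated bits")

    # Mask that keeps only the upper (8 - n_bits) bits of each word.
    mask = np.uint8((0xFF << n_bits) & 0xFF)
    return (values & mask) | np.uint8(fill)


weights = np.array([0b10110111, 0b00001111, 0b11111111], dtype=np.uint8)

# Drop the lower 4 bits and substitute the mid-point pattern 0b1000.
approx = truncate_lsbs(weights, n_bits=4, fill=0b1000)
print(approx.tolist())  # [184, 8, 248]
```

With a mid-point fill, the worst-case error per 8-bit word in this emulation is 8 for a 4-bit truncation, compared with 15 when the dropped bits are simply zeroed.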

Finally, Aim #3 merges the scalable software framework from Aim #1 with the scalable hardware architecture from Aim #2. The two systems are integrated to optimize power at both the hardware and software levels. The interactions between the two systems are studied to identify optimal software and hardware configurations for the specific HSI classification task.

Recommended Citation

Oswald, William D., "Software and Hardware Co-Design for Deep Learning Power Optimization" (2024). Graduate Theses and Dissertations (2019 - present). 183.
https://jagworks.southalabama.edu/theses_diss/183

Download

Available for download on Friday, April 07, 2028

Included in

Bioimaging and Biomedical Optics Commons, Computer and Systems Architecture Commons, Hardware Systems Commons, Other Biomedical Engineering and Bioengineering Commons, Other Computer Engineering Commons, Other Electrical and Computer Engineering Commons
