Shelby Hall Graduate Research Forum Posters

Files

Available for download on Monday, June 03, 2030

Download Full Text (1.1 MB)

Description

Artificial intelligence-based applications, computation-intensive data processing, and image processing all demand effective memory design to function well. This work introduces a truncation-capable memory design for 16-bit or 32-bit systems that reduces the number of active bits to improve power efficiency. We use power gating to turn off the power to targeted bits for specific applications. We present a novel bit-truncation memory with full truncation flexibility, which can truncate any number of data bits at run time to meet the different quality and power-savings trade-off requirements of different applications. The design can automatically set the truncated bits to the optimal values for both videos and DNNs, thereby optimizing quality while realizing maximum power savings. The proposed hardware is suitable for deep learning inference and image/video processing workloads because its run-time truncation capability maximizes the efficiency vs. quality trade-off. The proposed hardware architecture is also compatible with quantization methods such as float16 quantization, as it reduces the model size without much change in computational performance. AI workloads can be further optimized by combining truncation with model pruning techniques that eliminate redundancies in the neural network without affecting accuracy. The multiple truncation modes of this memory can offer improved performance on a range of hardware platforms, such as embedded systems and AI accelerators. As future work, we can test this hardware on actual silicon chips to see how well it functions in practice with state-of-the-art quantization and pruning techniques.
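The run-time truncation described above can be illustrated with a small software model. This is a minimal sketch and not the poster's hardware design: the names `truncate_word` and `mean_abs_error` are hypothetical, and the choice of filling the truncated bits with the midpoint pattern (a 1 followed by zeros) is an assumption made here for illustration, since the abstract does not state which fill values the design treats as optimal.

```python
from typing import Optional


def truncate_word(value: int, n_trunc: int, width: int = 16,
                  fill: Optional[int] = None) -> int:
    """Model a stored word whose n_trunc least-significant bits are power-gated.

    The gated bits no longer hold data, so on read they are replaced by a
    fixed `fill` pattern. The default fill (midpoint of the truncated range)
    is an assumption of this sketch, not a value taken from the poster.
    """
    if not 0 <= n_trunc <= width:
        raise ValueError("n_trunc must be between 0 and width")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    low_mask = (1 << n_trunc) - 1          # bits that are truncated
    if fill is None:
        fill = (1 << (n_trunc - 1)) if n_trunc > 0 else 0
    if not 0 <= fill <= low_mask:
        raise ValueError("fill must fit in the truncated bits")
    full_mask = (1 << width) - 1
    return (value & ~low_mask & full_mask) | fill


def mean_abs_error(n_trunc: int, width: int = 16,
                   fill: Optional[int] = None) -> float:
    """Average absolute read error over every possible stored value."""
    total = sum(abs(v - truncate_word(v, n_trunc, width, fill))
                for v in range(1 << width))
    return total / (1 << width)


if __name__ == "__main__":
    print(hex(truncate_word(0xABCD, 4)))          # 0xabc8
    print(hex(truncate_word(0xABCD, 4, fill=0)))  # 0xabc0
    print(mean_abs_error(4, fill=0))              # 7.5
    print(mean_abs_error(4))                      # 4.0
```

In this model, truncating 4 bits of a 16-bit word and filling them with zeros gives a mean absolute error of 7.5 over uniformly distributed values, while the midpoint fill lowers it to 4.0. That difference shows why the value placed in the truncated bits affects quality at a fixed truncation depth.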

Publication Date

3-2025

Department

Systems Engineering

City

Mobile

Disciplines

Computer and Systems Architecture | Digital Circuits | Hardware Systems | Operational Research | Other Operations Research, Systems Engineering and Industrial Engineering | Systems Engineering

Flexible Bit-Truncation Memory for Low-Power Quality-Adaptive Video and Deep Learning Storage
