Towards Robust Deep Learning on GPUs
This is the project home page of NSF Collaborative Research: SHF: Small: Towards Robust Deep Learning Computing on GPUs (CU Boulder site, WVU site)
Graduate Students
Mujahid Al Rafi (UC Merced)
Yuan Feng (UC Merced)
Ange Thierry Ishimwe (CU Boulder)
Banafsheh Adami (WVU)
Ehsan Bahaloo Horeh (WVU)
Undergraduate Students
Aishwaria Rangasamy (UC Merced) - graduated Sp'23
Xavier Ybarra (UC Merced) - graduated Sp'23
Alexander Juenemann (CU Boulder)
Research Scientist
Ryan Zalek (NVIDIA)
Goals and Achievements
Graphics processing units (GPUs) have become one of the most promising computing engines in many application domains such as scientific simulations and deep learning. With the massive parallel processing power provided by GPUs, most state-of-the-art server and edge systems employ GPUs as the core computing engines for deep-learning model training and inference. As the performance of deep-learning models becomes one of the most important factors determining both the market revenue of model creators and the daily convenience of model consumers, it is critical to enforce reliable and robust deep-learning computation. This project aims to explore the challenges and opportunities in addressing the reliability and privacy implications of GPU computing as a deep-learning accelerator, and to design lightweight protection schemes.
The technical aims of this project are divided into three thrusts.
1) Exploration of vulnerabilities and their impact on GPU-based deep-learning computing.
Mujahid Al Rafi, Yuan Feng, Fan Yao, Meng Tang, and Hyeran Jeon, "Decepticon: Understanding Vulnerabilities of Transformers," IEEE International Symposium on Workload Characterization (IISWC), Ghent, Belgium, Oct 2023
Luanzheng Guo, Jay Lofstead, Jie Ren, Ignacio Laguna, Gokcen Kestor, Line Pouchard, Dossay Oryspayev, and Hyeran Jeon, "Understanding System Resilience for Converged Computing of Cloud, Edge, and HPC," Workshop on Converged Computing to be co-located with ISC'23 (WOCC), Hamburg, Germany, May 2023
Yuan Feng and Hyeran Jeon, "Understanding Scalability of Multi-GPU Systems," ACM Workshop on General Purpose GPUs (GPGPU), Montreal, Canada, Feb 2023
Mujahid Al Rafi, Yuan Feng, and Hyeran Jeon, "Too Noisy To Extract: Pitfalls of Model Extraction Attacks," Workshop on Negative Results, Opportunities, Perspectives, and Experiences (NOPE), in conjunction with ASPLOS-27, Feb 2022
Mujahid Al Rafi, Yuan Feng, and Hyeran Jeon, "Revealing Secrets From Pre-trained Models," arXiv Preprint, July 2022
2) Tackling the vulnerabilities at the compute-unit level by redesigning GPU building blocks.
Mujahid Al Rafi and Hyeran Jeon, "Enabling robust GPU computing by exploiting resource underutilization" - work in progress
Ryan Zalek, Mujahid Al Rafi, and Hyeran Jeon, "Memory access obfuscation on GPUs" - work in progress
3) Designing selective integrity protection mechanisms without imposing significant performance overhead.
Ange Thierry Ishimwe and Tamara Silbergleit Lehman, “SMAD: Efficiently Defending Against Transient Execution Attacks,” Poster presentation at Young Architect (YArch) 2023
Alexander Juenemann and Tamara Silbergleit Lehman, “GPU Rowhammer Impact on Deep Learning Models,” Poster presentation at Workshop on Hardware and Architectural Support for Security and Privacy (HASP) 2023.
Ange Ishimwe, Phaedra Curlin, Alexander Juenemann, and Tamara Silbergleit Lehman, “SMAD: Efficiently Defending Against Transient Execution Attacks,” (under review) International Symposium on Hardware Oriented Security and Trust (HOST) 2024
Alexander Juenemann and Tamara Silbergleit Lehman, “Investigating Impact of Rowhammer Attacks on Deep Learning Models using GPU Simulators,” (work in progress, to be submitted for publication soon)
Educational Activities
UC Merced
A new course, EECS242 (Advanced Topics in Computer Architecture), was created in Fall 2022, covering state-of-the-art research on "Security and Reliability" and "GPU and Accelerators"
Topics on "Security and Reliability" and "GPU and Accelerators" were newly added to EECS253 (Computer Architecture and Design) in Fall 2021
With supplemental REU support, two undergraduate student interns worked on secure GPU computing for large language models in Summer 2022. The interns reviewed the relevant literature, examined the weight-value differences between pre-trained and fine-tuned versions of several Transformer models, and developed a software tool that verifies our proposed ideas.
A high-school summer intern was trained in Summer 2023. The intern studied the GPU architecture and programming model, tested several GPU applications, and analyzed the performance differences between CPU and GPU computing.
University of Colorado Boulder
An REU student and a PhD student have been recruited. The students are exploring the impact of rowhammer attacks on DNN computing.
"Secure Deep Learning Computing on GPUs - Analysis on Rowhammer Implementation," presented at CU Boulder SPUR Final Presentation Workshop.
West Virginia University
A new course, CPE 593B (Hardware Security and Trust), was offered to graduate students at WVU in Spring 2023. The course covered state-of-the-art hardware-based attacks (e.g., side channels) on GPUs and deep learning.