I am currently a graduate student at UC Merced, very grateful to be working with Prof. Hyeran Jeon. Prior to this, I received my M.S. degree in Software Engineering and my B.E. degree in Computer Science from Shandong University, where I was very fortunate to work with Prof. Lei Ju.
My research interests cover heterogeneous computer architecture and systems, high-performance/parallel computing (CUDA), and static and dynamic program analysis and optimization.
💻 Experiences
- 05/2023 - 11/2023, Research Intern, ByteDance, Seattle, WA / San Jose, CA, USA
- Mentors: Ziheng Jiang, Haibin Lin
- Analyzing and optimizing GPU memory wastage and fragmentation in large language model (LLM) training.
- 11/2022 - 02/2023, Part-time Software Engineering Intern, Uber, Sunnyvale, CA, USA
- Mentor: Milind Chabbi
- Detecting and fixing data races in Uber’s Golang code base.
- 06/2022 - 08/2022, Research Intern, PNNL, Richland, WA, USA
- Mentor: Ang Li
- Detecting floating-point overflow in GPU-accelerated applications.
📝 Publications
[ISCA’25] Mao Lin, Yuan Feng, Guilherme Cox and Hyeran Jeon. “Forest: Access-aware GPU UVM Management”. The 52nd Annual International Symposium on Computer Architecture (ISCA), June 21–25, 2025, Tokyo, Japan. (To appear)
[EuroMLSys’25] Mao Lin and Hyeran Jeon. “Understanding Oversubscribed Memory Management for Deep Learning Training”. The 5th Workshop on Machine Learning and Systems, March 30–April 3, 2025, Rotterdam, Netherlands. 🌐 💬 ❞
@inproceedings{lin2025understanding,
  author = {Lin, Mao and Jeon, Hyeran},
  title = {Understanding Oversubscribed Memory Management for Deep Learning Training},
  year = {2025},
  isbn = {9798400715389},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3721146.3721955},
  doi = {10.1145/3721146.3721955},
  booktitle = {Proceedings of the 5th Workshop on Machine Learning and Systems},
  pages = {46–55},
  numpages = {10},
  keywords = {GPU, unified virtual memory, performance analysis, DNN, LLM},
  location = {World Trade Center, Rotterdam, Netherlands},
  series = {EuroMLSys '25}
}
[ASPLOS’23] Mao Lin, Keren Zhou, and Pengfei Su. “DrGPUM: Guiding Memory Optimization for GPU-accelerated Applications”. The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Mar 25-29, 2023, Vancouver, BC, Canada. 🌐 🔗 💬 ❞
@inproceedings{lin2025drgpum,
  author = {Lin, Mao and Zhou, Keren and Su, Pengfei},
  title = {DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications},
  year = {2023},
  isbn = {9781450399180},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3582016.3582044},
  doi = {10.1145/3582016.3582044},
  booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3},
  pages = {164–178},
  numpages = {15},
  keywords = {CUDA, GPU profilers, GPUs, Memory management},
  location = {Vancouver, BC, Canada},
  series = {ASPLOS 2023}
}
[PyTorch Conference’22] Mao Lin, Keren Zhou, and Pengfei Su. “Poster: Squeezing GPU Memory Usage in PyTorch”. Dec. 2022, New Orleans, LA, USA. 🌐
[TCAD’22] Zelin Du, Qianling Zhang, Mao Lin, Shiqing Li, Xin Li, and Lei Ju. “A Comprehensive Memory Management Framework for CPU-FPGA Heterogenous SoCs”. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2022). 🌐 ❞
@ARTICLE{9790045,
  author={Du, Zelin and Zhang, Qianling and Lin, Mao and Li, Shiqing and Li, Xin and Ju, Lei},
  journal={IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
  title={A Comprehensive Memory Management Framework for CPU-FPGA Heterogenous SoCs},
  year={2023},
  volume={42},
  number={4},
  pages={1058-1071},
  doi={10.1109/TCAD.2022.3179323}
}
💬 Talks
- 03/2025, Presented the EuroMLSys’25 paper in Rotterdam, Netherlands
- 04/2023, Presented DrGPUM at the 1st UC Merced EECS Research Symposium
- 03/2023, Presented the ASPLOS’23 paper
- 12/2022, Presented the PyTorch Conference’22 poster in New Orleans, LA, USA
👔 Professional Services
- Artifact Evaluation Committee: PPoPP’23, ASPLOS’24, ISCA’25
- External Reviewer: GPGPU’25
📚 Teaching
- Teaching Assistant for Computer Architecture (CSE 140), 2024 Spring, 2025 Spring
- Teaching Assistant for Intro to Programming Laboratory Skills/Techniques (CSE 022), 2023 Spring
- Teaching Assistant for Data Structures (CSE 030), 2022 Spring
- Teaching Assistant for Advanced Programming (CSE 024), 2021 Fall, 2024 Fall
- Teaching Assistant for Intro to Object-Oriented Programming (CSE 165), 2021 Fall
🧩 Skills
- Languages: C/C++; Python; CUDA; Go; Java; Shell; HTML; CSS; JSON
- Platforms: Linux; CPU-GPU HMPSoCs; CPU-FPGA HMPSoCs
- Frameworks: PyTorch; TensorFlow; Darknet
- Toolchains: Nsight Systems; Linux perf; GDB; Git; Xilinx Vivado Suite