Mao Lin (林茂)

Graduate Student at UC Merced

Researching ML/LLM systems, heterogeneous computer architectures, high-performance computing, and program analysis and optimization.

[email protected] · Merced, CA, USA · UC Merced · linmao.org · github.com/Lin-Mao · Google Scholar · LinkedIn

Education

University of California, Merced 08/2021 – Present

Ph.D. in Electrical Engineering and Computer Science · Merced, CA, USA

Shandong University 09/2018 – 06/2021

Master of Software Engineering · Jinan, China

Shandong University 09/2014 – 06/2018

Bachelor of Computer Science · Jinan, China

Research Areas

GPU Architectures and Systems · LLM Training/Inference Systems · Machine Learning Systems · Static and Dynamic Performance Analysis

Experience

Samsung — Research Intern 05/2026 – 08/2026

San Jose, CA, USA

Working on hardware/software co-design for MoE models on Samsung's AI accelerator.

ByteDance — Research Intern 05/2023 – 11/2023

Seattle, WA / San Jose, CA, USA

Optimized PyTorch memory management for distributed LLM training, reducing memory usage by 10% to 30% on models including GPT-2 and Whisper.

Uber — Software Engineer Intern 11/2022 – 02/2023

Sunnyvale, CA, USA

Analyzed production Go services and fixed more than 50 data race issues.

PNNL — Research Intern 06/2022 – 08/2022

Richland, WA, USA

Built GPU profiling and floating-point analysis tooling that found critical overflow issues in DOE applications.

Open Source Software

AccelProf github.com/AccelProf

A profiling and analysis framework for various accelerator applications.

DrGPUM github.com/Lin-Mao/DrGPUM

Tooling for guiding memory optimization in GPU-accelerated applications.

Publications

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing

Mao Lin, Xi Wang, Guilherme Cox, Dong Li, and Hyeran Jeon

arXiv preprint, April 2026 (arXiv 2026)

PASTA: A Modular Program Analysis Tool Framework for Accelerators

Mao Lin, Hyeran Jeon, and Keren Zhou

The 23rd ACM/IEEE International Symposium on Code Generation and Optimization, Jan 31–Feb 4, 2026, Sydney, Australia (CGO '26)

Forest: Access-aware GPU UVM Management

Mao Lin, Yuan Feng, Guilherme Cox, and Hyeran Jeon

The 52nd Annual International Symposium on Computer Architecture, Jun 21–25, 2025, Tokyo, Japan (ISCA '25)

Understanding Oversubscribed Memory Management for Deep Learning Training

Mao Lin and Hyeran Jeon

The 5th Workshop on Machine Learning and Systems, Mar 30–Apr 3, 2025, Rotterdam, Netherlands (EuroMLSys '25)

DrGPUM: Guiding Memory Optimization for GPU-accelerated Applications

Mao Lin, Keren Zhou, and Pengfei Su

The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Mar 25–29, 2023, Vancouver, BC, Canada (ASPLOS '23)

A Comprehensive Memory Management Framework for CPU-FPGA Heterogeneous SoCs

Zelin Du, Qianling Zhang, Mao Lin, Shiqing Li, Xin Li, and Lei Ju

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022 (TCAD '22)

Poster: Squeezing GPU Memory Usage in PyTorch

Mao Lin, Keren Zhou, and Pengfei Su

PyTorch Conference, Dec 2022, New Orleans, LA, USA (PyTorch Conference '22)

Talks & Presentations

Forest: Access-aware GPU UVM Management 06/2025

ISCA '25 · Tokyo, Japan

Understanding Oversubscribed Memory Management for Deep Learning Training 03/2025

EuroMLSys '25 · Rotterdam, Netherlands

DrGPUM: Guiding Memory Optimization for GPU-accelerated Applications 03/2023

ASPLOS '23 · Remote

Squeezing GPU Memory Usage in PyTorch 12/2022

Poster · PyTorch Conference '22 · New Orleans, LA, USA

Professional Services

Artifact Evaluation Committee

PPoPP '23 · ASPLOS '24 · ISCA '25 · SOSP '25 · IISWC '25 · EuroSys '26 · ISPASS '26 · MLSys '26 · ISCA '26

Reviewer

GPGPU '25 · ICCAD '26

Teaching Experience

Computer Architecture and Design (EECS 240) 2025 Fall

Guest Lecturer

Computer Architecture (CSE 140) 2024 Spring, 2025 Spring

Teaching Assistant

Intro to Programming Laboratory Skills/Techniques (CSE 022) 2023 Spring

Teaching Assistant

Data Structures (CSE 030) 2022 Spring

Teaching Assistant

Advanced Programming (CSE 024) 2021 Fall, 2024 Fall

Teaching Assistant

Intro to Object-Oriented Programming (CSE 165) 2021 Fall

Teaching Assistant

Technical Skills

Programming Languages

C/C++ · Python · CUDA · Go · Java · Shell

Platforms & Systems

Linux/Windows/macOS · CPU-GPU HMPSoCs · CPU-FPGA HMPSoCs

Frameworks & Libraries

vLLM · PyTorch · TensorFlow

Development Tools

Nsight Systems · Nsight Compute · Linux perf · GDB · Git · CMake · Xilinx Vivado Suite