All Publications

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing

Mao Lin, Xi Wang, Guilherme Cox, Dong Li, and Hyeran Jeon

arXiv preprint, April 2026 (arXiv 2026)

PASTA: A Modular Program Analysis Tool Framework for Accelerators

Mao Lin, Hyeran Jeon, and Keren Zhou

The 23rd ACM/IEEE International Symposium on Code Generation and Optimization, January 31–February 4, 2026, Sydney, Australia (CGO '26)

Forest: Access-aware GPU UVM Management

Mao Lin, Yuan Feng, Guilherme Cox, and Hyeran Jeon

The 52nd Annual International Symposium on Computer Architecture, June 21–25, 2025, Tokyo, Japan (ISCA '25)

Understanding Oversubscribed Memory Management for Deep Learning Training

Mao Lin and Hyeran Jeon

The 5th Workshop on Machine Learning and Systems, March 30–April 3, 2025, Rotterdam, Netherlands (EuroMLSys '25)

DrGPUM: Guiding Memory Optimization for GPU-accelerated Applications

Mao Lin, Keren Zhou, and Pengfei Su

The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, March 25–29, 2023, Vancouver, BC, Canada (ASPLOS '23)

A Comprehensive Memory Management Framework for CPU-FPGA Heterogeneous SoCs

Zelin Du, Qianling Zhang, Mao Lin, Shiqing Li, Xin Li, and Lei Ju

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022 (TCAD '22)

Poster: Squeezing GPU Memory Usage in PyTorch

Mao Lin, Keren Zhou, and Pengfei Su

December 2022, New Orleans, LA, USA (PyTorch Conference '22)