I am a fifth-year Ph.D. student in the Department of Computer Science at the University of Chicago, advised by Prof. Junchen Jiang and Prof. Shan Lu. My research builds efficient inference systems for large language models. In particular, I built CacheGen, the first compression and streaming system designed to reduce the network transmission latency of the KV cache, and DroidSpeak, the first system for translating KV cache between two different LLMs.
I received my B.S. in Computer Science from the University of Wisconsin–Madison, where I was fortunate to be advised by Prof. Shivaram Venkataraman. In Summer 2024, I was a research intern at Microsoft Research, mentored by Madan Musuvathi and Esha Choukse.
* denotes equal contribution. Full list available on Google Scholar.
The first open-source Knowledge Delivery Network for LLM applications, accelerating inference by up to 8× at up to 8× lower cost.
Scale from a single vLLM instance to a distributed deployment without changing a line of application code.