About me

I’m a fourth-year PhD student in the Department of CS at the University of Chicago, advised by Prof. Junchen Jiang and Prof. Shan Lu. My research focuses on building efficient inference systems for LLMs. I received my B.S. in CS from the University of Wisconsin-Madison, where I was fortunate to be advised by Prof. Shivaram Venkataraman.

I’m currently working on the following open-source projects:

  • LMCache: The first open-source Knowledge Delivery Network (KDN) that accelerates LLM applications by up to 8x, at 8x lower cost.
  • vLLM production stack: Scale from a single vLLM instance to a distributed vLLM deployment without changing any application code.

Publications

  • DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving
    Yuhan Liu, Yuyang Huang, Jiayi Yao, Shaoting Feng, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse
    NSDI 2026
  • CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
    Yuhan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, Yuyang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang
    SIGCOMM 2024
  • CacheBlend: Fast Large Language Model Serving with Cached Knowledge Fusion
    Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang
    EuroSys 2025 (Best Paper)
  • ChameleonAPI: Automatic and Efficient Customization of Neural Networks for ML Applications
    Yuhan Liu, Chengcheng Wan, Kuntai Du, Henry Hoffmann, Junchen Jiang, Shan Lu, Michael Maire
    OSDI 2024
  • GRACE: Loss-Resilient Real-Time Video through Neural Codecs
    Yihua Cheng, Ziyi Zhang, Hanchen Li, Anton Arapin, Yue Zhang, Qizheng Zhang, Yuhan Liu, Kuntai Du, Xu Zhang, Francis Y. Yan, Amrita Mazumdar, Nick Feamster, Junchen Jiang
    NSDI 2024
  • Keeper: Automated Testing and Fixing of Machine Learning Software
    Chengcheng Wan, Shicheng Liu, Sophie Xie, Yuhan Liu, Henry Hoffmann, Michael Maire, Shan Lu
    TOSEM 2024
  • OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
    Kuntai Du, Yuhan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, Yuyang Huang, Ganesh Ananthanarayanan, Junchen Jiang
    SoCC 2023
  • Run-Time Prevention of Software Integration Failures of Machine Learning APIs
    Chengcheng Wan, Yuhan Liu, Kuntai Du, Henry Hoffmann, Junchen Jiang, Michael Maire, Shan Lu
    OOPSLA 2023

Workshops

  • Chatterbox: Robust Transport for LLM Token Streaming under Unstable Network
    Hanchen Li, Yuhan Liu, Yihua Cheng, Siddhant Ray, Kuntai Du, Junchen Jiang
    SIGCOMM Workshop on Networks for AI Computing (NAIC)

Preprints

  • AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
    Yuhan Liu, Saurabh Agarwal, Shivaram Venkataraman
  • Accelerating Deep Learning Inference via Learned Caches
    Arjun Balasubramanian, Adarsh Kumar, Yuhan Liu, Han Cao, Shivaram Venkataraman, Aditya Akella

Invited Talks

  • Distributed Systems Lab @ University of Pennsylvania, Nov. 2024
  • System Group @ Duke University, Nov. 2024
  • ML System Group @ UCSD, March 2025
  • System Group @ University of Maryland, March 2025
  • Guest Lecture at Large Language Model Systems Class @ CMU, April 2025
  • Efficient AI Seminar @ Rutgers University, May 2025

Teaching

  • Graduate Networking (CMSC 33300), Teaching Assistant, Autumn 2024
  • Intro to Computer Systems (CMSC 15400), Teaching Assistant, Winter 2022
  • Intro to Database Systems (CS 564), Peer Mentor, Fall 2020 (at Madison)

Awards

  • EuroSys Best Paper Award (2025)
  • UU Fellowship (2023): University of Chicago fellowship
  • Neubauer Graduate Scholarship (2021): University of Chicago fellowship
  • Computing Research Association Outstanding Undergraduate Researcher Awards (2021): Honorable Mention
  • Trewartha Honors Senior Thesis Award (2020): research grant for seniors carrying out honors thesis research in CS

Work Experience

  • Microsoft Research, Summer 2024
    Research Intern
    Mentors: Madan Musuvathi, Esha Choukse, Shan Lu

Mentored Students

  • Hanchen Li, University of Chicago, June 2023 – Present, to PhD at University of California, Berkeley
  • Zhuohan Gu, University of Chicago, Jan. 2024 – Present, to PhD at Massachusetts Institute of Technology
  • Shaoting Feng, University of Chicago, Sept. 2024 – Present
  • Ashton Tang, University of Chicago, June 2024 – Sept. 2024

Service

  • Co-Chair of Graduate Women in Computer Science (GWiCS) at the University of Chicago, Fall 2024 – Summer 2025
  • Reviewer for NeurIPS 2022, ICML 2022

Contact

yuhanl[at]uchicago.edu