Research
My research interests lie in machine learning and computer systems,
with the goal of enabling advanced machine learning across a wide range of applications.
Currently, I am working on the efficient inference of language models.
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen
Scalable tree-based speculative decoding.
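To make the idea concrete, here is a minimal sketch of the draft-a-tree-then-verify loop behind tree-based speculative decoding. The toy models, vocabulary, and greedy acceptance rule are illustrative assumptions, not Sequoia's actual algorithm or API.

```python
# Minimal sketch of tree-based speculative decoding with toy stand-in
# models; names and the acceptance rule are illustrative assumptions.
import random

VOCAB = list(range(100))

def toy_probs(prefix, salt):
    # Hypothetical stand-in for a language model: a deterministic
    # pseudo-random next-token distribution conditioned on the prefix.
    rng = random.Random(hash((salt, tuple(prefix))) & 0xFFFFFFFF)
    weights = [rng.random() for _ in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]

def grow_tree(prefix, depth, branch):
    """Draft a token tree: expand the `branch` most likely draft-model
    tokens at each node, up to `depth` levels."""
    if depth == 0:
        return []
    probs = toy_probs(prefix, 0)  # salt 0 = "draft model"
    top = sorted(VOCAB, key=lambda t: -probs[t])[:branch]
    return [(tok, grow_tree(prefix + [tok], depth - 1, branch)) for tok in top]

def verify(prefix, tree):
    """Greedy verification: follow the branch whose token matches the
    target model's argmax; stop at the first mismatch."""
    accepted, children = [], tree
    while children:
        probs = toy_probs(prefix + accepted, 1)  # salt 1 = "target model"
        best = max(VOCAB, key=lambda t: probs[t])
        hit = next(((tok, sub) for tok, sub in children if tok == best), None)
        if hit is None:
            break
        accepted.append(hit[0])
        children = hit[1]
    return accepted

prefix = [1, 2, 3]
tree = grow_tree(prefix, depth=3, branch=2)
print("accepted tokens:", verify(prefix, tree))
```

Because the tree covers several candidate continuations at once, a single verification pass by the target model can accept more than one token per step.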
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
Supports large language model serving with tree-based speculative inference and verification.
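A key ingredient in verifying a whole token tree in one forward pass is a tree attention mask, where each drafted token attends only to its own ancestors. The sketch below builds such a mask from parent pointers; the node layout and names are assumptions for illustration, not SpecInfer's implementation.

```python
# Sketch: a tree attention mask for verifying a drafted token tree in
# a single batched forward pass. Node layout is an assumed example.
import numpy as np

# Each drafted node stores its parent's index (-1 for the tree root).
parents = [-1, 0, 0, 1, 1, 2]  # a small example tree of 6 nodes

def tree_attention_mask(parents):
    """mask[i][j] is True iff node j is node i itself or an ancestor
    of node i, so each token attends only to tokens on its branch."""
    n = len(parents)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        j = i
        while j != -1:
            mask[i, j] = True
            j = parents[j]
    return mask

print(tree_attention_mask(parents).astype(int))
```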
GNNPipe: Scaling Deep GNN Training with Pipelined Model Parallelism
Jingji Chen, Zhuoming Chen, Xuehai Qian
Accelerates GNN training with pipelined model parallelism by reducing communication overhead and improving GPU utilization.
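As a rough illustration of pipelined model parallelism, the sketch below simulates a GPipe-style forward schedule: the model's layers are split into stages, and micro-batches stream through them so different stages work on different micro-batches concurrently. The stage count, micro-batch chunks, and schedule are illustrative assumptions, not GNNPipe's actual scheduler.

```python
# Minimal sketch of a pipelined model-parallel schedule: stages hold
# disjoint layer groups, and micro-batches flow through the stages so
# the stages stay busy in parallel. Names here are illustrative only.

NUM_STAGES = 3        # e.g., a deep GNN's layers split across 3 GPUs
NUM_MICROBATCHES = 5  # chunks of the training batch fed in sequence

# At time step t, stage s processes micro-batch m = t - s (GPipe-style
# forward schedule); print the resulting per-step stage occupancy.
for t in range(NUM_STAGES + NUM_MICROBATCHES - 1):
    row = []
    for s in range(NUM_STAGES):
        m = t - s
        row.append(f"stage{s}:mb{m}" if 0 <= m < NUM_MICROBATCHES
                   else f"stage{s}:idle")
    print(" | ".join(row))
```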
Quantized Training of Gradient Boosting Decision Trees
Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Accelerates GBDT training through low-bit quantization (implemented in LightGBM).
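To illustrate the core trick, the sketch below stochastically rounds gradients onto a low-bit integer grid so that histogram accumulation can use cheap integer arithmetic; the rounding is unbiased in expectation. The bit width, scale, and function names are assumptions for the example, not LightGBM's API.

```python
# Sketch: unbiased low-bit quantization of gradients via stochastic
# rounding. Grid size and names are illustrative assumptions.
import math
import random

def quantize_gradient(g, scale, rng=random):
    """Stochastically round g / scale to a neighboring integer so the
    quantized value is unbiased: E[q] * scale == g."""
    x = g / scale
    lo = math.floor(x)
    frac = x - lo
    return lo + (1 if rng.random() < frac else 0)

# Example: a 3-bit signed grid for gradients in [-1, 1].
grads = [0.73, -0.12, 0.05, -0.91]
scale = 1.0 / 4
qgrads = [quantize_gradient(g, scale) for g in grads]
print("quantized ints:", qgrads)
print("dequantized:   ", [q * scale for q in qgrads])
```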