Hi! I am a third-year PhD student at the Language Technologies Institute (LTI), Carnegie Mellon University (CMU), advised by Prof. Chenyan Xiong. I was fortunate to work with Dr. Scott Yih during my internship at Meta (2024). My primary research interests are:
  • Intelligent and efficient LLM scaling with novel pretraining data curation and synthesis methods.
  • Data valuation and influence attribution to better capture the impact of LLM training data.
Previously, I graduated from Tsinghua University in 2023 with an honours degree in Computer Science and Technology. I was honored to be a member of THUNLP, advised by Prof. Zhiyuan Liu, working closely with Dr. Tianyu Gao and Dr. Zhengyan Zhang on efficient few-shot learning.

When I am not doing research, I like to work out, play guitar, and watch movies.

Updates: