I am Shuzheng Si (司书正 in Chinese ✍🏻), a second-year Ph.D. student in the Department of Computer Science and Technology at Tsinghua University. I am lucky to be advised by Prof. Maosong Sun and to be affiliated with the TsinghuaNLP Lab.
My current research interests lie in Natural Language Processing (NLP) and Large Language Models (LLMs), with a focus on Data-centric Methods and Data Science for NLP, including Data Selection, Data Synthesis, and Learning from Noisy Data. My long-term research goal is to elucidate how data influences LLMs and to use these insights to guide the organization, selection, and synthesis of high-quality data, thereby enhancing the foundational capabilities of LLMs (e.g., instruction following, factuality, and faithfulness). You can find my up-to-date publication list on Google Scholar.
Feel free to drop me an email if you are interested in connecting 🧑🏻‍🤝‍🧑🏻.