About Me
I am a first-year Ph.D. student in the CoAI Group, Dept. of Computer Science and Technology, Tsinghua University, advised by Prof. Minlie Huang. My research interests lie in LLM safety and trustworthiness. Recently, I have been working on the mechanisms of safety alignment, hallucination, and the knowledge boundaries of LRMs.
News
- 🎉 Our papers "LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety" and "How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study" are accepted by ACL 2026.
- 🎉 Our papers "BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs" and "Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!" are accepted by ICLR 2026.
- 🎉 We built Open Cowork, an open-source Claude Cowork for Windows & macOS. (code)
Working Experiences
- Research intern at A*STAR’s Centre for Frontier AI Research (CFAR), from Feb 2025 to May 2025, under the supervision of Prof. Yew-Soon Ong.
Publications
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
Agent-SafetyBench: Evaluating the Safety of LLM Agents
Open Source Projects
Teaching
I was a TA for the following undergraduate courses:
- Artificial Neural Network (2024 Fall, 2025 Fall)
- Linear Algebra (2024 Fall)
Honors and Awards
- Excellent Graduate, Tsinghua University, 2025
- 3rd Prize Winner of the Global Challenge for Safe and Secure LLMs (Track 1)
- Academic Excellence in Research Award of Tsinghua University, 2023.09-2024.09
- Meritorious Winner, Mathematical Contest in Modeling, 2023
- Comprehensive Scholarship of Tsinghua University, 2022.09-2023.09
- Comprehensive Scholarship of Tsinghua University, 2021.09-2022.09
Education
- 2025.09-now, Tsinghua University, Beijing, China. Ph.D. Student.
- 2021.09-2025.06, Tsinghua University, Beijing, China. Undergraduate Student.
- 2018.09-2021.06, Urumqi No.1 Senior High School, Xinjiang, China. High School Student.
