Yifan Yang
I am actively looking for a 2025 summer internship (research or ML engineering) focused on LLM efficiency and general applications. I am also interested in full-time positions starting in 2026. Feel free to reach out and keep in touch about future opportunities!

About me
I am a PhD candidate in the Computer Science Department at UCSB. Prior to UCSB, I received my B.S. in Electronic and Information Engineering from Huazhong University of Science and Technology (HUST). I currently work on the efficient training and inference of Large Language Models (LLMs), including but not limited to parameter-efficient fine-tuning (PEFT), model compression (weight decomposition and pruning), quantization, zeroth-order optimization, and robustness issues arising during efficient training. Before summer 2023, I worked on optimization theory.

News
04/21/2024: Our paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning' is accepted by EMNLP 2024.
04/21/2024: Our paper 'LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models' is selected for oral presentation (top 5%) at NAACL 2024.
03/28/2024: Our paper 'PID Control-Based Self-Healing to Improve the Robustness of Large Language Models' is accepted by TMLR.
03/13/2024: Our paper 'LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models' is accepted by NAACL 2024.
03/08/2024: I will join Amazon AGI for my summer internship, working on inference speed-up for LLMs.
08/01/2023: I started working in the field of Natural Language Processing, focusing on the efficient training of LLMs.

Industrial Experience
Amazon AGI, Applied Scientist Intern (Inclined), Pittsburgh, PA, June 2024 - Sep 2024
Worked on a low-degradation pruning method for inference speed-up of large-scale LLMs.

Preprints
Yifan Yang, Kai Zhen, Denis Filimonov, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Nathan Susanj, Zheng Zhang, Athanasios Mouchtaris, "Wanda++: Pruning Large Language Models via Regional Gradients", under review for NAACL 2025, to be released.
Sajjad Ghiasvand, Yifan Yang, Zhiyu Xue, Mahnoosh Alizadeh, Zheng Zhang, Ramtin Pedarsani, "Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models", under review for NAACL 2025. [arxiv]
Yifan Yang, Alec Koppel, Zheng Zhang, "A Gradient-based Approach for Online Robust Deep Neural Network Training with Noisy Labels". [arxiv]
Yifan Yang, Chang Liu, Zheng Zhang, "Particle-based Online Bayesian Sampling", submitted to Transactions on Machine Learning Research (TMLR). [arxiv]

Publications
Yifan Yang, Kai Zhen, Ershad Banijamali, Athanasios Mouchtaris, Zheng Zhang, "AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning", in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, USA, 2024. [arxiv]
Zhuotong Chen, Zihu Wang, Yifan Yang, Qianxiao Li, Zheng Zhang, "PID Control-Based Self-Healing to Improve the Robustness of Large Language Models", in Transactions on Machine Learning Research (TMLR), 2024.
Yifan Yang, Jiajun Zhou, Ngai Wong, Zheng Zhang, "LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models", in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Oral, top 5%, Mexico City, Mexico, 2024. [arxiv] [code]
Yifan Yang, Lin Chen, Pan Zhou, Xiaofeng Ding, "VFLH: A Following-the-Leader-History Based Algorithm for Adaptive Online Convex Optimization with Stochastic Constraints", in Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2023), Atlanta, USA, 2023. (Best Student Paper Award, top 1%)
Yifan Yang, Jie Xu, Zichuan Xu, Pan Zhou, and Tie Qiu, "Quantile Context-Aware Social IoT Big Data Recommendation with D2D Communication", IEEE Internet of Things Journal 7.6 (2020): 5533-5548.