Yifan Yang
I am actively looking for a 2025 summer internship (research or ML engineering) focused on LLM efficiency and general applications. I am also interested in full-time positions starting in 2026. Feel free to reach out and keep in touch about future opportunities!

About me
I am a PhD candidate in the Computer Science Department at UCSB. Prior to UCSB, I received my B.S. in Electronic and Information Engineering from Huazhong University of Science and Technology (HUST). I currently work on the efficient training and inference of Large Language Models (LLMs), including but not limited to parameter-efficient fine-tuning (PEFT), model compression (weight decomposition and pruning), quantization, zeroth-order optimization, and robustness issues arising during efficient training. Before summer 2023, I worked on optimization theory.

News
04/21/2024: Our paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning' is accepted by EMNLP 2024.
04/21/2024: Our paper 'LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models' is selected for oral presentation (top 5%) at NAACL 2024.
03/28/2024: Our paper 'PID Control-Based Self-Healing to Improve the Robustness of Large Language Models' is accepted by TMLR.
03/13/2024: Our paper 'LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models' is accepted by NAACL 2024.
03/08/2024: I will join Amazon AGI for my summer internship, working on inference speed-up for LLMs.
08/01/2023: I started working in the field of Natural Language Processing, focusing on the efficient training of LLMs.

Industrial Experience
Amazon AGI, Applied Scientist Intern (Inclined), Pittsburgh, PA, June 2024 - Sep 2024
Worked on a low-degradation pruning method for inference speed-up of large-scale LLMs.

Preprints
Yifan Yang, Kai Zhen, Denis Filimonov, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Nathan Susanj, Zheng Zhang, Athanasios Mouchtaris, "Wanda++: Pruning Large Language Models via Regional Gradients", under review for NAACL 2025, to be released.
Sajjad Ghiasvand, Yifan Yang, Zhiyu Xue, Mahnoosh Alizadeh, Zheng Zhang, Ramtin Pedarsani, "Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models", under review for NAACL 2025. [arxiv]
Yifan Yang, Alec Koppel, Zheng Zhang, "A Gradient-based Approach for Online Robust Deep Neural Network Training with Noisy Labels". [arxiv]
Yifan Yang, Chang Liu, Zheng Zhang, "Particle-based Online Bayesian Sampling", submitted to Transactions on Machine Learning Research (TMLR). [arxiv]

Publications
Yifan Yang, Kai Zhen, Ershad Banijamali, Athanasios Mouchtaris, Zheng Zhang, "AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning", in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, USA, 2024. [arxiv]
Zhuotong Chen, Zihu Wang, Yifan Yang, Qianxiao Li, Zheng Zhang, "PID Control-Based Self-Healing to Improve the Robustness of Large Language Models", in Transactions on Machine Learning Research (TMLR), 2024.
Yifan Yang, Jiajun Zhou, Ngai Wong, Zheng Zhang, "LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models", in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Oral, top 5%, Mexico City, Mexico, 2024. [arxiv] [code]
Yifan Yang, Lin Chen, Pan Zhou, Xiaofeng Ding, "VFLH: A Following-the-Leader-History Based Algorithm for Adaptive Online Convex Optimization with Stochastic Constraints", in Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2023), Atlanta, USA, 2023. (Best Student Paper Award, top 1%)
Yifan Yang, Jie Xu, Zichuan Xu, Pan Zhou, and Tie Qiu, "Quantile Context-Aware Social IoT Big Data Recommendation with D2D Communication", IEEE Internet of Things Journal 7.6 (2020): 5533-5548.