Publication

*: indicating equal contribution or alphabetic ordering.

Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Ruoyu Sun, Zhi-Quan Luo
The 13th International Conference on Learning Representations (ICLR), 2025

Preserving Diversity in Supervised Fine-tuning of Large Language Models
Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo
The 13th International Conference on Learning Representations (ICLR), 2025

Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu
The 13th International Conference on Learning Representations (ICLR), 2025

Enabling Scalable Oversight via Self-Evolving Critic
Zhengyang Tang*, Ziniu Li*, Zhenyang Xiao*, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin
arXiv:2501.05727

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
Zhengyang Tang*, Ziniu Li*, Zhenyang Xiao*, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin
arXiv:2501.14492

Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu
NeurIPS Workshop on Safe Generative AI, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization
Heshen Zhan, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun
The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP) (Findings), 2024

Sensing Jamming Strategy from Limited Observations: An Imitation Learning Perspective
Youlin Fan, Bo Jiu, Wenqiang Pu, Ziniu Li, Kang Li, Hongwei Liu
IEEE Transactions on Signal Processing (TSP)

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu
arXiv:2405.17039

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
The 41st International Conference on Machine Learning (ICML), 2024

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su
arXiv:2405.16455

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 38, 2024

When is RL better than DPO in RLHF? A Representation and Optimization Perspective
Ziniu Li*, Tian Xu*, Yang Yu
The 12th International Conference on Learning Representations (ICLR) (Tiny Paper Track), 2024

Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Ziniu Li*, Tian Xu*, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 37, 2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
The 39th Conference on Uncertainty in Artificial Intelligence (UAI), 2023

Deploying Offline Reinforcement Learning with Human Feedback
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao
arXiv:2303.07046

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
arXiv:2208.01899

Rethinking ValueDice: Does It Really Improve Performance?
Ziniu Li*, Tian Xu*, Yang Yu, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR) (Blog Track), 2022

A Note on Target Q-learning for Solving Finite MDPs with A Generative Oracle
Ziniu Li*, Tian Xu*, Yang Yu
arXiv:2203.11489

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR), 2022

A Concise Introduction to Imitation Learning (In Chinese)
Tian Xu, Ziniu Li, Yang Yu
Online Available

Error Bounds of Imitating Policies and Environments for Reinforcement Learning
Tian Xu, Ziniu Li, Yang Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Error Bounds of Imitating Policies and Environments
Tian Xu, Ziniu Li, Yang Yu
Conference on Neural Information Processing Systems 34 (NeurIPS), 2020

Efficient Exploration by Novelty-pursuit
Ziniu Li*, Xiong-Hui Chen*
The 2nd International Conference on Distributed Artificial Intelligence (DAI), 2020

Self-Guided Evolution Strategies with Historical Estimated Gradients
Fei-yu Liu, Ziniu Li, Chao Qian
The 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020