Publications
*: indicates equal contribution or alphabetical ordering.

Enabling Scalable Oversight via Self-Evolving Critic
Zhengyang Tang*, Ziniu Li*, Zhenyang Xiao*, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin
arXiv:2501.05727

Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu
NeurIPS Workshop on Safe Generative AI, 2024

Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu
NeurIPS Workshop on Adaptive Foundation Models, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization
Heshen Zhan, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun
The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP) (Findings), 2024

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity
Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo
arXiv:2408.16673

Sensing Jamming Strategy from Limited Observations: An Imitation Learning Perspective
Youlin Fan, Bo Jiu, Wenqiang Pu, Ziniu Li, Kang Li, Hongwei Liu
IEEE Transactions on Signal Processing (TSP)

Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun
arXiv:2406.16793

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu
arXiv:2405.17039

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
The 41st International Conference on Machine Learning (ICML), 2024

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su
arXiv:2405.16455

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 38, 2024

When is RL better than DPO in RLHF? A Representation and Optimization Perspective
Ziniu Li*, Tian Xu*, Yang Yu
The 12th International Conference on Learning Representations (ICLR) (Tiny Paper Track), 2024

Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Ziniu Li*, Tian Xu*, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 37, 2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
The 39th Conference on Uncertainty in Artificial Intelligence (UAI), 2023

Deploying Offline Reinforcement Learning with Human Feedback
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao
arXiv:2303.07046

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
arXiv:2208.01899

Rethinking ValueDice: Does It Really Improve Performance?
Ziniu Li*, Tian Xu*, Yang Yu, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR) (Blog Track), 2022

A Note on Target Q-learning for Solving Finite MDPs with A Generative Oracle
Ziniu Li*, Tian Xu*, Yang Yu
arXiv:2203.11489

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR), 2022

A Concise Introduction to Imitation Learning (In Chinese)
Tian Xu, Ziniu Li, Yang Yu
Available online

Error Bounds of Imitating Policies and Environments for Reinforcement Learning
Tian Xu, Ziniu Li, Yang Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Error Bounds of Imitating Policies and Environments
Tian Xu, Ziniu Li, Yang Yu
Conference on Neural Information Processing Systems (NeurIPS) 34, 2020

Efficient Exploration by Novelty-pursuit
Ziniu Li*, Xiong-Hui Chen*
The 2nd International Conference on Distributed Artificial Intelligence (DAI), 2020

Self-Guided Evolution Strategies with Historical Estimated Gradients
Fei-yu Liu, Ziniu Li, Chao Qian
The 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020