Publications

* indicates equal contribution or alphabetical ordering.

Google Scholar.

2024

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 38, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization
Heshen Zhan, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun
The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP) (Findings), 2024

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity
Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo
arXiv:2408.16673

Sensing Jamming Strategy from Limited Observations: An Imitation Learning Perspective
Youlin Fan, Bo Jiu, Wenqiang Pu, Ziniu Li, Kang Li, Hongwei Liu
IEEE Transactions on Signal Processing

Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun
arXiv:2406.16793

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu
arXiv:2405.17039

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su
arXiv:2405.16455

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
The 41st International Conference on Machine Learning (ICML), 2024
(The early version of this work is at arXiv:2310.10505)

When is RL better than DPO in RLHF? A Representation and Optimization Perspective
Ziniu Li*, Tian Xu*, Yang Yu
The 12th International Conference on Learning Representations (ICLR) (Tiny Paper Track), 2024
(This paper is selected as an oral presentation, with an early version at arXiv:2312.10584)

2023

Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Ziniu Li*, Tian Xu*, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Conference on Neural Information Processing Systems (NeurIPS) 37, 2023
(This paper is selected as a spotlight presentation, with an early version at arXiv:2301.11687)

Provably Efficient Adversarial Imitation Learning with Unknown Transitions
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
The 39th Conference on Uncertainty in Artificial Intelligence (UAI), 2023
(This paper is selected as an oral presentation, with an early version at arXiv:2106.10424v2)

Deploying Offline Reinforcement Learning with Human Feedback
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao
arXiv:2303.07046

2022

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis
Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo
arXiv:2208.01899
(The early version of this work is at arXiv:2106.10424v3)

Rethinking ValueDice: Does It Really Improve Performance?
Ziniu Li*, Tian Xu*, Yang Yu, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR) (Blog Track), 2022

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
The 10th International Conference on Learning Representations (ICLR), 2022
(This work is selected as an oral presentation at the Workshop on Ecological Theory of Reinforcement Learning at NeurIPS, 2021)

2021

A Concise Introduction to Imitation Learning (In Chinese)
Tian Xu, Ziniu Li, Yang Yu
Available online

Error Bounds of Imitating Policies and Environments for Reinforcement Learning
Tian Xu, Ziniu Li, Yang Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

2020

Error Bounds of Imitating Policies and Environments
Tian Xu, Ziniu Li, Yang Yu
Conference on Neural Information Processing Systems (NeurIPS) 34, 2020

Efficient Exploration by Novelty-pursuit
Ziniu Li*, Xiong-Hui Chen*
The 2nd International Conference on Distributed Artificial Intelligence (DAI), 2020

Self-Guided Evolution Strategies with Historical Estimated Gradients
Fei-yu Liu, Ziniu Li, Chao Qian
The 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020

Solving The Inverse Design Problem of Electrical Fuse with Machine Learning
Xinjian Huang, Ziniu Li, Zhiyuan Liu, Bin Xiang, Yingsan Geng, Jianhua Wang
IEEE Access, 8, 74137-74144, 2020

2019

On Value Discrepancy of Imitation Learning
Tian Xu, Ziniu Li, Yang Yu
arXiv:1911.07027