Publications

(2024). Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update. In 12th International Conference on Learning Representations (ICLR 2024) (spotlight).

(2024). Query-Policy Misalignment in Preference-Based Reinforcement Learning. In 12th International Conference on Learning Representations (ICLR 2024) (spotlight).

(2024). OpenChat: Advancing Open-source Language Models with Mixed-Quality Data. In 12th International Conference on Learning Representations (ICLR 2024).

(2023). Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization. International Conference on Autonomous Agents and Multiagent Systems 2023 (AAMAS 2023).

(2022). Model-Based Offline Planning with Trajectory Pruning. International Joint Conference on Artificial Intelligence (IJCAI 2022).

(2021). Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2022).
