AIR-DREAM Lab
Algorithms
Data-Driven Decision-Making Algorithms
Developing high-performance, robust, generalizable, and deployable data-driven decision-making algorithms for real-world problems.
Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL
Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by learning policies from pre-collected datasets …
Peng Cheng, Xianyuan Zhan, Zhihao Wu, Wenjia Zhang, Shoucheng Song, Han Wang, Youfang Lin, Li Jiang
Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization
Offline reinforcement learning (RL) has received considerable attention in recent years due to its attractive capability of learning …
Xiangsen Wang, Haoran Xu, Yinan Zheng, Xianyuan Zhan
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Offline-to-online reinforcement learning (RL), by combining the benefits of offline pretraining and online finetuning, promises …
Jianxiong Li, Xiao Hu, Haoran Xu, Jingjing Liu, Xianyuan Zhan, Ya-Qin Zhang
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Based on the Implicit Value Regularization (IVR) framework, we further propose two practical algorithms, Sparse Q-learning (SQL) and Exponential Q-learning (EQL), which adopt the same value regularization used in existing works, but in a completely in-sample manner.
Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Wai Kin Chan, Xianyuan Zhan
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
DOGE marries dataset geometry with deep function approximators in offline RL, and enables exploitation in generalizable OOD areas rather than strictly constraining the policy within the data distribution.
Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang
An Efficient Multi-Agent Optimization Approach for Coordinated Massive MIMO Beamforming
Beamforming plays an important role in 5G Massive Multiple-Input Multiple-Output (MMIMO) communications. Optimizing beamforming …
Li Jiang, Xiangsen Wang, Aidong Yang, Xidong Wang, Xiaojia Jin, Wei Wang, Xiaozhou Ye, Ye Ouyang, Xianyuan Zhan
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
This paper proposes an offline policy optimization approach for imperfect rewards. The reward function is essential in …
Jianxiong Li, Xiao Hu, Haoran Xu, Jingjing Liu, Xianyuan Zhan, Qing-Shan Jia, Ya-Qin Zhang
Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization
Offline reinforcement learning (RL) that learns policies from offline datasets without environment interaction has received …
Xiangsen Wang, Xianyuan Zhan
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
H2O introduces a dynamics-aware policy evaluation scheme that adaptively penalizes Q-function learning on simulated state-action pairs with large dynamics gaps, while simultaneously allowing learning from a fixed real-world dataset.
Haoyi Niu, Shubham Sharma, Yiwen Qiu, Ming Li, Guyue Zhou, Jianming Hu, Xianyuan Zhan