AIR-DREAM Lab
Algorithms
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
DOGE marries dataset geometry with deep function approximators in offline RL, enabling exploitation of generalizable out-of-distribution (OOD) regions rather than strictly constraining the policy within the data distribution (see the sketch below).
Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang
PDF · Cite · Code · Project
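The geometry-based constraint described above can be illustrated with a minimal, hypothetical sketch: the policy maximizes the critic while a learned state-conditioned distance to the dataset, rather than a strict behavior-cloning term, keeps its actions near the data manifold. The names `q_net`, `dist_net`, `policy`, `eps`, and `lam` are assumptions made for illustration, not the released DOGE code.

```python
# Minimal sketch of a distance-constrained policy update in the spirit of DOGE,
# assuming a state-conditioned distance network dist_net(s, a) has already been
# fit so that it is small for actions close to the dataset.
import torch

def doge_style_policy_loss(q_net, dist_net, policy, obs, eps=0.1, lam=1.0):
    """Maximize Q while keeping policy actions within a learned distance
    `eps` of the data manifold, instead of a strict in-distribution constraint."""
    act = policy(obs)
    q_term = -q_net(obs, act).mean()                       # exploit the critic
    # Penalize only actions whose learned distance exceeds the threshold,
    # so near-data (generalizable) OOD actions remain admissible.
    geometry_term = torch.relu(dist_net(obs, act) - eps).mean()
    return q_term + lam * geometry_term
```

Under these assumptions, the loss is minimized with a standard optimizer over the policy parameters, with `eps` controlling how far beyond the data the policy may reach.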
An Efficient Multi-Agent Optimization Approach for Coordinated Massive MIMO Beamforming
Beamforming plays an important role in 5G Massive Multiple-Input Multiple-Output (MMIMO) communications. Optimizing beamforming …
Li Jiang, Xiangsen Wang, Aidong Yang, Xidong Wang, Xiaojia Jin, Wei Wang, Xiaozhou Ye, Ye Ouyang, Xianyuan Zhan
PDF · Cite · Project
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
This paper proposes an offline policy optimization approach for imperfect rewards. The reward function is essential in …
Jianxiong Li, Xiao Hu, Haoran Xu, Jingjing Liu, Xianyuan Zhan, Qing-Shan Jia, Ya-Qin Zhang
PDF · Cite · Project
Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization
Offline reinforcement learning (RL) that learns policies from offline datasets without environment interaction has received …
Xiangsen Wang, Xianyuan Zhan
PDF · Cite · Project
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
H2O introduces a dynamics-aware policy evaluation scheme that adaptively penalizes Q-function learning on simulated state-action pairs with large dynamics gaps, while simultaneously allowing learning from a fixed real-world dataset (see the sketch below).
Haoyi Niu, Shubham Sharma, Yiwen Qiu, Ming Li, Guyue Zhou, Jianming Hu, Xianyuan Zhan
PDF · Cite · Code · Project
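The dynamics-aware penalty described above can be illustrated with a short, hypothetical PyTorch sketch: a standard TD loss is computed on the real offline batch, and Q-values on simulated samples are additionally pushed down in proportion to an externally estimated dynamics gap. Everything here (`QNet`, `h2o_style_q_loss`, the batch layouts, `penalty_coef`) is an assumption for illustration, not the authors' released implementation.

```python
# Minimal sketch of dynamics-aware Q penalization in the spirit of H2O,
# assuming a per-transition dynamics-gap weight in [0, 1] has already been
# estimated for simulated data (e.g., via a learned discriminator).
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def h2o_style_q_loss(q, q_target, real_batch, sim_batch,
                     gamma=0.99, penalty_coef=1.0):
    # Bellman backup on the real-world offline batch; next_act would normally
    # come from the current policy evaluated at next_obs.
    obs, act, rew, next_obs, next_act = real_batch
    with torch.no_grad():
        target = rew + gamma * q_target(next_obs, next_act)
    td_loss = ((q(obs, act) - target) ** 2).mean()

    # Push down Q on simulated samples in proportion to the estimated dynamics
    # gap: larger gap means the simulator is less trustworthy there, so the
    # learned value is made more conservative.
    sim_obs, sim_act, gap = sim_batch
    penalty = (gap * q(sim_obs, sim_act)).mean()

    return td_loss + penalty_coef * penalty
```

In this sketch, real transitions drive an ordinary Bellman update while simulated ones contribute only a conservatism term, so the simulator is exploited exactly where its dynamics are believed to match the real system.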
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based and Imitation-based. RL-based methods …
Haoran Xu, Li Jiang, Jianxiong Li, Xianyuan Zhan
PDF · Cite · Code · Project
Discriminator-Guided Model-Based Offline Imitation Learning
Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward …
Wenjia Zhang, Haoran Xu, Haoyi Niu, Peng Cheng, Ming Li, Heming Zhang, Guyue Zhou, Xianyuan Zhan
PDF · Cite · Project
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an optimal expert behavior policy without …
Haoran Xu, Xianyuan Zhan, Honglei Yin, Huiling Qin
PDF · Cite · Code · Project
Model-Based Offline Planning with Trajectory Pruning
Offline reinforcement learning (RL) enables learning policies using pre-collected datasets without environment interaction, which …
Xianyuan Zhan, Xiangyu Zhu, Haoran Xu
PDF · Cite · Code · Project
Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while …
Haoran Xu, Xianyuan Zhan, Xiangyu Zhu
PDF · Cite · Project