Sample Efficient Offline RL via T-Symmetry Enforced Latent State-Stitching

Abstract

Offline reinforcement learning (RL) has achieved significant progress in recent years. However, most existing offline RL methods require a large amount of training data to achieve reasonable performance, and offer limited generalizability in out-of-distribution (OOD) regions due to conservative data-related regularizations. This seriously hinders the applicability of offline RL to many real-world problems, where the available data are often limited. In this study, we introduce a highly sample-efficient offline RL algorithm that enables state-stitching in a compact latent space regulated by the fundamental time-reversal symmetry (T-symmetry) of dynamical systems. Specifically, we learn a T-symmetry enforced inverse dynamics model (TS-IDM) to derive well-regulated latent state representations that greatly facilitate OOD generalization. A guide-policy can then be learned entirely in the latent space to output the next state that maximizes the reward, bypassing the conservative action-level behavior constraints adopted in most offline RL methods. Finally, the optimized action can be extracted by using the guide-policy's output as the goal state in the learned TS-IDM. We call our method Offline RL via T-symmetry Enforced Latent State-Stitching (TELS). TELS achieves remarkable sample efficiency and OOD generalizability, significantly outperforming existing offline RL methods on a wide range of challenging small-sample tasks, even when using as few as 1% of the samples in the D4RL datasets.
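The abstract describes a three-step pipeline: learn a T-symmetry enforced inverse dynamics model (TS-IDM), train a guide-policy entirely in the latent space, and recover actions by feeding the guide-policy's output as the goal state into the inverse dynamics head. The PyTorch sketch below illustrates one plausible way these pieces could fit together; the module layout, latent dimension, loss terms, and the `mlp`/`TSIDM` names are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch of the TELS inference pipeline, assuming an MLP-based
# TS-IDM with a forward/reverse latent dynamics pair. All names, shapes,
# and loss weightings are hypothetical, chosen only for illustration.
import torch
import torch.nn as nn


def mlp(in_dim, out_dim, hidden=256):
    """Small fully connected network used for every component below."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


class TSIDM(nn.Module):
    """T-symmetry enforced inverse dynamics model (hypothetical layout).

    States are encoded into a compact latent space; an inverse dynamics
    head predicts the action connecting two latent states. T-symmetry is
    encouraged by requiring the reverse latent dynamics to mirror the
    forward latent dynamics (their predicted latent deltas should cancel).
    """

    def __init__(self, state_dim, action_dim, latent_dim=32):
        super().__init__()
        self.encoder = mlp(state_dim, latent_dim)
        self.fwd_dyn = mlp(latent_dim + action_dim, latent_dim)  # (z, a) -> dz
        self.rev_dyn = mlp(latent_dim + action_dim, latent_dim)  # (z', a) -> -dz
        self.inv_dyn = mlp(2 * latent_dim, action_dim)           # (z, z') -> a

    def loss(self, s, a, s_next):
        z, z_next = self.encoder(s), self.encoder(s_next)
        dz_fwd = self.fwd_dyn(torch.cat([z, a], dim=-1))
        dz_rev = self.rev_dyn(torch.cat([z_next, a], dim=-1))
        a_hat = self.inv_dyn(torch.cat([z, z_next], dim=-1))
        recon = ((z + dz_fwd - z_next) ** 2).mean()  # forward prediction
        t_sym = ((dz_fwd + dz_rev) ** 2).mean()      # time-reversal symmetry
        inverse = ((a_hat - a) ** 2).mean()          # inverse dynamics fit
        return recon + t_sym + inverse


# At deployment time, a latent guide-policy proposes the next latent state,
# and the action is extracted via the inverse dynamics head. The guide
# below is an untrained placeholder standing in for a policy that would be
# trained offline to maximize reward in the latent space.
state_dim, action_dim, latent_dim = 17, 6, 32  # illustrative dimensions
model = TSIDM(state_dim, action_dim, latent_dim)
guide = mlp(latent_dim, latent_dim)            # z -> desired next z

s = torch.randn(1, state_dim)
z = model.encoder(s)
z_goal = guide(z)                                          # latent goal state
action = model.inv_dyn(torch.cat([z, z_goal], dim=-1))     # extracted action
```

Because the guide-policy operates purely on latent states, this construction sidesteps action-level behavior constraints: conservatism, to whatever extent it is needed, can live in how the latent goal is chosen rather than in how actions are penalized.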

Publication
In the 14th International Conference on Learning Representations (ICLR 2026)
Peng Cheng
Research Intern
Jianxiong Li
PhD Candidate
Ziteng He
PhD student at Tsinghua University
Haoran Xu
PhD student at UT Austin, USA
Xianyuan Zhan
Faculty Member