TY - JOUR
AU - Wu, Bohan
AU - Gupta, Jayesh K.
AU - Kochenderfer, Mykel
AB - Learning interpretable and transferable subpolicies and performing task decomposition from a single, complex task is difficult. Such decomposition can lead to immense gains in sample efficiency for lifelong learning. Some traditional hierarchical reinforcement learning techniques enforce this decomposition in a top-down manner, while meta-learning techniques require a task distribution at hand to learn such decompositions. This article presents a framework for using diverse suboptimal world models to decompose complex task solutions into simpler modular subpolicies. Given these world models, the framework decomposes a single source task in a bottom-up manner, concurrently learning the required modular subpolicies as well as a controller to coordinate them. We perform a series of experiments on high-dimensional continuous-action control tasks to demonstrate the effectiveness of this approach at both complex single-task learning and lifelong learning. Finally, we perform ablation studies to understand the importance and robustness of different elements of the framework and the limitations of this approach.
TI - Model primitives for hierarchical lifelong reinforcement learning
JF - Autonomous Agents and Multi-Agent Systems
DO - 10.1007/s10458-020-09451-0
DA - 2020-02-25
UR - https://www.deepdyve.com/lp/springer-journals/model-primitives-for-hierarchical-lifelong-reinforcement-learning-8UFlhwt2Y7
VL - 34
IS - 1
DP - DeepDyve
ER -