TY  - JOUR
AU  - Alves, Hirley
TI  - Multi-Agent Meta-Offline Reinforcement Learning for Timely UAV Path Planning and Data Collection
AB  - Multi-agent reinforcement learning (MARL) has been widely adopted for high-performance computing and complex data-driven decision-making in the wireless domain. However, conventional MARL schemes face several obstacles in real-world deployments. First, most MARL algorithms train online, which can be unsafe and impractical. Second, MARL algorithms are environment-specific, so any change in network configuration requires retraining the model. This letter proposes a novel meta-offline MARL algorithm that combines conservative Q-learning (CQL) with model-agnostic meta-learning (MAML). CQL enables offline training on pre-collected datasets, while MAML provides scalability and fast adaptation to dynamic network configurations and objectives. Two algorithm variants are proposed: independent training (M-I-MARL) and centralized training with decentralized execution (M-CTDE-MARL). Simulation results show that the proposed algorithm outperforms conventional schemes; in particular, the CTDE variant converges 50% faster than the benchmarks in dynamic scenarios. The proposed framework enhances scalability, robustness, and adaptability in wireless communication systems by optimizing UAV trajectories and scheduling policies.
JF  - Computing Research Repository
DO  - 10.48550/arxiv.2501.16098
DA  - 2025-01-27
UR  - https://www.deepdyve.com/lp/arxiv-cornell-university/multi-agent-meta-offline-reinforcement-learning-for-timely-uav-path-I7236Rz0s0
VL  - 2025
IS  - 2501
DP  - DeepDyve
ER  - 