Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You and Your Team.

Learn More →

A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint

A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint In a teacher-student framework, a more experienced agent (teacher) helps accelerate the learning of another agent (student) by suggesting actions to take in certain states. In cooperative multi-agent reinforcement learning (MARL), where agents must cooperate with one another, a student could fail to cooperate effectively with others even by following a teacher’s suggested actions, as the policies of all agents can change before convergence. When the number of times that agents communicate with one another is limited (i.e., there are budget constraints), an advising strategy that uses actions as advice could be less effective. We propose a partaker-sharer advising framework (PSAF) for cooperative MARL agents learning with budget constraints. In PSAF, each Q-learner can decide when to ask for and share its Q-values. We perform experiments in three typical multi-agent learning problems. The evaluation results indicate that the proposed PSAF approach outperforms existing advising methods under both constrained and unconstrained budgets. Moreover, we analyse the influence of advising actions and sharing Q-values on agent learning. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png ACM Transactions on Autonomous and Adaptive Systems (TAAS) Association for Computing Machinery

A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint

Loading next page...
 
/lp/association-for-computing-machinery/a-q-values-sharing-framework-for-multi-agent-reinforcement-learning-80IdIa4imR
Publisher
Association for Computing Machinery
Copyright
Copyright © 2021 ACM
ISSN
1556-4665
eISSN
1556-4703
DOI
10.1145/3447268
Publisher site
See Article on Publisher Site

Abstract

In a teacher-student framework, a more experienced agent (teacher) helps accelerate the learning of another agent (student) by suggesting actions to take in certain states. In cooperative multi-agent reinforcement learning (MARL), where agents must cooperate with one another, a student could fail to cooperate effectively with others even by following a teacher’s suggested actions, as the policies of all agents can change before convergence. When the number of times that agents communicate with one another is limited (i.e., there are budget constraints), an advising strategy that uses actions as advice could be less effective. We propose a partaker-sharer advising framework (PSAF) for cooperative MARL agents learning with budget constraints. In PSAF, each Q-learner can decide when to ask for and share its Q-values. We perform experiments in three typical multi-agent learning problems. The evaluation results indicate that the proposed PSAF approach outperforms existing advising methods under both constrained and unconstrained budgets. Moreover, we analyse the influence of advising actions and sharing Q-values on agent learning.

Journal

ACM Transactions on Autonomous and Adaptive Systems (TAAS)Association for Computing Machinery

Published: Apr 19, 2021

Keywords: Multi-agent reinforcement learning

References