Pruning for Monte Carlo Distributed Reinforcement Learning in Decentralized POMDPs

Document Type

Conference Proceeding

Publication Date

12-1-2013

School

Computing Sciences and Computer Engineering

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a powerful modeling technique for realistic multi-agent coordination problems under uncertainty. Prevalent solution techniques are centralized and assume prior knowledge of the model. Recently, a Monte Carlo-based distributed reinforcement learning approach was proposed, in which agents take turns learning best responses to each other's policies. This promotes decentralization of the policy computation problem and relaxes reliance on full knowledge of the problem parameters. However, this Monte Carlo approach has a large sample complexity, which we address in this paper. In particular, we propose and analyze a modified version of the previous algorithm that adaptively eliminates parts of the experience tree from further exploration, thus requiring fewer samples while ensuring unchanged confidence in the learned value function. Experiments demonstrate a significant reduction in sample complexity, with maximum reductions ranging from 61% to 91% across different benchmark Dec-POMDP problems, and the final policies are often better due to more focused exploration. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
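
The sketch below illustrates only the general idea referenced in the abstract: adaptively eliminating branches of an experience tree once their estimated value is confidently dominated, so that further samples concentrate on the remaining branches. It is not the paper's algorithm; the class name, the Hoeffding-style bound, and parameters such as delta and value_range are illustrative assumptions.

```python
# Illustrative sketch only (assumed names and bound, not the paper's method):
# eliminate action branches at one node of an experience tree when their upper
# confidence bound falls below the best branch's lower confidence bound.
import math
from collections import defaultdict

class NodePruner:
    """Per-action return statistics at a single tree node, with pruning."""

    def __init__(self, actions, delta=0.05, value_range=1.0):
        self.delta = delta                # failure probability for the bounds
        self.value_range = value_range    # assumed range of sampled returns
        self.active = set(actions)        # branches still open to exploration
        self.counts = defaultdict(int)
        self.sums = defaultdict(float)

    def update(self, action, sampled_return):
        # Record one Monte Carlo return sampled under this action branch.
        self.counts[action] += 1
        self.sums[action] += sampled_return

    def _mean(self, a):
        return self.sums[a] / self.counts[a]

    def _radius(self, a):
        # Hoeffding confidence radius for returns bounded by value_range.
        n = self.counts[a]
        return self.value_range * math.sqrt(math.log(2.0 / self.delta) / (2.0 * n))

    def prune(self):
        """Discard branches whose upper bound is below the best lower bound."""
        sampled = [a for a in self.active if self.counts[a] > 0]
        if len(sampled) < 2:
            return
        best_lower = max(self._mean(a) - self._radius(a) for a in sampled)
        for a in list(self.active):
            if self.counts[a] > 0 and self._mean(a) + self._radius(a) < best_lower:
                self.active.discard(a)    # eliminated from further sampling
```

In this reading, sampling effort at each node is spent only on actions remaining in the active set, which is how pruning reduces overall sample complexity while the confidence parameter keeps the value estimates' guarantees unchanged.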

Publication Title

Proceedings of the 27th AAAI Conference on Artificial Intelligence, AAAI 2013

First Page

88

Last Page

94
