Reinforcement Learning As a Rehearsal For Swarm Foraging

Foraging in robot swarms has been investigated by many researchers; the prevalent techniques are hand-designed algorithms whose parameters are often tuned via machine learning. Our point of departure is one such algorithm, in which we replace a hand-coded decision procedure with reinforcement learning (RL), resulting in significantly superior performance. We situate our approach within the reinforcement learning as a rehearsal (RLaR) framework that we recently introduced. We instantiate RLaR for the foraging problem and show experimentally that a key component of RLaR, a conditional probability distribution function, can be modeled as a unimodal distribution (with a lower memory footprint) despite evidence that it is multimodal. Our experiments also show that the learned behavior has some degree of scalability with respect to variations in the swarm size or the environment.

Swarm Intelligence

