In this project, we formulate a swarm-robot region search problem as a multi-armed bandit problem and solve it with the Thompson sampling algorithm.
The search region is bounded and known to the agents. Multiple targets randomly move, appear, or disappear in the region according to a certain probability distribution. The swarm receives a reward whenever one of its agents finds a target.
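
To illustrate the bandit view of the search problem, here is a minimal sketch of Beta-Bernoulli Thompson sampling. It is not the project's actual implementation and makes several simplifying assumptions: the region is split into a fixed set of cells (the arms), each visit yields a binary reward, and the per-cell detection probabilities in `detect_probs` stay constant over time, whereas the targets described above move, appear, and disappear. The function name `thompson_search` and all of its parameters are hypothetical.

```python
import random


def thompson_search(detect_probs, n_agents=3, n_steps=500, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling over region cells.

    detect_probs: assumed (hypothetical) probability of finding a target
                  in each cell on a single visit.
    Returns the visit count per cell and the total reward collected.
    """
    rng = random.Random(seed)
    n_cells = len(detect_probs)

    # Beta(1, 1) prior (uniform) on each cell's detection probability.
    alpha = [1.0] * n_cells
    beta = [1.0] * n_cells
    visits = [0] * n_cells
    total_reward = 0

    for _ in range(n_steps):
        for _ in range(n_agents):
            # Each agent draws one sample per cell from the shared posterior
            # and visits the cell with the largest sampled value.
            samples = [rng.betavariate(alpha[c], beta[c]) for c in range(n_cells)]
            cell = max(range(n_cells), key=samples.__getitem__)

            # Simulated observation: reward 1 if a target is found, else 0.
            reward = 1 if rng.random() < detect_probs[cell] else 0

            # Conjugate posterior update for the visited cell.
            alpha[cell] += reward
            beta[cell] += 1 - reward
            visits[cell] += 1
            total_reward += reward

    return visits, total_reward


if __name__ == "__main__":
    visits, total = thompson_search([0.05, 0.1, 0.8, 0.2])
    print("visits per cell:", visits)
    print("total reward:", total)
```

In this sketch all agents share one posterior and act in sequence within a time step, so the swarm tends to concentrate on cells that have produced rewards while still occasionally sampling the others. Handling moving targets would require a non-stationary variant, for example discounting old observations, which is omitted here.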
Detailed results can be found here: https://ieeexplore.ieee.org/document/9596890