Title

Partitioning and Lifting in Multi-Agent Reinforcement Learning

Description

Many real-world scenarios can be abstracted and modeled as Markov decision processes (MDPs). An MDP places an agent with a set of available actions into a well-defined environment and lets it act there, yielding a theoretical model of the real-world scenario. The agent follows a policy that defines which action to take in which situation; in a probabilistic model, the policy assigns to each state of the environment a probability distribution over the available actions. Solving an MDP means finding a policy that is optimal with respect to a certain goal. This goal is defined implicitly by the reward function, which should reflect which actions are better than others in which states. Reinforcement learning (RL) introduces the idea of letting the agent discover the optimal policy on its own by acting in the environment and learning from the rewards it receives for the actions it takes. Not all systems, however, consist of a single agent: sometimes multiple agents must be placed in an environment and allowed to act and explore in order to find an optimal joint policy. This leads to the topic of multi-agent reinforcement learning (MARL).
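To make the objective concrete, the following is a minimal sketch of the planning view of an MDP: a made-up two-state, two-action MDP is solved with value iteration, yielding a greedy optimal policy. The transition probabilities, rewards, and discount factor are purely illustrative; RL methods would instead estimate such quantities (or the policy directly) from sampled interactions rather than from a known model.

import numpy as np

n_states, gamma = 2, 0.9

# P[s, a, s'] = transition probability, R[s, a] = expected reward (made-up numbers)
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(100):            # iterate the Bellman optimality backup
    Q = R + gamma * (P @ V)     # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)       # greedy (deterministic) optimal policy
print("state values:", V)
print("optimal action per state:", policy)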

Any set can be partitioned into disjoint subsets, and so can a set of agents. As defined by Braun et al., a set of agents can be partitioned such that every agent in one partition class has the same available actions and observations as every other agent in that class and also uses the same policy. This technique is not yet widespread in MARL research, but it has been used in general multi-agent settings. The goal of this thesis is to explore how such type-like information about agents can be exploited in MARL.
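As an illustration of how such a partition could be formed and used, the following sketch groups agents by their observation/action signature and lets every agent of a class act with one shared policy. The agent names, space sizes, and tabular policy representation are hypothetical and not taken from Braun et al.

from collections import defaultdict
import random

# agent id -> (observation space size, action space size); made-up example
agents = {
    "scout_1": (4, 2), "scout_2": (4, 2),
    "worker_1": (6, 3), "worker_2": (6, 3), "worker_3": (6, 3),
}

# Partition the agent set by this signature: agents with the same available
# observations and actions end up in the same (disjoint) class.
partitions = defaultdict(list)
for agent_id, signature in agents.items():
    partitions[signature].append(agent_id)

# One shared, randomly initialised tabular policy per partition class;
# a learning update for any agent of a class would modify this shared table.
shared_policies = {
    (n_obs, n_act): [[random.random() for _ in range(n_act)] for _ in range(n_obs)]
    for (n_obs, n_act) in partitions
}

def act(agent_id, observation):
    # Every agent looks up the policy of its own partition class.
    signature = agents[agent_id]
    prefs = shared_policies[signature][observation]
    return max(range(len(prefs)), key=prefs.__getitem__)

print(act("scout_1", 0), act("worker_3", 5))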

Requirements

Reinforcement learning

Person working on it

Simon Rabich

Category

Master thesis