I am one of the core developers of JuliaPOMDP. The framework is a collection of algorithms and support tools for working with and solving Markov decision processes and their partially observable counterparts. It is written in Julia and is aimed at both novices and experts in probabilistic planning and reinforcement learning.


The multi-agent deep reinforcement learning (MADRL) framework provides a collection of multi-agent reinforcement learning environments, along with multi-agent extensions of DQN, TRPO, and A3C.


Chimp is a modular framework for deep reinforcement learning written in Python. It integrates with several existing deep learning libraries (TensorFlow, Theano, Chainer) and provides example models to get started.


POMDPs.jl: A Framework for Sequential Decision Making under Uncertainty

Maxim Egorov, Zachary Sunberg, Edward Balaban, Tim Wheeler, Jayesh Gupta, and Mykel Kochenderfer
Journal of Machine Learning Research, 2017
POMDPs.jl is an open-source framework for solving Markov decision processes (MDPs) and partially observable MDPs (POMDPs). POMDPs.jl allows users to specify sequential decision making problems with minimal effort without sacrificing the expressive nature of POMDPs, making this framework viable for both educational and research purposes.

Target Surveillance in Adversarial Environments Using POMDPs

Maxim Egorov, Mykel Kochenderfer and Jaak Uudmae
AAAI Conference on Artificial Intelligence (AAAI), 2016
This paper introduces an extension of the target surveillance problem in which the surveillance agent is exposed to an adversarial ballistic threat. The problem is formulated as a partially observable Markov decision process. We evaluate the performance of our algorithm against humans in a target surveillance video game.

Class Projects

Multi-Agent Deep Reinforcement Learning

This work introduces a novel approach for solving reinforcement learning problems in multi-agent settings. We propose a state reformulation of multi-agent problems that allows the system state to be represented in an image-like fashion. We then apply deep reinforcement learning techniques with a convolutional neural network as the Q-value function approximator to learn distributed multi-agent policies.
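The image-like state reformulation can be sketched as follows. The grid size, channel layout, and entity types here are illustrative assumptions, not the exact encoding used in the project:

```python
import numpy as np

def encode_state(grid_size, self_pos, other_agents, obstacles):
    """Encode a multi-agent grid-world state as an image-like tensor.

    Channel layout (an assumed convention for illustration):
      0: the agent's own position
      1: positions of the other agents
      2: static obstacles
    """
    state = np.zeros((3, grid_size, grid_size), dtype=np.float32)
    state[0, self_pos[0], self_pos[1]] = 1.0
    for (r, c) in other_agents:
        state[1, r, c] = 1.0
    for (r, c) in obstacles:
        state[2, r, c] = 1.0
    return state

# Each agent encodes the world from its own perspective, so one shared
# network can be applied to every agent's view.
s = encode_state(5, (2, 2), [(0, 1), (4, 3)], [(1, 1)])
```

Because the state is now a fixed-size multi-channel "image", a convolutional Q-network can map it to one Q-value per action, exactly as in single-agent DQN.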

Deep Reinforcement Learning in POMDPs

This project introduced a novel approach to solving partially observable Markov decision processes using deep Q-networks (DQNs). We demonstrated that DQNs can learn good policies, but require significantly more computation to do so. We also showed that while the Q-values converge, the resulting policies are sensitive to small perturbations and do not converge even after long training cycles.
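The core difficulty the project tackles is that the learner must condition on observations rather than the hidden state. A minimal tabular sketch of this (a toy tiger-style problem of my own construction, not the project's DQN or its benchmarks) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tiger-style POMDP (invented for illustration): the hidden state is
# the tiger's door (0 or 1). Action 2 listens and returns a noisy
# observation of the tiger's door; actions 0/1 open that door and end
# the episode (-100 if the tiger is behind it, +10 otherwise).
def step(state, action):
    if action == 2:  # listen: correct observation with probability 0.85
        obs = state if rng.random() < 0.85 else 1 - state
        return state, obs, -1.0, False
    reward = -100.0 if action == state else 10.0
    return state, rng.integers(2), reward, True

n_obs, n_actions = 2, 3
Q = np.zeros((n_obs, n_actions))       # Q is indexed by observation, not state
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(2000):
    state = rng.integers(2)
    _, obs, _, _ = step(state, 2)      # initial listen to get an observation
    done = False
    while not done:
        if rng.random() < eps:
            a = rng.integers(n_actions)
        else:
            a = int(np.argmax(Q[obs]))
        state, next_obs, r, done = step(state, a)
        target = r if done else r + gamma * Q[next_obs].max()
        Q[obs, a] += alpha * (target - Q[obs, a])
        obs = next_obs
```

Because the learner sees only the latest noisy observation, the best it can do is open the door opposite the last observation; it cannot accumulate evidence across listens. Replacing the table with a neural network, as DQN does, keeps this same observation-conditioned structure.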