Has anyone tried implementing the multi-agent RL algorithm MADDPG (I've linked the paper below)? The paper has a good number of citations, and the authors have released their code on GitHub. However, a few people online have mentioned that while the algorithm works fine in the particle environment used in the paper, it does not work in other environments. Has anyone here implemented MADDPG for a different environment and succeeded?
For reference, I am working on a multi-agent reinforcement learning problem with heterogeneous agents.
Paper: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (Lowe et al., 2017)