We provide the following three multi-agent extensions to the Soft Actor-Critic (SAC) algorithm.
ISAC follows the independent learners MARL paradigm, MASAC follows the centralised training with decentralised execution paradigm by using a centralised critic during training, and HASAC follows the heterogeneous agent learning paradigm through sequential policy updates. The ff
prefix to the algorithm names indicates that the algorithms use MLP-based policy networks.
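The main structural difference between ISAC and MASAC lies in what the critic conditions on. Below is a minimal sketch, not the library's actual modules, that contrasts the two: `Critic`, `hidden_dim`, and the helper functions are illustrative assumptions, with ISAC's critic seeing only an agent's own observation and action, while MASAC's centralised critic sees the joint observation and joint action of all agents during training.

```python
import jax.numpy as jnp
import flax.linen as nn


class Critic(nn.Module):
    """Simple MLP Q-network: maps an (observation, action) pair to a scalar value.

    Hypothetical sketch; hidden_dim and the layer sizes are illustrative.
    """
    hidden_dim: int = 128

    @nn.compact
    def __call__(self, obs, act):
        x = jnp.concatenate([obs, act], axis=-1)
        x = nn.relu(nn.Dense(self.hidden_dim)(x))
        x = nn.relu(nn.Dense(self.hidden_dim)(x))
        return nn.Dense(1)(x)


# ISAC (independent learners): each agent's critic receives only that
# agent's own observation and action.
def isac_critic_input(agent_obs, agent_act):
    return agent_obs, agent_act


# MASAC (centralised critic): during training the critic receives the
# concatenated observations and actions of all agents.
def masac_critic_input(all_obs, all_acts):
    return jnp.concatenate(all_obs, axis=-1), jnp.concatenate(all_acts, axis=-1)
```

HASAC would reuse the same critic structure but update the agents' policies one after another rather than simultaneously; that sequencing lives in the training loop rather than in the network definitions, so it is not shown here.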