Paper: Improving Reinforcement Learning-based Autonomous Agents with Causal Models
Abstract: Autonomous Agents trained with Reinforcement Learning (RL) must explore the effects of their actions in different environment states to learn optimal control policies or build a model of such environment. Exploration may be impractical in complex environments, hence ways to prune the exploration space must be found. In this paper, we propose to augment an autonomous agent with a causal model of the core dynamics of its environment, learnt on a simplified version of it and then used as a “driving assistant” for larger or more complex environments. Experiments with different RL algorithms, in increasingly complex environments, and with different exploration strategies, show that learning such a model improves the agent behaviour.
Keywords: Autonomous Agents, Causal Discovery, Reinforcement Learning
Conference Presentation: slides
Maintainer: Giovanni Briglia
Affiliation: Distributed and Pervasive Intelligence Group at University of Modena and Reggio Emilia
Contact: [email protected] and [email protected]
- Create a new Python virtual environment with Python 3.10
- Install the requirements
pip install -r requirements.txt
- Install setup
python install
- Run test example
python3.10 -m scripts/
To compare Vanilla, Causal Offline and Causal Online agents in grid-like environments:
python install
python3.10 -m scripts/launchers/
To compare agents with and without transfer learning in maze-like environments:
python install
python3.10 -m scripts/launchers/
You can extend this work in several directions:
- One direction is to modify the causal discovery algorithms.
- Another direction is to add new kinds of agents (Q-Learning and DQN are currently implemented). To stay consistent with the training class, a new agent must implement the "update_Q_or_memory", "update_exp_fact", "select_action" and "return_q_table" functions. You also need to add your custom label in the "" script.
- A third direction is to test new environments.
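As a rough illustration of the second direction, the sketch below shows a tabular agent that exposes the four function names listed above. Only those four names come from this repository. The class name `SketchAgent`, every argument list, the constructor parameters, and the reading of "update_exp_fact" as an exploration-factor decay are assumptions made for illustration, so check the training class for the signatures it actually calls before adapting this.

```python
import random
from collections import defaultdict


class SketchAgent:
    """Hypothetical agent skeleton; signatures are assumed, not taken from the repo."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration factor (assumed meaning of "exp_fact")
        self.rng = random.Random(seed)
        # One row of action values per visited state, created on first access.
        self.q_table = defaultdict(lambda: [0.0] * n_actions)

    def select_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q_table[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update_Q_or_memory(self, state, action, reward, next_state):
        # One-step Q-learning update; a deep agent would store the transition
        # in its replay memory here instead.
        td_target = reward + self.gamma * max(self.q_table[next_state])
        self.q_table[state][action] += self.alpha * (td_target - self.q_table[state][action])

    def update_exp_fact(self, decay=0.99, minimum=0.01):
        # Multiplicative decay of the exploration factor, bounded from below.
        self.epsilon = max(minimum, self.epsilon * decay)

    def return_q_table(self):
        # Return a plain-dict copy so callers cannot mutate the internal table.
        return {state: list(values) for state, values in self.q_table.items()}
```

The update rule here is ordinary Q-learning, chosen only to keep the example short. A genuinely new agent would replace the bodies of "update_Q_or_memory" and "select_action" while keeping the four names the training class relies on.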
title={Improving Reinforcement Learning-Based Autonomous Agents with Causal Models},
author={Briglia, Giovanni and Lippi, Marco and Mariani, Stefano and Zambonelli, Franco},
booktitle={International Conference on Principles and Practice of Multi-Agent Systems},