Update links for the CPM road traffic scenario (#150)

proroklab · Nov 27, 2024 · 1d9e926
1 parent 2ef3b7b
commit 1d9e926

Showing 2 changed files with 2 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -387,7 +387,7 @@ To create a fake screen you need to have `Xvfb` installed.
 | `navigation.py`         | Randomly spawned agents need to navigate to their goal. Collisions can be turned on and agents can use LIDARs to avoid running into each other. Rewards can be shared or individual. Apart from position, velocity, and lidar readings, each agent can be set up to observe just the relative distance to its goal, or its relative distance to *all* goals (in this case the task needs heterogeneous behavior to be solved). The scenario can also be set up so that multiple agents share the same goal.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | <img src="https://github.com/matteobettini/vmas-media/blob/main/media/scenarios/navigation.gif?raw=true" alt="drawing" width="300"/>           |
 | `sampling.py`           | `n_agents` are spawned randomly in a workspace with an underlying gaussian density function composed of `n_gaussians` modes. Agents need to collect samples by moving in this field. The field is discretized to a grid and once an agent visits a cell its sample is collected without replacement and given as reward to the whole team (or just to the agent if `shared_rew=False`). Agents can use a lidar to sens each other. Apart from lidar, position and velocity observations, each agent observes the values of samples in the 3x3 grid around it.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | <img src="https://github.com/matteobettini/vmas-media/blob/main/media/scenarios/sampling.gif?raw=true" alt="drawing" width="300"/>             |
 | `wind_flocking.py`      | Two agents need to flock at a specified distance northwards. They are rewarded for their distance and the alignment of their velocity vectors to the reference. The scenario presents wind from north to south. The agents present physical heterogeneity: the smaller one has some aerodynamical properties and can shield the bigger one from wind, thus optimizing the flocking performance. Thus, the optimal solution to this task consists in the agents performing heterogeneous wind shielding. See the [SND paper](https://matteobettini.github.io/publication/system-neural-diversity-measuring-behavioral-heterogeneity-in-multi-agent-learning/) for more info.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | <img src="https://github.com/matteobettini/vmas-media/blob/main/media/scenarios/wind_flocking.gif?raw=true" alt="drawing" width="300"/>        |
-| `road_traffic.py`       | This scenario provides a MARL benchmark for Connected and Automated Vehicles (CAVs) using a High-Definition (HD) map from the Cyber-Physical Mobility Lab ([CPM Lab](https://cpm.embedded.rwth-aachen.de/)), an open-source testbed for CAVs. The map features an eight-lane intersection and a loop-shaped highway with multiple merge-in and -outs, offering a range of challenging traffic conditions. Forty loop-shaped reference paths are predefined, allowing for simulations with infinite durations. You can initialize up to 100 agents, with a default number of 20. In the event of collisions during training, the scenario reinitializes all agents, randomly assigning them new reference paths, initial positions, and speeds. This setup is designed to simulate the unpredictability of real-world driving. Besides, the observations are designed to promote sample efficiency and generalization (i.e., agents' ability to generalize to unseen scenarios). In addition, both ego view and bird's-eye view are implemented; partial observation is also supported to simulate partially observable Markov Decision Processes. See [this paper](http://dx.doi.org/10.13140/RG.2.2.24505.17769) for more info. | <img src="https://github.com/matteobettini/vmas-media/blob/main/media/scenarios/road_traffic_cpm_lab.gif?raw=true" alt="drawing" width="300"/> |
+| `road_traffic.py`       | This scenario provides a MARL benchmark for Connected and Automated Vehicles (CAVs) using a High-Definition (HD) map from the Cyber-Physical Mobility Lab ([CPM Lab](https://cpm.embedded.rwth-aachen.de/)), an open-source testbed for CAVs. The map features an eight-lane intersection and a loop-shaped highway with multiple merge-in and -outs, offering a range of challenging traffic conditions. Forty loop-shaped reference paths are predefined, allowing for simulations with infinite durations. You can initialize up to 100 agents, with a default number of 20. In the event of collisions during training, the scenario reinitializes all agents, randomly assigning them new reference paths, initial positions, and speeds. This setup is designed to simulate the unpredictability of real-world driving. Besides, the observations are designed to promote sample efficiency and generalization (i.e., agents' ability to generalize to unseen scenarios). In addition, both ego view and bird's-eye view are implemented; partial observation is also supported to simulate partially observable Markov Decision Processes. See [this paper](https://arxiv.org/abs/2408.07644) for more info. | <img src="https://github.com/matteobettini/vmas-media/blob/main/media/scenarios/road_traffic_cpm_lab.gif?raw=true" alt="drawing" width="300"/> |
 
 #### Debug scenarios
 

diff --git a/vmas/scenarios/road_traffic.py b/vmas/scenarios/road_traffic.py
@@ -21,7 +21,7 @@
 class Scenario(BaseScenario):
     """
     This scenario originally comes from the paper "Xu et al. - 2024 - A Sample Efficient and Generalizable Multi-Agent Reinforcement Learning Framework
-    for Motion Planning" (http://dx.doi.org/10.13140/RG.2.2.24505.17769, see also its GitHub repo https://github.com/cas-lab-munich/generalizable-marl/tree/1.0.0),
+    for Motion Planning" (https://arxiv.org/abs/2408.07644, see also its GitHub repo https://github.com/bassamlab/SigmaRL),
     which aims to design an MARL framework with efficient observation design to enable fast training and to empower agents the ability to generalize
     to unseen scenarios.
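
The two hunks above amount to swapping old URLs for new ones. As a minimal sketch of how the same substitution could be applied or checked elsewhere, the snippet below maps each old link from this diff to its replacement. The helper `update_links` and the `LINK_UPDATES` table are hypothetical names introduced only for illustration; they are not part of the VMAS repository, and only the URLs themselves come from the diff.

```python
# Hypothetical helper for illustration only; not part of the VMAS repository.
# The URLs are the ones that appear in the removed/added lines of this commit.
LINK_UPDATES = {
    "http://dx.doi.org/10.13140/RG.2.2.24505.17769": "https://arxiv.org/abs/2408.07644",
    "https://github.com/cas-lab-munich/generalizable-marl/tree/1.0.0": "https://github.com/bassamlab/SigmaRL",
}


def update_links(text, updates=LINK_UPDATES):
    """Replace every old URL in `text` with its new URL.

    Returns the rewritten text and the number of replacements made.
    """
    total = 0
    for old, new in updates.items():
        total += text.count(old)        # count occurrences before replacing
        text = text.replace(old, new)   # plain substring replacement, no regex
    return text, total


# Example: the docstring line removed by this commit.
old_line = (
    'for Motion Planning" (http://dx.doi.org/10.13140/RG.2.2.24505.17769, '
    "see also its GitHub repo "
    "https://github.com/cas-lab-munich/generalizable-marl/tree/1.0.0),"
)
new_line, n_replaced = update_links(old_line)
print(n_replaced)
print(new_line)
```

Plain substring replacement is used rather than a regular expression because the old URLs are fixed strings; a count of zero on a second run is a quick way to confirm that no stale links remain.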