This repository contains the code for the paper "Fleet Supervisor Allocation: A Submodular Maximization Approach" presented at CoRL 2024. This work introduces an adaptive allocation policy for robotic fleets to maximize data collection efficiency while accommodating uncertain connectivity.
This repository builds upon concepts from the Interactive Fleet Learning Benchmark repository, extending it to accommodate operational uncertainties in allocation policies.
In real-world scenarios, the data collected by robots in diverse and unpredictable environments is crucial for enhancing their models and policies. This data is predominantly collected under human supervision, particularly through imitation learning (IL), where robots learn complex tasks by observing human supervisors. However, the deployment of multiple robots and supervisors to accelerate the learning process often leads to data redundancy and inefficiencies, especially as the scale of robot fleets increases. Moreover, the reliance on teleoperation for supervision introduces additional challenges due to potential network connectivity issues. To address these inefficiencies and the reliability concerns of network-dependent supervision, we introduce an Adaptive Submodular Allocation policy, ASA, designed for efficient human supervision allocation within multi-robot systems under uncertain connectivity. Our approach reduces data redundancy by balancing the informativeness and diversity of data collection, and is capable of accommodating connectivity variances. We evaluated the effectiveness of ASA in simulation environments with 100 robots across four different environments and various network settings, including a real-world teleoperation scenario over a 5G network. We trained and tested both our policy and the state-of-the-art policies using NVIDIA's Isaac Gym, and our results show that ASA enhances the return on human effort by up to 5.95×, outperforming current baselines in all simulated scenarios and providing robustness against connectivity disruptions.
Supervisor Allocation Problem: At each time step
Experimental Results: Each row represents a different network configuration and each column corresponds to a different environment. Performance is measured by Return on Human Effort (RoHE). The ASA and n-ASA policies are affected least by changes in the network configuration because their stochastic submodular maximization formulation can incorporate network uncertainties. The submodular maximization objective also improves performance when there are no network uncertainties, owing to its ability to cover diverse and informative scenarios. Additionally, ASA outperforms n-ASA in the Changing-Scarce network configuration because it adapts to changes in network connectivity.
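To make the idea of submodular allocation under uncertain connectivity more concrete, here is a minimal sketch of greedy selection on a toy objective. It is not the ASA policy from the paper: the objective (expected coverage of hypothetical "scenario" labels), the names `expected_coverage` and `greedy_allocate`, and the per-robot connection probabilities are all assumptions made for illustration. The toy objective is monotone and submodular, so the standard greedy rule is a natural fit for picking which robots receive the limited supervisors.

```python
from typing import Dict, Hashable, List, Set


def expected_coverage(
    selected: List[int],
    scenarios: Dict[int, Set[Hashable]],
    p_connect: Dict[int, float],
) -> float:
    """Expected number of distinct scenarios covered by the selected robots.

    A scenario counts as covered if at least one selected robot that exhibits
    it is reachable; robot i is reachable with probability p_connect[i],
    independently of the others (an assumption of this toy model).
    """
    covered: Set[Hashable] = set()
    for i in selected:
        covered |= scenarios[i]
    total = 0.0
    for s in covered:
        p_miss = 1.0
        for i in selected:
            if s in scenarios[i]:
                p_miss *= 1.0 - p_connect[i]
        total += 1.0 - p_miss
    return total


def greedy_allocate(
    scenarios: Dict[int, Set[Hashable]],
    p_connect: Dict[int, float],
    num_supervisors: int,
) -> List[int]:
    """Pick up to num_supervisors robots by largest marginal gain."""
    selected: List[int] = []
    remaining = set(scenarios)
    for _ in range(min(num_supervisors, len(scenarios))):
        base = expected_coverage(selected, scenarios, p_connect)
        # sorted() makes tie-breaking deterministic (lowest robot id wins).
        best = max(
            sorted(remaining),
            key=lambda i: expected_coverage(selected + [i], scenarios, p_connect) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Robots 0 and 1 see the same scenarios; robot 2 sees something new
# but has a weaker connection.
scenarios = {0: {"a", "b"}, 1: {"a", "b"}, 2: {"c"}}
p_connect = {0: 0.9, 1: 0.9, 2: 0.5}
allocation = greedy_allocate(scenarios, p_connect, num_supervisors=2)
print(allocation)  # [0, 2]
```

In this toy case the second supervisor goes to robot 2 rather than robot 1: the redundant robot adds only 0.18 expected scenarios, while the less reliable but novel robot adds 0.5. This mirrors, in a very simplified way, the trade-off between informativeness, diversity, and connectivity described above.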
- Python 3.9.7
- Conda for managing dependencies
- NVIDIA Isaac Gym 1.0rc4: Download from NVIDIA Isaac Gym and place it under the isaacgym directory.
- Clone the repository, then initialize the submodules if needed:
git submodule update --init
- Create a Conda environment for the project:
conda env create -f environment.yml
conda activate fsa
- Install Isaac Gym and IsaacGymEnvs:
cd isaacgym/python
pip install -e .
cd ../../IsaacGymEnvs
pip install -e .
Run the allocation policies for different environments using the provided scripts. To change the network configuration, modify the network_type variable in the respective run_[ENV_NAME].sh file.
- Humanoid:
. scripts/run_humanoid.sh
- Allegro Hand:
. scripts/run_allegro.sh
- ANYmal:
. scripts/run_anymal.sh
- Ball Balance:
. scripts/run_ballbalance.sh
All experiment logs are saved under logs/. Each task has a dedicated directory (e.g., logs/humanoid). To generate plots from experiment logs, run:
python plotting/plot.py logs/humanoid [METRIC]
where [METRIC] can be ROHE, cumulative_success, or other defined metrics.
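If you want to plot several metrics in one go, a small wrapper along these lines may help. This is a convenience sketch, not part of the repository: `build_plot_commands` and `run_all` are hypothetical helpers, and the only thing taken from this README is the command shape `python plotting/plot.py logs/<task> <metric>`. Only logs/humanoid is confirmed above; directory names for the other tasks are not stated here, so check logs/ before adding them.

```python
import subprocess
import sys
from typing import List


def build_plot_commands(
    tasks: List[str], metrics: List[str], log_root: str = "logs"
) -> List[List[str]]:
    """Build one plotting command per (task, metric) pair."""
    return [
        [sys.executable, "plotting/plot.py", f"{log_root}/{task}", metric]
        for task in tasks
        for metric in metrics
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the plot script fails.
            subprocess.run(cmd, check=True)


commands = build_plot_commands(["humanoid"], ["ROHE", "cumulative_success"])
run_all(commands)  # dry run: only prints the commands
```

Run it from the repository root with `dry_run=False` once the printed commands look right.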
If you use this code in your research, please cite our paper:
@inproceedings{
akcin2024fleet,
title={Fleet Supervisor Allocation: A Submodular Maximization Approach},
author={Oguzhan Akcin and Ahmet Ege Tanriverdi and Kaan Kale and Sandeep P. Chinchali},
booktitle={8th Annual Conference on Robot Learning},
year={2024},
url={https://openreview.net/forum?id=9dsBQhoqVr}
}