HER with online and offline sampling, bug fixes for features extraction
Pre-release
Breaking Changes
- Warning: Renamed `common.cmd_util` to `common.env_util` for clarity (affects `make_vec_env` and `make_atari_env` functions)
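
For code that has to run against versions on both sides of this rename, a small import fallback is one option. This is a minimal sketch, not part of the library: `import_env_util` is a hypothetical helper name, and the full module paths assume the package is installed as `stable_baselines3`.

```python
import importlib

# Candidate module paths, newest name first.
# Assumption: the package is importable as `stable_baselines3`.
_CANDIDATES = (
    "stable_baselines3.common.env_util",  # name after this release
    "stable_baselines3.common.cmd_util",  # name before this release
)


def import_env_util():
    """Return the env helper module under whichever name exists, or None.

    Hypothetical helper for illustration only.
    """
    for name in _CANDIDATES:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or the whole package) is not available under this name.
            continue
    return None


env_util = import_env_util()
if env_util is not None:
    make_vec_env = env_util.make_vec_env
```

If only the new version needs to be supported, updating the import path directly is simpler than keeping the fallback.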
New Features
- Allow custom actor/critic network architectures using `net_arch=dict(qf=[400, 300], pi=[64, 64])` for off-policy algorithms (SAC, TD3, DDPG)
- Added Hindsight Experience Replay `HER`. (@megan-klaiber)
- `VecNormalize` now supports `gym.spaces.Dict` observation spaces
- Support logging videos to Tensorboard (@SwamyDev)
- Added `share_features_extractor` argument to `SAC` and `TD3` policies
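
To illustrate the idea behind Hindsight Experience Replay, the sketch below relabels an episode with the goal that was actually reached at its end (the "final" strategy). It is a plain-Python illustration of the general technique, not the `HER` class added in this release; `sparse_reward` and `relabel_with_final_goal` are hypothetical names, and the transition layout (dicts with `achieved_goal`, `desired_goal`, `reward`) is an assumption made for the example.

```python
from typing import Callable, Dict, List

Transition = Dict[str, object]


def sparse_reward(achieved_goal, desired_goal) -> float:
    """Assumed reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved_goal == desired_goal else -1.0


def relabel_with_final_goal(
    episode: List[Transition],
    reward_fn: Callable[[object, object], float] = sparse_reward,
) -> List[Transition]:
    """Return copies of the transitions, pretending the goal was the final achieved one."""
    final_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for transition in episode:
        new_transition = dict(transition)  # shallow copy, original stays untouched
        new_transition["desired_goal"] = final_goal
        new_transition["reward"] = reward_fn(transition["achieved_goal"], final_goal)
        relabeled.append(new_transition)
    return relabeled


# A failed episode: the agent wanted to reach 10 but ended at 3.
episode = [
    {"achieved_goal": 1, "desired_goal": 10, "reward": -1.0},
    {"achieved_goal": 2, "desired_goal": 10, "reward": -1.0},
    {"achieved_goal": 3, "desired_goal": 10, "reward": -1.0},
]
hindsight_episode = relabel_with_final_goal(episode)
```

After relabeling, the last transition carries a success reward, which gives an off-policy algorithm a useful learning signal even from an episode that failed its original goal.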
Bug Fixes
- Fix GAE computation for on-policy algorithms (off-by-one for the last value) (thanks @Wovchena)
- Fixed potential issue when loading a different environment
- Fix ignoring the exclude parameter when recording logs using json, csv or log as logging format (@SwamyDev)
- Make `make_vec_env` support the `env_kwargs` argument when using an env ID str (@ManifoldFR)
- Fix model creation initializing CUDA even when `device="cpu"` is provided
- Fix `check_env` not checking if the env has a Dict action space before calling `_check_nan`
(@wmmc88) - Update the check for spaces unsupported by Stable Baselines 3 to include checks on the action space (@wmmc88)
- Fixed feature extractor bug for target network where the same net was shared instead of being separate. This bug affects `SAC`, `DDPG` and `TD3` when using `CnnPolicy` (or custom feature extractor)
- Fixed a bug when passing an environment while loading a saved model with a `CnnPolicy`: the passed env was not wrapped properly (the bug was introduced when implementing `HER` so it should not be present in previous versions)
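
For context on the GAE fix listed above, here is a minimal sketch of Generalized Advantage Estimation in plain Python, showing where the value of the state after the last rollout step enters the recursion. It is not the library's implementation, and the notes do not spell out the exact nature of the off-by-one. The function name `compute_gae` and the convention that `dones[t]` marks an episode ending after step `t` are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,
    gae_lambda: float = 0.95,
) -> Tuple[List[float], List[float]]:
    """Compute advantages and returns for one rollout.

    Assumption: dones[t] is True when the episode ended after step t.
    last_value is the value estimate of the state following the final step.
    """
    n_steps = len(rewards)
    advantages = [0.0] * n_steps
    last_gae = 0.0
    for t in reversed(range(n_steps)):
        # The final step bootstraps from last_value, not from values[t].
        next_value = last_value if t == n_steps - 1 else values[t + 1]
        next_non_terminal = 1.0 - float(dones[t])
        delta = rewards[t] + gamma * next_value * next_non_terminal - values[t]
        last_gae = delta + gamma * gae_lambda * next_non_terminal * last_gae
        advantages[t] = last_gae
    returns = [adv + val for adv, val in zip(advantages, values)]
    return advantages, returns
```

Indexing the bootstrap value or the termination flag one step too early or too late silently biases every advantage in the rollout, which is why this kind of error is easy to miss.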
Others
- Improved typing coverage
- Improved error messages for unsupported spaces
- Added `.vscode` to the gitignore