Release branch for v0.101
============================== Release Notes: v0.101 ==============================

Support for new training algorithms:

Support for new network structures:
 - ATOM VAE model
 - Graph neural networks
 - Graph Convolutional Networks (GCN)
 - 3D U-Net Model

Support for new layers:
 - Implemented optimized GRU layer using cuDNN kernel
 - Graph Layers: GCN, GIN, Graph, GatedGraph

Python front-end:
 - Support for Graph and Graph Convolutional Networks
 - Added support for OLCF data center (Summit)

Performance optimizations:
 - Optimize CUDA kernel for tensor reordering in GRU layer
 - Enabled TensorCore optimization for GRU layer
 - GCN and Graph layers also have a faster Dense variant that uses
   only matrix multiplication
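
As a rough illustration of what a dense, matmul-only graph convolution can look like, here is a minimal pure-Python sketch. It assumes the common symmetric normalization D^-1/2 (A + I) D^-1/2 and a dense adjacency matrix; the function names are hypothetical and this is not LBANN's implementation or API.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed normalization, not confirmed by the notes)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def dense_gcn_layer(adj, features, weights):
    """One propagation step without activation: A_norm @ X @ W."""
    return matmul(matmul(normalized_adjacency(adj), features), weights)

# Two nodes joined by one edge, one input feature, one output channel.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[2.0]]
out = dense_gcn_layer(adj, features, weights)
```

With a dense adjacency matrix the whole layer reduces to two matrix products, which is the kind of operation GPUs handle well; the trade-off is memory that grows quadratically with the number of nodes.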

Model portability & usability:
 - Added a Users Quickstart section to the documentation, including a
   PyTorch-to-LBANN mini-tutorial
 - Added section on callbacks with detailed instructions on summarize
   images callback

Internal features:
 - Support for double data type in distributed embedding layer
 - Support for large number of channels in GPU batchnorm layer
 - Modified LTFB so that NaNs lose tournaments
 - Improved numerical stability of reconstruction loss in ATOM VAE
   model
 - Skip bad gradients in Adam
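
One plausible reading of "skip bad gradients" is that an Adam step is dropped entirely when the gradient contains a NaN or infinity, so neither the parameters nor the moment estimates are contaminated. The sketch below illustrates that idea in plain Python; `adam_step` and its state layout are hypothetical and are not taken from LBANN's code.

```python
import math

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of floats.

    Returns (new_param, new_state, applied). If any gradient entry is
    non-finite, the step is skipped and inputs are returned unchanged.
    """
    if not all(math.isfinite(g) for g in grad):
        return param, state, False  # assumption: a bad gradient skips the whole step
    t = state["t"] + 1
    # Exponential moving averages of the gradient and its square.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(state["m"], grad)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(state["v"], grad)]
    # Bias-corrected moment estimates.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    new_param = [p - lr * mh / (math.sqrt(vh) + eps)
                 for p, mh, vh in zip(param, m_hat, v_hat)]
    return new_param, {"t": t, "m": m, "v": v}, True

params = [0.0, 0.0]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}

# A gradient containing NaN leaves parameters and state untouched.
params, state, applied_bad = adam_step(params, [float("nan"), 1.0], state)

# A finite gradient is applied as usual.
params, state, applied_good = adam_step(params, [1.0, -1.0], state)
```

Skipping the step also leaves the step counter unchanged, so bias correction is not affected by the discarded gradient.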

I/O & data readers:
 - Added support for ImageNet data reader to use sample lists
 - Refactored sample list code to be more flexible and to generalize
   beyond the JAG data reader
 - Added support for slab-based I/O in HDF5 data reader required by
   DistConv implementations of CosmoFlow 3D volumes
 - Extended slab-based HDF5 data reader to support labels and
   reconstruction modes for use with U-Net architecture

Datasets:
 - Added two graph datasets (MNIST and PROTEINS)

Build system and Dependent Libraries:
 - Hydrogen 1.4.0
 - Aluminum 0.4.0
 - Spack v0.15.4+ (Requires new format for environments)
 - cuDNN 8.0.2
 - Require C++14
 - Added Spack build support for OLCF data center (Summit)

Bug fixes:
 - Properly reset data coordinator after each LTFB round
 - Fixed bug in weights proxy when weights buffer is reallocated
 - Fixed SMILES data reader bounds checking and simple LTFB data
   distribution
 - Eliminated a race condition observed in the ATOM VAE model with the
   SMILES data reader by adding a barrier after each data store
   mini-batch exchange. The barrier avoids a race between non-blocking
   sends and receives and later GPU kernel communication.

Retired features:
bvanessen committed Sep 29, 2020
1 parent 97aa4d8 commit 6a0f8bf
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -48,7 +48,7 @@ endif ()
#

set(LBANN_VERSION_MAJOR 0)
-set(LBANN_VERSION_MINOR 100)
+set(LBANN_VERSION_MINOR 101)
set(LBANN_VERSION_PATCH 0)

set(LBANN_VERSION "${LBANN_VERSION_MAJOR}.${LBANN_VERSION_MINOR}.${LBANN_VERSION_PATCH}")
2 changes: 1 addition & 1 deletion ReleaseNotes.txt
@@ -21,7 +21,7 @@ Bug fixes:

Retired features:

-============================== (Pending) Release Notes: v0.101 ==============================
+============================== Release Notes: v0.101 ==============================

Support for new training algorithms:
