Release branch for v0.101 · LLNL/lbann@6a0f8bf

============================== Release Notes: v0.101 ==============================

Support for new training algorithms:

Support for new network structures:
 - ATOM VAE model
 - Graph neural networks
 - Graph Convolutional Networks (GCN)
 - 3D U-Net Model

Support for new layers:
 - Implemented optimized GRU layer using cuDNN kernel
 - Graph Layers: GCN, GIN, Graph, GatedGraph

Python front-end:
 - Support for Graph and Graph Convolutional Networks
 - Added support for OLCF data center (Summit)

Performance optimizations:
 - Optimized CUDA kernel for tensor reordering in GRU layer
 - Enabled TensorCore optimization for GRU layer
 - GCN and Graph layers also have a faster Dense variant that uses
   only matrix multiplication
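
As a rough illustration of what a matrix-multiplication-only graph convolution looks like, the sketch below applies the commonly used GCN propagation rule (adjacency times features times weights) to one small graph. It is not LBANN code: the function names are hypothetical, adjacency normalization and the nonlinearity are omitted, and whether LBANN's Dense variant computes exactly this is an assumption.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def dense_gcn_layer(adjacency, features, weights):
    """Hypothetical dense graph convolution: A @ X @ W, nothing else."""
    return matmul(matmul(adjacency, features), weights)


# Path graph 0 - 1 - 2 with self-loops (unnormalized, for readability).
adjacency = [[1, 1, 0],
             [1, 1, 1],
             [0, 1, 1]]
# One row of input features per node.
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
# Projects two input channels to one output channel.
weights = [[1.0],
           [2.0]]

output = dense_gcn_layer(adjacency, features, weights)
print(output)
```

With a dense adjacency matrix, neighbor aggregation becomes a plain matrix product, which avoids the gather and scatter steps an edge-list formulation needs.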

Model portability & usability:
 - Added Users Quickstart section to documentation, including a
   PyTorch-to-LBANN mini-tutorial
 - Added section on callbacks with detailed instructions on the
   summarize images callback

Internal features:
 - Support for double data type in distributed embedding layer
 - Support for large number of channels in GPU batchnorm layer
 - Modified LTFB so that NaNs lose tournaments
 - Improved numerical stability of reconstruction loss in ATOM VAE
   model
 - Skip bad gradients in Adam
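
The two NaN-related items above can be sketched as follows. This is a minimal illustration, not LBANN's implementation: the function names are hypothetical, the tournament assumes a lower score is better, and leaving the Adam moments and step count untouched on a bad gradient is an assumption about what "skip" means.

```python
import math


def tournament_winner(local_score, partner_score):
    """Pick a tournament winner; a NaN score always loses.

    Assumes lower is better. Returns "local" or "partner".
    """
    if math.isnan(partner_score):
        return "local"
    if math.isnan(local_score):
        return "partner"
    return "partner" if partner_score < local_score else "local"


def adam_step(param, grad, m, v, t,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam update that skips non-finite gradients.

    Returns the updated (param, m, v, t). A NaN or infinite gradient
    leaves the parameter and all optimizer state unchanged.
    """
    if not math.isfinite(grad):
        return param, m, v, t
    t += 1
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v, t


nan = float("nan")
print(tournament_winner(nan, 0.5))
print(adam_step(1.0, nan, 0.0, 0.0, 0))
```

The explicit `math.isnan` checks matter because every ordered comparison against NaN is false, so a plain `<` test could let a diverged model keep or win its slot depending on which side of the comparison it lands on.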

I/O & data readers:
 - Added support for ImageNet data reader to use sample lists
 - Refactored sample list code to be more flexible and generalize
   beyond JAG data reader
 - Added support for slab-based I/O in HDF5 data reader required by
   DistConv implementations of CosmoFlow 3D volumes
 - Extended slab-based HDF5 data reader to support labels and
   reconstruction modes for use with U-Net architecture

Datasets:
 - Added two graph datasets (MNIST and PROTEINS)

Build system and Dependent Libraries:
 - Hydrogen 1.4.0
 - Aluminum 0.4.0
 - Spack v0.15.4+ (requires new format for environments)
 - cuDNN 8.0.2
 - Require C++14
 - Added Spack build support for OLCF data center (Summit)

Bug fixes:
 - Properly reset data coordinator after each LTFB round
 - Fixed bug in weights proxy when weights buffer is reallocated
 - Fixed SMILES data reader bounds checking and simple LTFB data
   distribution
 - Eliminated a race condition observed in ATOM VAE model with SMILES
   data reader by adding a barrier after each data store mini-batch
   exchange. This avoids a race between non-blocking sends and
   receives and later GPU kernel communication.

Retired features:

bvanessen committed Sep 29, 2020

1 parent 97aa4d8 commit 6a0f8bf

CMakeLists.txt

@@ -48,7 +48,7 @@ endif ()
     #
     set(LBANN_VERSION_MAJOR 0)
-    set(LBANN_VERSION_MINOR 100)
+    set(LBANN_VERSION_MINOR 101)
     set(LBANN_VERSION_PATCH 0)
     set(LBANN_VERSION "${LBANN_VERSION_MAJOR}.${LBANN_VERSION_MINOR}.${LBANN_VERSION_PATCH}")

ReleaseNotes.txt

@@ -21,7 +21,7 @@ Bug fixes:
     Retired features:
-    ============================== (Pending) Release Notes: v0.101 ==============================
+    ============================== Release Notes: v0.101 ==============================
     Support for new training algorithms:
