============================== Release Notes: v0.94 ==============================

Support for new training algorithms:

  • Back-Propagation Through Time (BPTT)
    -- Recurrent Neural Networks (RNN)
    -- Long Short-Term Memory (LSTM) networks
  • Generative Adversarial Networks (GAN)
  • Variational autoencoders
  • Convolutional autoencoders
  • Fine-tuning of pretrained networks
    -- Flexible weight freezing
  • Context-prediction network (Siamese network)
  • Livermore Tournament Fast Batch learning (LTFB)
  • Variable mini-batch sizes

Support for new network structures:

  • Directed Acyclic Graph (DAG) networks
  • Residual networks
  • Modular and composable objective functions
  • Multiple metrics
  • Shared weight matrices
  • (BETA) New evaluation layer that can be attached to any point of the DAG
  • Motifs (compound, reused network patterns)

Support for new layers:

  • Learning:
    -- Deconvolution
  • Metrics:
    -- Top-K Categorical Accuracy, Pearson Correlation, Mean Absolute Deviation
  • Loss Functions:
    -- Cross Entropy with Uncertainty, Geometric Negative Log Likelihood
    -- Poisson Negative Log Likelihood, Polya Negative Log Likelihood
  • Optimizers:
    -- Hypergradient Adam
  • Transform Layers:
    -- Concatenation, Noise, Unpooling, Pooling, Reshape, Slice, Split, Sum
  • Regularizers:
    -- Batch Normalization, SELU Dropout, Local Response Normalization (LRN)
  • Activations:
    -- Leaky ReLU, Smooth ReLU, ELU, Scaled ELU, Softplus, Atan,
    -- Bent Identity, Exponential

Performance optimizations:

  • GPU acceleration for most layers
  • NCCL 2.X
  • Optimized communication patterns
  • Asynchronous weight updates
  • Asynchronous metric and objective function updates
  • Batch normalization (global and local)
  • L2 normalization
  • Adaptive Quantization (inter-model)

Model portability & usability:

  • Portable checkpoints / recovery
  • Distributed checkpoint / recovery
  • Network visualization
  • Export of LBANN models to TensorFlow format

Internal features:

  • Gradient checking
  • Network representation using tensor dimensions
  • Bamboo continuous integration (CI)
  • Improved data processing pipeline

New data readers:

  • NumPy
  • CSV
  • Methods for merging multiple features and samples across files
  • CANDLE Pilot 2
  • CANDLE Pilot 1 Combo
  • ICF JAG

Integration with Hydrogen, an optimized, distributed, dense linear algebra
library. Hydrogen is a fork of the Elemental library and optimizes for
distributed matrices with elemental and block distributions, BLAS and LAPACK
routines, and distributed and local matrix management.
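
For a sense of the programming model, here is a minimal sketch of distributed
dense linear algebra in the Elemental-style El:: API that Hydrogen inherits;
the matrix sizes and operations are illustrative, not taken from LBANN:

    // Minimal sketch of Elemental-style distributed dense linear algebra;
    // Hydrogen inherits this El:: API from Elemental. Illustrative only.
    #include <El.hpp>

    int main(int argc, char* argv[]) {
      El::Environment env(argc, argv);     // initializes MPI and Elemental
      El::Grid grid(El::mpi::COMM_WORLD);  // 2-D process grid over all ranks

      // Default element-wise ("elemental") distribution: matrix entries
      // are cyclically scattered across the process grid.
      El::DistMatrix<double> A(grid), B(grid), C(grid);
      El::Gaussian(A, 1000, 1000);  // fill with N(0,1) entries
      El::Gaussian(B, 1000, 1000);
      El::Zeros(C, 1000, 1000);

      // Distributed GEMM: C := 1.0 * A * B + 0.0 * C
      El::Gemm(El::NORMAL, El::NORMAL, 1.0, A, B, 0.0, C);
      return 0;
    }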

Integration with Aluminum, an optimized all-reduce communication library.
Aluminum provides custom reduction patterns, customized CUDA reduction
kernels, and asynchronous communication operators, and it uses MPI, MPI with
GPUDirect, or NCCL as back-end libraries. Aluminum enables us to effectively
use non-blocking all-reduces during backpropagation and optimization.
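
The non-blocking pattern itself can be sketched with plain MPI-3; Aluminum's
own API differs (and adds GPUDirect and NCCL back-ends), so treat this only
as an illustration of overlapping gradient reduction with computation:

    // Sketch of overlapping gradient aggregation with backprop using a
    // non-blocking all-reduce. Plain MPI-3 for illustration; Aluminum
    // provides this pattern over MPI, MPI w/GPUDirect, or NCCL back-ends.
    #include <mpi.h>
    #include <vector>

    void reduce_gradients(std::vector<double>& grads, MPI_Comm comm) {
      MPI_Request req;
      // Start summing this layer's gradients across all ranks...
      MPI_Iallreduce(MPI_IN_PLACE, grads.data(),
                     static_cast<int>(grads.size()),
                     MPI_DOUBLE, MPI_SUM, comm, &req);
      // ...and keep back-propagating through earlier layers while the
      // reduction is in flight (hypothetical work would go here).
      MPI_Wait(&req, MPI_STATUS_IGNORE);  // gradients now globally summed
    }

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      std::vector<double> grads(1024, 1.0);  // stand-in gradient buffer
      reduce_gradients(grads, MPI_COMM_WORLD);
      MPI_Finalize();
      return 0;
    }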

Additionally, we have added support for an online, distributed data store.
When enabled, LBANN ingests the entire training data set in a distributed
fashion across all ranks. Each data store instance then serves its portion of
a mini-batch, dynamically moving data to the necessary ranks in the model
(based on the mini-batch data distribution).
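
Conceptually the data store behaves like the sketch below, where each rank
owns the samples assigned to it and serves them on demand; the class and
method names here are hypothetical, not the LBANN data store API:

    // Conceptual sketch of an in-memory, distributed sample store. Each
    // rank keeps a shard of the data set and serves samples to whichever
    // rank needs them for the current mini-batch. All names hypothetical.
    #include <cstddef>
    #include <unordered_map>
    #include <vector>

    struct Sample { std::vector<float> data; };

    class DataStore {
    public:
      DataStore(int rank, int nranks) : rank_(rank), nranks_(nranks) {}

      // Round-robin ownership: sample i lives on rank i % nranks.
      int owner(std::size_t index) const {
        return static_cast<int>(index % nranks_);
      }

      // During ingest, each rank keeps only the samples it owns.
      void maybe_store(std::size_t index, Sample s) {
        if (owner(index) == rank_) shard_[index] = std::move(s);
      }

      // For a mini-batch, samples are requested from their owners; in
      // LBANN this exchange happens dynamically via message passing.
      const Sample& serve(std::size_t index) const {
        return shard_.at(index);
      }

    private:
      int rank_, nranks_;
      std::unordered_map<std::size_t, Sample> shard_;
    };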