V0.3.1 Release Note
The GraphStorm V0.3.1 release contains a few major feature enhancements. In this version, we have reorganized the overall documentation and tutorials to give users a smoother learning curve. The new documentation is organized into four sections: i) Getting Started, which offers a concise tutorial on using GraphStorm; ii) Command Line Interface User Guide, which provides an overview of the GraphStorm command line interfaces (CLI); iii) Programming Interface User Guide, which details the application programming interfaces (API) of GraphStorm; and iv) Advanced Topics, which explores complex subjects such as custom model implementation, link prediction training optimization, and multi-task learning. In addition, we have enhanced the distributed graph processing functionalities to improve the user experience. We also provide four notebook examples that demonstrate the use of GraphStorm APIs in developing custom models and training/inference pipelines.
Major features
- Reorganized the documentation and tutorials to group the main contents under two top-level menus, i.e., `COMMAND LINE INTERFACE USER GUIDE` and `PROGRAMMING INTERFACE USER GUIDE`. #956
  - Under the CLI user guide menu, regrouped the contents into two 2nd-level menus, i.e., `GraphStorm Graph Construction` and `GraphStorm Model Training and Inference`.
    - Under the `GraphStorm Graph Construction`, added a new document, `Input Raw Data Specification`, to explain the specifications of the input data, and provide a simple raw data example. #996
    - Added a new document, `Single Machine Graph Construction`, to introduce the `gconstruct` module, and provide a simple construction configuration JSON example. #996
    - In the `Distributed Graph Construction`, reorganized the document structure of GSProcessing. #907
  - Renamed the `DISTRIBUTED TRAINING` to `GraphStorm Model Training and Inference` and moved it under `COMMAND LINE INTERFACE USER GUIDE`. #956
    - Added a new `Model Training and Inference on a Single Machine` 2nd-level menu to explain the launch commands.
  - Under the `PROGRAMMING INTERFACE USER GUIDE` menu,
  - Refined hard negative tutorial and multi-task learning tutorial. #898 #944
- Added a new GSProcessing launch script for EMR on EC2 that allows users to run a GSProcessing job as an EMR step, simplifying the user experience. #902
New examples
- Add a Jupyter Notebook example for using GraphStorm APIs to implement a GraphStorm built-in GNN model #919
- Add a Jupyter Notebook example for using GraphStorm APIs to customize GNN model components #929
Minor features
- Add a hit@k evaluator for both classification and link prediction tasks. #911 #948
- Remove the restriction that the save-model frequency must be divisible by the evaluation frequency, so users can set the save-model frequency freely. #893 #948
- Added a new `truncate_dim` argument to GSProcessing no-op transformation and for `gconstruct.construct_graph` too. #922
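For readers unfamiliar with the hit@k metric mentioned in the list above, the following is a minimal sketch of its common link prediction definition: the fraction of positive edges whose score ranks within the top k among their sampled negatives. It is a generic illustration only, not GraphStorm's implementation; the function name `hit_at_k`, its arguments, and the tie-handling rule are assumptions made for this example.

```python
from typing import Sequence


def hit_at_k(pos_scores: Sequence[float],
             neg_scores: Sequence[Sequence[float]],
             k: int) -> float:
    """Fraction of positive edges ranked within the top k among their negatives.

    Hypothetical helper for illustration; not part of the GraphStorm API.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if len(pos_scores) != len(neg_scores):
        raise ValueError("each positive edge needs its own list of negative scores")
    if not pos_scores:
        return 0.0
    hits = 0
    for pos, negs in zip(pos_scores, neg_scores):
        # Rank is 1 plus the number of negatives that score strictly higher
        # (assumption: ties are resolved in favor of the positive edge).
        rank = 1 + sum(1 for neg in negs if neg > pos)
        if rank <= k:
            hits += 1
    return hits / len(pos_scores)


# Three positive edges, each scored against three sampled negatives.
pos = [0.9, 0.4, 0.7]
neg = [
    [0.1, 0.2, 0.3],  # positive ranks 1st
    [0.5, 0.6, 0.1],  # positive ranks 3rd
    [0.8, 0.2, 0.1],  # positive ranks 2nd
]
print(hit_at_k(pos, neg, k=1))
print(hit_at_k(pos, neg, k=2))
```

A larger k can only keep or raise the value, so hit@k is usually reported at several cutoffs to show how quickly positives surface in the ranking.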
Breaking changes
- Add a new argument `norm` in the `__init__` of GraphStorm classification and regression decoders. This allows users to set layer or batch normalization on the neural network layers of these decoders. Only `MLPFeatEdgeDecoder` implements the normalization in this release. #948
- Rename the `pos_graph_feat_fields` to `pos_graph_edge_feat_fields` in the `GSgnnLinkPredictionDataLoaderBase` class to make its meaning clearer. #934
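Code written against earlier versions may still use the old name from the rename above. The sketch below shows one way to migrate such call sites, assuming the name is passed as a keyword argument collected in a dictionary. The helper `migrate_kwargs` is hypothetical and is not provided by GraphStorm; it only rewrites the dictionary key and does not call any GraphStorm class.

```python
OLD_NAME = "pos_graph_feat_fields"
NEW_NAME = "pos_graph_edge_feat_fields"


def migrate_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with the pre-0.3.1 keyword renamed.

    Hypothetical helper for illustration; not part of the GraphStorm API.
    """
    migrated = dict(kwargs)  # copy, so the caller's dictionary is left untouched
    if OLD_NAME in migrated:
        if NEW_NAME in migrated:
            # Both spellings present: refuse to guess which one the caller meant.
            raise ValueError(f"pass only {NEW_NAME!r}, not both names")
        migrated[NEW_NAME] = migrated.pop(OLD_NAME)
    return migrated


# Keyword arguments as they might appear in code written before this release
# (the value ["weight"] is a made-up feature name).
legacy_kwargs = {"pos_graph_feat_fields": ["weight"]}
print(migrate_kwargs(legacy_kwargs))
```

In most code bases a direct search-and-replace of the old name is simpler; a shim like this is mainly useful when the same script must run against several GraphStorm versions.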
Contributors
- Xiang Song from AWS
- Jian Zhang from AWS
- Theodore Vasiloudis from AWS
- Runjie Ma from AWS
- Han Xie from AWS