Added documentation for release 1.33.0
Signed-off-by: Bharath Ramaswamy <[email protected]>
quic-bharathr committed Sep 12, 2024
1 parent 8de99be commit 1516933
Showing 259 changed files with 3,256 additions and 5,104 deletions.
58 changes: 25 additions & 33 deletions releases/1.33.0/Examples/onnx/quantization/adaround.html
@@ -1,8 +1,7 @@
<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />

<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Adaptive Rounding (AdaRound) &mdash; AI Model Efficiency Toolkit Documentation: ver 1.33.0</title>
<link rel="stylesheet" type="text/css" href="../../../_static/pygments.css" />
@@ -64,7 +63,6 @@
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/quantization_sim.html#determining-quantization-parameters-encodings">Determining Quantization Parameters (Encodings)</a></li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/quantization_sim.html#quantization-schemes">Quantization Schemes</a></li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/quantization_sim.html#configuring-quantization-simulation-ops">Configuring Quantization Simulation Ops</a></li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/quantization_sim.html#quantization-simulation-apis">Quantization Simulation APIs</a></li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/quantization_sim.html#frequently-asked-questions">Frequently Asked Questions</a></li>
</ul>
</li>
@@ -79,34 +77,29 @@
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/auto_quant.html">AutoQuant</a><ul>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/auto_quant.html#overview">Overview</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/auto_quant.html#workflow">Workflow</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/auto_quant.html#autoquant-api">AutoQuant API</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/adaround.html">Adaptive Rounding (AdaRound)</a><ul>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/adaround.html#adaround-use-cases">AdaRound Use Cases</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/adaround.html#common-terminology">Common terminology</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/adaround.html#use-cases">Use Cases</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/adaround.html#adaround-api">AdaRound API</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html">Cross-Layer Equalization</a><ul>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#overview">Overview</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#user-flow">User Flow</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#cross-layer-equalization-api">Cross-Layer Equalization API</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#faqs">FAQs</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#references">References</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/bn_reestimation.html">BN Re-estimation</a><ul>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/bn_reestimation.html#overview">Overview</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/bn_reestimation.html#workflow">Workflow</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/bn_reestimation.html#bn-re-estimation-api">BN Re-estimation API</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html">Bias Correction [Depricated]</a><ul>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#overview">Overview</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#user-flow">User Flow</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#cross-layer-equalization-api">Cross-Layer Equalization API</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#faqs">FAQs</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/post_training_quant_techniques.html#references">References</a></li>
</ul>
@@ -118,7 +111,6 @@
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/quant_analyzer.html#overview">Overview</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/quant_analyzer.html#requirements">Requirements</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/quant_analyzer.html#detailed-analysis-descriptions">Detailed Analysis Descriptions</a></li>
<li class="toctree-l5"><a class="reference internal" href="../../../user_guide/quant_analyzer.html#quantanalyzer-api">QuantAnalyzer API</a></li>
</ul>
</li>
<li class="toctree-l4"><a class="reference internal" href="../../../user_guide/visualization_quant.html">Visualizations</a><ul>
@@ -1114,24 +1106,24 @@
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">

<section id="Adaptive-Rounding-(AdaRound)">
<div class="section" id="Adaptive-Rounding-(AdaRound)">
<h1>Adaptive Rounding (AdaRound)<a class="headerlink" href="#Adaptive-Rounding-(AdaRound)" title="Permalink to this heading"></a></h1>
<p>This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).</p>
<p>AIMET quantization features typically use the &ldquo;nearest rounding&rdquo; technique: each weight value is quantized to the nearest integer value.</p>
<p>AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.</p>
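To make the rounding decision concrete, here is a small illustrative sketch (not from the notebook; the weight and scale values are made up) contrasting nearest rounding with the floor/ceil choice that AdaRound optimizes per weight:

```python
import numpy as np

# Quantizing a weight w with scale s: nearest rounding always takes
# round(w / s); AdaRound instead learns, per weight, whether to take
# floor(w / s) or ceil(w / s) so as to minimize the layer's output error.
w, s = 0.63, 0.1
nearest = np.round(w / s)                       # 6.0 -> dequantizes to 0.60
lower, upper = np.floor(w / s), np.ceil(w / s)  # 6.0 or 7.0 (0.60 or 0.70)
# AdaRound picks lower or upper based on a loss measured on unlabeled
# calibration data, not on which integer is numerically closer.
```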
<section id="Overall-flow">
<div class="section" id="Overall-flow">
<h2>Overall flow<a class="headerlink" href="#Overall-flow" title="Permalink to this heading"></a></h2>
<p>This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Convert an FP32 PyTorch model to ONNX and evaluate the model&rsquo;s baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score</p>
</section>
<section id="What-this-notebook-is-not">
</div>
<div class="section" id="What-this-notebook-is-not">
<h2>What this notebook is not<a class="headerlink" href="#What-this-notebook-is-not" title="Permalink to this heading"></a></h2>
<ul class="simple">
<li><p>This notebook is not designed to show state-of-the-art results</p></li>
<li><p>For example, it uses a relatively quantization-friendly model like Resnet18</p></li>
<li><p>Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly</p></li>
</ul>
<hr class="docutils" />
<section id="Dataset">
<div class="section" id="Dataset">
<h3>Dataset<a class="headerlink" href="#Dataset" title="Permalink to this heading"></a></h3>
<p>This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. <a class="reference external" href="https://image-net.org/challenges/LSVRC/2012/index.php#">https://image-net.org/challenges/LSVRC/2012/index.php#</a>).</p>
<p><strong>Note1</strong>: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics: - Subfolders &lsquo;train&rsquo; for the training samples and &lsquo;val&rsquo; for the validation samples. Please see the <a class="reference external" href="https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html">pytorch dataset description</a> for more details. - A subdirectory per class, and a file per image sample.</p>
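As a hedged sketch of the layout this dataloader expects (DATASET_DIR and the batch size are placeholder assumptions, not values from the notebook), a torchvision ImageFolder loader over the &lsquo;val&rsquo; subfolder might look like:

```python
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import ImageFolder

DATASET_DIR = '/path/to/imagenet'  # placeholder; must contain train/ and val/

# Standard ImageNet evaluation preprocessing; ImageFolder assumes one
# subdirectory per class with one file per image sample.
val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
val_loader = DataLoader(ImageFolder(DATASET_DIR + '/val', val_transforms),
                        batch_size=64, shuffle=False)
```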
@@ -1145,9 +1137,9 @@ <h3>Dataset<a class="headerlink" href="#Dataset" title="Permalink to this headin
</pre></div>
</div>
</div>
</section>
</div>
<hr class="docutils" />
<section id="1.-Example-evaluation-and-training-pipeline">
<div class="section" id="1.-Example-evaluation-and-training-pipeline">
<h3>1. Example evaluation and training pipeline<a class="headerlink" href="#1.-Example-evaluation-and-training-pipeline" title="Permalink to this heading"></a></h3>
<p>The following is an example training and validation loop for this image classification task.</p>
<ul class="simple">
@@ -1192,9 +1184,9 @@ <h3>1. Example evaluation and training pipeline<a class="headerlink" href="#1.-E
</pre></div>
</div>
</div>
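The notebook&rsquo;s pipeline cell is collapsed in this diff. As a minimal stand-in (the function name and interface below are assumptions, not the notebook&rsquo;s exact cell), an ONNX Runtime evaluation loop of the kind described above could be:

```python
import numpy as np
import onnxruntime as ort

def evaluate_accuracy(session: ort.InferenceSession, data_loader) -> float:
    """Top-1 accuracy of an ONNX Runtime session over a validation loader."""
    input_name = session.get_inputs()[0].name
    correct = total = 0
    for images, labels in data_loader:
        # Run the ONNX model on one batch and compare argmax to labels.
        logits = session.run(None, {input_name: images.numpy()})[0]
        correct += int((np.argmax(logits, axis=1) == labels.numpy()).sum())
        total += labels.shape[0]
    return correct / total
```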
</section>
</div>
<hr class="docutils" />
<section id="2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy">
<div class="section" id="2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy">
<h3>2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy<a class="headerlink" href="#2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy" title="Permalink to this heading"></a></h3>
<p>For this example notebook, we load a pretrained resnet18 model from torchvision. You can instead load any pretrained PyTorch model, or convert a model trained in a different framework altogether.</p>
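A hedged sketch of this step (the output filename is a placeholder): load the torchvision ResNet-18 and export it to ONNX with torch.onnx.export:

```python
import torch
from torchvision.models import resnet18

# Load a pretrained FP32 ResNet-18 and export it to ONNX.
pt_model = resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(pt_model, dummy_input, 'resnet18.onnx',
                  input_names=['input'], output_names=['output'])
```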
<div class="nbinput nblast docutils container">
@@ -1257,12 +1249,12 @@ <h3>2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline
</pre></div>
</div>
</div>
</section>
</div>
<hr class="docutils" />
<section id="3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy">
<div class="section" id="3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy">
<h3>3. Create a quantization simulation model and determine quantized accuracy<a class="headerlink" href="#3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy" title="Permalink to this heading"></a></h3>
</section>
<section id="Fold-Batch-Normalization-layers">
</div>
<div class="section" id="Fold-Batch-Normalization-layers">
<h3>Fold Batch Normalization layers<a class="headerlink" href="#Fold-Batch-Normalization-layers" title="Permalink to this heading"></a></h3>
<p>Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.</p>
<p><strong>Why do we need to do this?</strong></p>
@@ -1280,8 +1272,8 @@ <h3>Fold Batch Normalization layers<a class="headerlink" href="#Fold-Batch-Norma
</pre></div>
</div>
</div>
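The folding cell itself is collapsed above; a minimal sketch, assuming the batch-norm folding entry point exposed by aimet_onnx in this release, would be:

```python
from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight

# Folds BN layers into the weights of adjacent convolution layers in place;
# BN layers that cannot be folded are left as they are.
# (Module path and function name assumed from AIMET's ONNX examples.)
fold_all_batch_norms_to_weight(onnx_model)
```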
</section>
<section id="Create-Quantization-Sim-Model">
</div>
<div class="section" id="Create-Quantization-Sim-Model">
<h3>Create Quantization Sim Model<a class="headerlink" href="#Create-Quantization-Sim-Model" title="Permalink to this heading"></a></h3>
<p>Now we use AIMET to create a QuantizationSimModel. This means that AIMET will insert fake quantization ops in the model graph and configure them. A few of the parameters are explained here - <strong>quant_scheme</strong>: We set this to &ldquo;QuantScheme.post_training_tf_enhanced&rdquo;. Supported options are &lsquo;tf&rsquo; and &lsquo;tf_enhanced&rsquo;, or the Quant Scheme Enum values QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced - <strong>default_activation_bw</strong>: Setting this to 8 means that we are
asking AIMET to perform all activation quantizations in the model using integer 8-bit precision - <strong>default_param_bw</strong>: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision</p>
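Putting those parameters together, a hedged sketch of the sim-model construction (the exact signature is assumed from the aimet_onnx quantsim API) is:

```python
from aimet_common.defs import QuantScheme
from aimet_onnx.quantsim import QuantizationSimModel

# onnx_model is the BN-folded model from the previous step.
sim = QuantizationSimModel(model=onnx_model,
                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                           default_activation_bw=8,
                           default_param_bw=8)
```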
Expand All @@ -1303,8 +1295,8 @@ <h3>Create Quantization Sim Model<a class="headerlink" href="#Create-Quantizatio
</pre></div>
</div>
</div>
</section>
<section id="Compute-Encodings">
</div>
<div class="section" id="Compute-Encodings">
<h3>Compute Encodings<a class="headerlink" href="#Compute-Encodings" title="Permalink to this heading"></a></h3>
<p>Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node.</p>
<p>For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.</p>
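A hedged sketch of this calibration step (the callback contract of an ONNX Runtime session plus an opaque argument, and the 1,000-sample budget, are assumptions rather than the notebook&rsquo;s exact cell):

```python
def pass_calibration_data(session, _):
    """Run a fixed number of unlabeled samples through the sim model so
    AIMET can collect activation range statistics."""
    input_name = session.get_inputs()[0].name
    samples_seen = 0
    for images, _labels in val_loader:  # val_loader sketched earlier
        session.run(None, {input_name: images.numpy()})
        samples_seen += images.shape[0]
        if samples_seen >= 1000:  # small calibration budget; an assumption
            break

# compute_encodings invokes the callback, then fixes scale/offset
# quantization parameters for every quantizer node.
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
                      forward_pass_callback_args=None)
```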
@@ -1357,9 +1349,9 @@ <h3>Compute Encodings<a class="headerlink" href="#Compute-Encodings" title="Perm
</pre></div>
</div>
</div>
</section>
</div>
<hr class="docutils" />
<section id="4.-Apply-Adaround">
<div class="section" id="4.-Apply-Adaround">
<h3>4. Apply Adaround<a class="headerlink" href="#4.-Apply-Adaround" title="Permalink to this heading"></a></h3>
<p>We can now apply AdaRound to this model.</p>
<p>Some of the parameters for AdaRound are described below</p>
@@ -1455,16 +1447,16 @@ <h3>4. Apply Adaround<a class="headerlink" href="#4.-Apply-Adaround" title="Perm
</pre></div>
</div>
</div>
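Since the AdaRound cell is collapsed above, here is a minimal sketch, assuming the Adaround/AdaroundParameters API of aimet_onnx; the iteration counts and paths below are illustrative values, deliberately small for speed:

```python
from aimet_common.defs import QuantScheme
from aimet_onnx.adaround.adaround_weight import Adaround, AdaroundParameters

# Small batch/iteration counts keep the example fast; production runs
# typically use many more iterations.
params = AdaroundParameters(data_loader=val_loader,  # unlabeled data suffices
                            num_batches=4,
                            default_num_iterations=32)
adarounded_model = Adaround.apply_adaround(
    onnx_model, params,
    path='./adaround_artifacts',   # where parameter encodings are written
    filename_prefix='adaround',
    default_param_bw=8,
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)
# The exported encodings are then frozen on the sim model (e.g. via
# sim.set_and_freeze_param_encodings) before encodings are re-computed.
```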
</section>
</div>
<hr class="docutils" />
<section id="Summary">
<div class="section" id="Summary">
<h3>Summary<a class="headerlink" href="#Summary" title="Permalink to this heading"></a></h3>
<p>This example illustrated how the AIMET AdaRound API is invoked to achieve post-training quantization. To use AIMET AdaRound for your specific needs, replace the model and the data pipeline with your own. As noted above, some parameters in this example were deliberately chosen so that it executes faster.</p>
<p>We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.</p>
<p>A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to learn how to use AIMET post-training quantization and QAT techniques</p>
</section>
</section>
</section>
</div>
</div>
</div>


</div>
