Fix args at CosineLR, TanhLR schedulers #34

Open · wants to merge 1 commit into base: master
23 changes: 5 additions & 18 deletions README.md
@@ -1,16 +1,3 @@
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**

- [Pytorch Image Models (timm)](#pytorch-image-models-timm)
- [Install](#install)
- [How to use](#how-to-use)
- [Create a model](#create-a-model)
- [List Models with Pretrained Weights](#list-models-with-pretrained-weights)
- [Search for model architectures by Wildcard](#search-for-model-architectures-by-wildcard)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# Pytorch Image Models (timm)
> `timm` is a deep-learning library created by <a href='https://twitter.com/wightmanr'>Ross Wightman</a> and is a collection of SOTA computer vision models, layers, utilities, optimizers, schedulers, data-loaders, augmentations and also training/validating scripts with ability to reproduce ImageNet training results.

@@ -32,7 +19,7 @@ cd pytorch-image-models && pip install -e .

### Create a model

```python
```
import timm
import torch

@@ -45,7 +32,7 @@ It is that simple to create a model using `timm`. The `create_model` function is

To create a pretrained model, simply pass in `pretrained=True`.

```python
```
pretrained_resnet_34 = timm.create_model('resnet34', pretrained=True)
```

@@ -54,7 +41,7 @@ pretrained_resnet_34 = timm.create_model('resnet34', pretrained=True)

To create a model with a custom number of classes, simply pass in `num_classes=<number_of_classes>`.

```python
```
import timm
import torch

@@ -75,7 +62,7 @@ model(x).shape

`timm.list_models()` returns a complete list of available models in `timm`. To have a look at a complete list of pretrained models, pass in `pretrained=True` in `list_models`.

```python
```
avail_pretrained_models = timm.list_models(pretrained=True)
len(avail_pretrained_models), avail_pretrained_models[:5]
```
@@ -98,7 +85,7 @@ There are a total of **271** models with pretrained weights currently available

It is also possible to search for model architectures using Wildcard as below:

```python
```
all_densenet_models = timm.list_models('*densenet*')
all_densenet_models
```
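For intuition, wildcard search of this kind is plain glob-style name filtering. A minimal sketch of the idea (with a made-up registry list; not timm's real model registry or its internal code):

```python
import fnmatch

# Hypothetical stand-in for timm's internal list of registered model names.
_registry = ['densenet121', 'densenet161', 'resnet34', 'resnet50']

def search_models(pattern):
    """Return registry names matching a glob-style wildcard pattern."""
    return sorted(m for m in _registry if fnmatch.fnmatch(m, pattern))

search_models('*densenet*')  # -> ['densenet121', 'densenet161']
```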
29 changes: 12 additions & 17 deletions docs/SGDR.html

Large diffs are not rendered by default.
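The SGDR doc diff is not rendered. For context, a single cosine-annealing cycle of the kind `CosineLRScheduler` implements can be sketched as follows (a simplified illustration, not timm's implementation; `lr_max`, `lr_min`, and `t_total` are placeholder names):

```python
import math

def cosine_lr(t, t_total, lr_max=0.1, lr_min=0.001):
    """Single cosine-annealing cycle: lr_max at t=0, decaying to lr_min at t=t_total."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / t_total))

# The SGDR variant restarts this cycle periodically, optionally lengthening each cycle.
```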

11 changes: 5 additions & 6 deletions docs/mixup_cutmix.html
@@ -46,9 +46,7 @@ <h2 id="Training-Neural-Networks-with-Mixup/Cutmix-Augmentations">Training Neura
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The various training arguments that are of interest when applying <code>Mixup</code>/<code>CutMix</code> data augmentations are:</p>

<pre><code>markdown
--mixup MIXUP mixup alpha, mixup enabled if &gt; 0. (default: 0.)
<div class="highlight"><pre><span></span>--mixup MIXUP mixup alpha, mixup enabled if &gt; 0. (default: 0.)
--cutmix CUTMIX cutmix alpha, cutmix enabled if &gt; 0. (default: 0.)
--cutmix-minmax CUTMIX_MINMAX [CUTMIX_MINMAX ...]
cutmix min/max ratio, overrides alpha and enables
@@ -60,9 +58,10 @@ <h2 id="Training-Neural-Networks-with-Mixup/Cutmix-Augmentations">Training Neura
Probability of switching to cutmix when both mixup and
cutmix enabled
--mixup-mode MIXUP_MODE
How to apply mixup/cutmix params. Per "batch", "pair",
or "elem"
--mixup-off-epoch N Turn off mixup after this epoch, disabled if 0. (default: 0.)</code></pre>
How to apply mixup/cutmix params. Per &quot;batch&quot;, &quot;pair&quot;,
or &quot;elem&quot;
--mixup-off-epoch N Turn off mixup after this epoch, disabled if 0. (default: 0.)
</pre></div>

</div>
</div>
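For intuition, the `--mixup` alpha listed above parameterizes a Beta distribution from which the mixing coefficient is drawn. A minimal NumPy sketch of batch-mode mixup (an illustration of the idea only, not timm's actual `Mixup` implementation; all names are made up):

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Blend each sample (and its one-hot label) with a randomly chosen partner."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # mixing coefficient in (0, 1); alpha is --mixup
    perm = rng.permutation(len(x))      # partner index for each sample in the batch
    x_mixed = lam * x + (1 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mixed, y_mixed
```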
52 changes: 22 additions & 30 deletions docs/tanh.html

Large diffs are not rendered by default.
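The tanh doc diff is likewise not rendered. The hyperbolic-tangent decay schedule that `TanhLRScheduler` implements can be sketched as follows (a simplified illustration with illustrative bound values, not timm's implementation):

```python
import math

def tanh_lr(t, t_total, lr_max=0.1, lr_min=0.0, lb=-6.0, ub=4.0):
    """Tanh decay: ~lr_max at t=0, smoothly falling to ~lr_min at t=t_total.

    lb/ub are the lower/upper bounds swept through tanh; the values here
    are illustrative, not timm's defaults.
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (1 - math.tanh(lb + (ub - lb) * t / t_total))
```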

16 changes: 7 additions & 9 deletions docs/training.html
@@ -220,7 +220,7 @@ <h3 id="Augmix">Augmix<a class="anchor-link" href="#Augmix"> </a></h3>
<div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">train</span><span class="o">.</span><span class="n">py</span> <span class="o">./</span><span class="n">imagenette2</span><span class="o">-</span><span class="mi">320</span> <span class="o">--</span><span class="n">aug</span><span class="o">-</span><span class="n">splits</span> <span class="mi">3</span> <span class="o">--</span><span class="n">jsd</span>
</pre></div>
<p><code>timm</code> also supports augmix with <code>RandAugment</code> and <code>AutoAugment</code> like so:</p>
<div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">train</span><span class="o">.</span><span class="n">py</span> <span class="o">./</span><span class="n">imagenette2</span><span class="o">-</span><span class="mi">320</span> <span class="o">--</span><span class="n">aug</span><span class="o">-</span><span class="n">splits</span> <span class="mi">3</span> <span class="o">--</span><span class="n">jsd</span> <span class="o">--</span><span class="n">aa</span> <span class="n">rand</span><span class="o">-</span><span class="n">m9</span><span class="o">-</span><span class="n">mstd0</span><span class="o">.</span><span class="mi">5</span><span class="o">-</span><span class="n">inc1</span>
<div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">train</span><span class="o">.</span><span class="n">py</span> <span class="o">./</span><span class="n">imagenette2</span><span class="o">-</span><span class="mi">320</span> <span class="o">--</span><span class="n">aug</span><span class="o">-</span><span class="n">splits</span> <span class="mi">3</span> <span class="o">--</span><span class="n">jsd</span> <span class="o">--</span><span class="n">aa</span> <span class="n">rand</span><span class="o">-</span><span class="n">m9</span><span class="o">-</span><span class="n">mstd0</span><span class="mf">.5</span><span class="o">-</span><span class="n">inc1</span>
</pre></div>

</div>
@@ -266,19 +266,18 @@ <h3 id="Auxiliary-Batch-Norm/-SplitBatchNorm">Auxiliary Batch Norm/ <code>SplitB
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>From the paper,</p>
<div class="highlight"><pre><span></span>Batch normalization serves as an essential component for many state-of-the-art computer vision models. Specifically, BN normalizes input features by the mean and variance computed within each mini-batch. <span class="gs">**One intrinsic assumption of utilizing BN is that the input features should come from a single or similar distributions.**</span> This normalization behavior could be problematic if the mini-batch contains data from different distributions, therefore resulting in inaccurate statistics estimation.

<pre><code>markdown
Batch normalization serves as an essential component for many state-of-the-art computer vision models. Specifically, BN normalizes input features by the mean and variance computed within each mini-batch. **One intrinsic assumption of utilizing BN is that the input features should come from a single or similar distributions.** This normalization behavior could be problematic if the mini-batch contains data from different distributions, therefore resulting in inaccurate statistics estimation.

To disentangle this mixture distribution into two simpler ones respectively for the clean and adversarial images, we hereby propose an auxiliary BN to guarantee its normalization statistics are exclusively performed on the adversarial examples.</code></pre>
To disentangle this mixture distribution into two simpler ones respectively for the clean and adversarial images, we hereby propose an auxiliary BN to guarantee its normalization statistics are exclusively performed on the adversarial examples.
</pre></div>

</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>To enable split batch norm,</p>
<div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">train</span><span class="o">.</span><span class="n">py</span> <span class="o">./</span><span class="n">imagenette2</span><span class="o">-</span><span class="mi">320</span> <span class="o">--</span><span class="n">aug</span><span class="o">-</span><span class="n">splits</span> <span class="mi">3</span> <span class="o">--</span><span class="n">aa</span> <span class="n">rand</span><span class="o">-</span><span class="n">m9</span><span class="o">-</span><span class="n">mstd0</span><span class="o">.</span><span class="mi">5</span><span class="o">-</span><span class="n">inc1</span> <span class="o">--</span><span class="n">split</span><span class="o">-</span><span class="n">bn</span>
<div class="highlight"><pre><span></span><span class="n">python</span> <span class="n">train</span><span class="o">.</span><span class="n">py</span> <span class="o">./</span><span class="n">imagenette2</span><span class="o">-</span><span class="mi">320</span> <span class="o">--</span><span class="n">aug</span><span class="o">-</span><span class="n">splits</span> <span class="mi">3</span> <span class="o">--</span><span class="n">aa</span> <span class="n">rand</span><span class="o">-</span><span class="n">m9</span><span class="o">-</span><span class="n">mstd0</span><span class="mf">.5</span><span class="o">-</span><span class="n">inc1</span> <span class="o">--</span><span class="n">split</span><span class="o">-</span><span class="n">bn</span>
</pre></div>

</div>
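For intuition, the split batch norm idea described above amounts to normalizing each augmentation split of the batch with its own statistics rather than statistics pooled across distributions. A minimal NumPy sketch (an illustration of the idea, not timm's `SplitBatchNorm2d`):

```python
import numpy as np

def split_batch_norm(x, num_splits=2, eps=1e-5):
    """Normalize each batch split with its own per-feature mean and variance."""
    splits = np.array_split(x, num_splits, axis=0)
    out = []
    for s in splits:
        mean = s.mean(axis=0, keepdims=True)
        var = s.var(axis=0, keepdims=True)
        out.append((s - mean) / np.sqrt(var + eps))
    return np.concatenate(out, axis=0)
```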
@@ -300,9 +299,8 @@ <h3 id="Synchronized-Batch-Norm">Synchronized Batch Norm<a class="anchor-link" h
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Synchronized batch norm is only used when training on multiple GPUs. From <a href="https://paperswithcode.com/method/syncbn">papers with code</a>:</p>

<pre><code>markdown
Synchronized Batch Normalization (SyncBN) is a type of batch normalization used for multi-GPU training. Standard batch normalization only normalizes the data within each device (GPU). SyncBN normalizes the input within the whole mini-batch.</code></pre>
<div class="highlight"><pre><span></span>Synchronized Batch Normalization (SyncBN) is a type of batch normalization used for multi-GPU training. Standard batch normalization only normalizes the data within each device (GPU). SyncBN normalizes the input within the whole mini-batch.
</pre></div>

</div>
</div>
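The distinction can be made concrete numerically: standard BN normalizes with per-device statistics, while SyncBN uses statistics of the whole mini-batch. A NumPy illustration (not an implementation of PyTorch's `SyncBatchNorm`):

```python
import numpy as np

def per_device_norm(device_batches, eps=1e-5):
    """Standard BN: each device normalizes with its own mean/variance."""
    return [(b - b.mean(0)) / np.sqrt(b.var(0) + eps) for b in device_batches]

def sync_norm(device_batches, eps=1e-5):
    """SyncBN: statistics come from the whole (concatenated) mini-batch."""
    full = np.concatenate(device_batches, axis=0)
    mean, var = full.mean(0), full.var(0)
    return [(b - mean) / np.sqrt(var + eps) for b in device_batches]
```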
30 changes: 16 additions & 14 deletions nbs/07b_SGDR.ipynb

Large diffs are not rendered by default.

64 changes: 32 additions & 32 deletions nbs/07d_tanh.ipynb

Large diffs are not rendered by default.