Implementation of YOLOv3 using PyTorch

YOLOv3 with CIoU loss 💡

In this version, convergence is improved by using CIoU loss for bounding box regression. The main part of the code has been rewritten and optimized, and the image resolution for training can now be selected.

You can now monitor the training process with a separate graph for each loss.

Release v2.0 is recommended for all training runs ✅
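
For reference, the CIoU loss mentioned above can be sketched for a single pair of boxes in plain Python. This is a minimal illustration of the published CIoU formula (IoU minus a centre-distance penalty and an aspect-ratio penalty), not the code used in this repository. The function names ciou and ciou_loss and the (x1, y1, x2, y2) box layout are assumptions made for the example.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, true_box):
    """Regression loss: close to 0 for a perfect match, above 1 for distant boxes."""
    return 1.0 - ciou(pred_box, true_box)


loss_same = ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))
loss_shifted = ciou_loss((0, 0, 10, 10), (5, 0, 15, 10))
loss_disjoint = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike plain IoU, the loss keeps growing as disjoint boxes move further apart, which is the usual argument for why it helps convergence.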

Dataset

The model in this repository was trained on the African Wildlife Dataset from Kaggle.
The data folder must contain the train and test folders as follows:

├── data 
  ├── train
    ├── class1
      ├── 001.jpg
      ├── 001.txt(xml)
  ├── test 
    ├── class1
      ├── 001.jpg
      ├── 001.txt(xml)
  ├── obj.data
  ├── obj.names

For training, the data folder must also contain an obj.data file with the following settings:

classes = 4
train = data/train
valid = data/test
names = data/obj.names
backup = backup/
file_format = txt
convert_to_yolo = False

It must also contain an obj.names file with the label names:

0 buffalo
1 elephant
2 rhino
3 zebra
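
As an illustration of how these two files fit together, here is a minimal sketch that parses them from strings. The helper names parse_obj_data and parse_obj_names are hypothetical and the repository may read the files differently. The sketch only assumes the "key = value" and "<index> <name>" layouts shown above.

```python
def parse_obj_data(text):
    """Parse 'key = value' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def parse_obj_names(text):
    """Parse '<index> <name>' lines into a dict {index: name}."""
    names = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            names[int(parts[0])] = parts[1].strip()
    return names


OBJ_DATA = """\
classes = 4
train = data/train
valid = data/test
names = data/obj.names
backup = backup/
file_format = txt
convert_to_yolo = False
"""

OBJ_NAMES = """\
0 buffalo
1 elephant
2 rhino
3 zebra
"""

settings = parse_obj_data(OBJ_DATA)
names = parse_obj_names(OBJ_NAMES)

num_classes = int(settings["classes"])
convert_to_yolo = settings["convert_to_yolo"] == "True"

# The number of labels should agree with the 'classes' setting
labels_match = len(names) == num_classes
```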

In the annotation files, bounding boxes can be written as simple corner coordinates (x1, y1, x2, y2); in that case, use the appropriate flag when training. The YOLO format is also supported and is recommended. Format the files as follows:

txt

<class> <xmin> <ymin> <xmax> <ymax>

example:

1 207 214 367 487

txt (already converted to yolo)

<class> <xcenter> <ycenter> <width> <height>

example:

1 0.2 0.3 0.15 0.23

xml

example:

<annotation>
<object>
	<name>zebra</name>
	<bndbox>
		<xmin>71</xmin>
		<ymin>60</ymin>
		<xmax>175</xmax>
		<ymax>164</ymax>
	</bndbox>
</object>
</annotation>
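
The relation between the corner format and the YOLO format above can be sketched as follows. This is an illustration only: the helpers corners_to_yolo and boxes_from_xml are hypothetical names, the standard library xml.etree.ElementTree is used here instead of whatever the repository uses internally, and the 416x416 image size is an assumption for the example (in practice it comes from the image itself).

```python
import xml.etree.ElementTree as ET


def corners_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to normalized (xcenter, ycenter, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def boxes_from_xml(xml_text):
    """Return a list of (name, xmin, ymin, xmax, ymax) from an annotation string."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.find("name").text, *coords))
    return boxes


ANNOTATION = """
<annotation>
<object>
    <name>zebra</name>
    <bndbox>
        <xmin>71</xmin>
        <ymin>60</ymin>
        <xmax>175</xmax>
        <ymax>164</ymax>
    </bndbox>
</object>
</annotation>
"""

boxes = boxes_from_xml(ANNOTATION)
name, xmin, ymin, xmax, ymax = boxes[0]

# Assumed image size of 416x416 pixels, for illustration only
yolo_box = corners_to_yolo(xmin, ymin, xmax, ymax, img_w=416, img_h=416)
```

All four YOLO values are fractions of the image size, so they stay valid when the image is resized for training.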

Clustering

Before training, you should extract clusters of anchor boxes based on k-means clustering:

python3 clusterAnalysis.py --path ./data/obj.data --n_cluster 9 --gen 100

All clustering parameters:

--path (states: path to obj.data)

--n_cluster (states: number of clusters)

--gen (states: number of generations)

Note: Copy the resulting anchor boxes (example output below) into the anchors variable in train.py and detect.py.

average IOU: 0.983
anchor boxes:
 [[ 76.65064664 107.10037741]
 [168.95037498 172.72056937]
 [134.74968954 247.5132047 ]
 [219.54969625 240.28065663]
 [167.14672785 327.9973526 ]
 [261.83260957 280.90816227]
 [333.50469046 293.58168127]
 [285.5608964  381.0593046 ]
 [384.12271692 380.82027335]]
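
To show what the clustering step computes, here is a minimal sketch of k-means over box sizes with 1 - IoU as the distance. It is not the code of clusterAnalysis.py: the function names, the toy box sizes and the plain iteration cap are assumptions, and the cap does not necessarily correspond to the --gen parameter.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    return inter / (box[0] * box[1] + centroid[0] * centroid[1] - inter)


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs; the closest centroid is the one with the highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        # Assign every box to the centroid it overlaps most
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        new_centroids = []
        for centroid, cluster in zip(centroids, clusters):
            if cluster:
                w = sum(b[0] for b in cluster) / len(cluster)
                h = sum(b[1] for b in cluster) / len(cluster)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroid)
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    # Sort by area so the anchors go from smallest to largest
    return sorted(centroids, key=lambda c: c[0] * c[1])


def average_iou(boxes, centroids):
    """Mean IoU between each box and its best-matching anchor."""
    return sum(max(wh_iou(b, c) for c in centroids) for b in boxes) / len(boxes)


# Toy (width, height) pairs in pixels, made up for the example
boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58),
         (48, 62), (100, 40), (98, 42), (102, 38)]
anchors = kmeans_anchors(boxes, k=3)
avg_iou = average_iou(boxes, anchors)
```

The average IoU plays the same role as the "average IOU" line in the output above: the closer it is to 1, the better the anchors fit the box shapes in the dataset.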

Training

Then start training:

python3 train.py --epochs 100 --lr 0.003

All training parameters:

--image_size (states: input image size)

--epochs (states: total epochs)

--lr (states: learning rate)

--batch_size (states: batch size)

--weight_decay (states: weight decay)

--weights (states: path to yolo PyTorch weights or path to Darknet53 binary weights if you want to train your model from scratch)

--multiscale_off (states: disable multi-scale training)

--verbose (states: show all losses and resolution changes)
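
As background for the --multiscale_off flag: multi-scale training generally means that the input resolution is changed at random during training. The sketch below shows one common way to pick such a resolution. It is an assumption about the general technique, not a description of train.py; the function name, the stride of 32 and the spread of three steps are all hypothetical choices.

```python
import random


def pick_training_size(base_size=416, stride=32, spread=3, rng=random):
    """Pick a square input size near base_size that is a multiple of stride."""
    base = base_size // stride * stride  # round down to a multiple of the stride
    low = base - spread * stride
    high = base + spread * stride
    return rng.randrange(low, high + 1, stride)


random.seed(0)
sizes = [pick_training_size() for _ in range(50)]
```

With the default arguments every size is a multiple of 32 between 320 and 512, so the three YOLOv3 output grids always have integer dimensions.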

After training, mAP will be calculated on the train dataloader and the test dataloader.

Note: You can change the thresholds in train.py.

Inference

On video:

python3 detect.py --image_size 416 --video --data_test content/video.mp4 --output content/detect.mp4 --weights backup/yolov3.pt

On image:

python3 detect.py --image_size 416 --data_test content/image.jpg --output content/detect.jpg --weights backup/yolov3.pt

Additional parameters:

--show (states: show frames during inference)

Note: You can change the thresholds in detect.py.
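
To make the role of these thresholds concrete, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It is not the code from detect.py: the function names and the (class_id, confidence, x1, y1, x2, y2) detection layout are assumptions made for the example.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, threshold=0.3, nms_threshold=0.5):
    """Filter detections given as (class_id, confidence, x1, y1, x2, y2)."""
    # 1. Drop low-confidence predictions and sort the rest, most confident first
    candidates = sorted(
        (d for d in detections if d[1] >= threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    # 2. Keep a box only if it does not overlap an already kept box of the same class
    kept = []
    for det in candidates:
        if all(det[0] != prev[0] or iou(det[2:], prev[2:]) < nms_threshold for prev in kept):
            kept.append(det)
    return kept


detections = [
    (0, 0.9, 10, 10, 100, 100),    # best box for class 0
    (0, 0.8, 12, 12, 102, 102),    # near-duplicate of the first box
    (1, 0.7, 10, 10, 100, 100),    # same place, different class
    (0, 0.2, 200, 200, 250, 250),  # below the confidence threshold
]
kept = non_max_suppression(detections)
```

Raising the confidence threshold removes weak detections, while lowering the NMS threshold removes more overlapping boxes.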

Comparison

Model	Dataset	Input size (pixels)	mAP (@0.5)
YOLOv1 (Ours⭐)	African Wildlife	448	61
YOLOv2 (Ours⭐)	African Wildlife	416	72
YOLOv3 (Ours⭐)	African Wildlife	416	77
YOLOv3 CIoU loss (Ours⭐)	African Wildlife	416	82

Test API ⚙

import torch
from torch.utils.data import DataLoader
from dataloader.dataset import Dataset
from model.yolo import YOLOv3
from utils.utils import get_bound_boxes
from utils.mAP import mean_average_precision

# Device and class count were not defined in the original snippet;
# the values below are assumptions (classes matches 'classes' in data/obj.data)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
classes = 4

anchors = [[[313, 303], [336, 323], [306, 371]],
           [[139, 222], [149, 212], [171, 205]],
           [[23, 34], [99, 234], [129, 224]]]

# Creating custom dataloaders for validation
# You can see an example of creation in 'train.py'
# ...

# Creating model and loading pretrained weights
model = YOLOv3(anchors=anchors, num_classes=classes).to(device)
model.load_state_dict(torch.load('path to weights', map_location=device))  # Here you have to write the path to weights

# Getting prediction boxes and true boxes
# nms_threshold - threshold for non-maximum suppression
# threshold - confidence threshold for model predictions
pred_boxes, true_boxes = get_bound_boxes(dataloader, model, anchors, image_size=416, nms_threshold=0.5, threshold=0.3, device=device)

# As a result, we need to calculate mAP
mAP = mean_average_precision(pred_boxes, true_boxes, classes=classes, iou_threshold=0.5)
print(mAP)
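
For readers unfamiliar with the metric, the sketch below shows how average precision for a single class can be computed at a given IoU threshold. mAP is then the mean of this value over all classes. This is an illustration, not the implementation in utils/mAP.py: the function names and the tuple layouts are assumptions, and the matching rule (each prediction claims its best still-unmatched ground-truth box) is one of several variants in use.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(pred_boxes, true_boxes, iou_threshold=0.5):
    """
    pred_boxes: list of (image_id, confidence, x1, y1, x2, y2) for one class
    true_boxes: list of (image_id, x1, y1, x2, y2) for the same class
    """
    if not true_boxes:
        return 0.0
    matched = set()  # indices of ground-truth boxes already claimed
    hits = []        # 1 = true positive, 0 = false positive, by descending confidence
    for pred in sorted(pred_boxes, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(true_boxes):
            if truth[0] != pred[0] or idx in matched:
                continue
            overlap = box_iou(pred[2:], truth[1:])
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each prediction
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(true_boxes))

    # Area under the precision-recall curve, using the precision envelope
    ap, prev_recall = 0.0, 0.0
    for i, recall in enumerate(recalls):
        ap += (recall - prev_recall) * max(precisions[i:])
        prev_recall = recall
    return ap


preds = [("img1", 0.9, 10, 10, 50, 50), ("img1", 0.6, 200, 200, 240, 240)]
truths = [("img1", 10, 10, 50, 50), ("img2", 30, 30, 80, 80)]
ap = average_precision(preds, truths)
```

In this toy case one of the two ground-truth boxes is found and the second prediction is a false positive, so recall never exceeds one half.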

Dependencies

PyTorch

Version: 1.13.1

Albumentations

Version: 1.3.0

OpenCV

Version: 4.7.0

NumPy

Version: 1.23.0

xmltodict

Version: 0.13.0

terminaltables

Version: 3.1.10

References

Original YOLOv3 paper

Darknet53 full weights from ImageNet (recommended for all trainings)
Darknet53 convolutional weights from ImageNet

Contact

Developer: Alexey Serzhantov
Email: [email protected]
