Region network proposals with Faster R-CNN

Deep Learning for Images with PyTorch

Michal Oleszak

Machine Learning Engineer

Regions and anchor boxes

Region: a smaller area of the image that could contain objects of interest, grouped by visual characteristics

region proposals

Anchor boxes: predefined bounding box templates of different sizes and shapes
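To make the idea of anchor templates concrete, here is a minimal pure-Python sketch. The helper names `make_anchor_templates` and `shift_template` are hypothetical, and the convention that an aspect ratio means height divided by width (with each template covering roughly size × size pixels) is an assumption made for illustration.

```python
import math


def make_anchor_templates(sizes, aspect_ratios):
    """Return (x1, y1, x2, y2) box templates centered at the origin.

    Assumed convention: aspect ratio = height / width, and every
    template of a given size covers about size * size pixels.
    """
    templates = []
    for size in sizes:
        for ratio in aspect_ratios:
            height = size * math.sqrt(ratio)
            width = size / math.sqrt(ratio)
            templates.append((-width / 2, -height / 2, width / 2, height / 2))
    return templates


def shift_template(template, cx, cy):
    """Place a template so that its center sits at pixel (cx, cy)."""
    x1, y1, x2, y2 = template
    return (x1 + cx, y1 + cy, x2 + cx, y2 + cy)


templates = make_anchor_templates(sizes=(32, 64, 128), aspect_ratios=(0.5, 1.0, 2.0))
print(len(templates))  # 9 templates: 3 sizes x 3 aspect ratios
print(templates[1])    # (-16.0, -16.0, 16.0, 16.0), the 32 x 32 square
print(shift_template(templates[1], 100, 50))  # (84.0, 34.0, 116.0, 66.0)
```

The same set of templates is reused at every location of the feature map, which is why a handful of sizes and shapes can cover objects anywhere in the image.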

Faster R-CNN model

Faster R-CNN: an advanced version of R-CNN

rcnn layers

Convolution layers (backbone): feature maps
Region proposal network (RPN): bounding box proposals
Classifier and regressor to produce predictions

¹ Edward Raff. 2022. Inside Deep Learning.

Region proposal network (RPN)

region proposal network architecture

Anchor generator:
- Generate a set of anchor boxes of different sizes and aspect ratios
Classifier and regressor:
- Predict if the box contains an object and provide coordinates
Region of interest (RoI) pooling:
- Resize the features of each RPN proposal to a fixed size for the fully connected layers
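The RoI pooling step can be sketched in plain Python. This is a simplified max-pooling variant written for illustration, not the bilinear RoI Align operation that torchvision provides; the function name `roi_max_pool` and the toy feature map are hypothetical.

```python
def roi_max_pool(feature_map, box, output_size):
    """Max-pool one region of a 2D feature map into an
    output_size x output_size grid.

    feature_map: list of rows, each a list of numbers
    box: (x1, y1, x2, y2) in feature-map cells, x2 and y2 exclusive
    """
    x1, y1, x2, y2 = box
    height, width = y2 - y1, x2 - x1
    pooled = []
    for i in range(output_size):
        # Row range of bin i; the max(...) keeps every bin non-empty
        r0 = y1 + (i * height) // output_size
        r1 = max(y1 + ((i + 1) * height) // output_size, r0 + 1)
        row = []
        for j in range(output_size):
            c0 = x1 + (j * width) // output_size
            c1 = max(x1 + ((j + 1) * width) // output_size, c0 + 1)
            row.append(
                max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
        pooled.append(row)
    return pooled


# Toy 6 x 6 feature map whose value at (row, col) is row * 6 + col
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]

wide = roi_max_pool(feature_map, box=(0, 0, 6, 4), output_size=2)
tall = roi_max_pool(feature_map, box=(1, 1, 4, 6), output_size=2)
print(wide)  # [[8, 11], [20, 23]]
print(tall)  # [[13, 15], [31, 33]]
```

Both proposals have different shapes, yet each yields a 2 × 2 output, which is what lets the fully connected layers accept any proposal.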

RPN in PyTorch

from torchvision.models.detection.rpn import AnchorGenerator


anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128),),
    aspect_ratios=((0.5, 1.0, 2.0),),
)

from torchvision.ops import MultiScaleRoIAlign


roi_pooler = MultiScaleRoIAlign(
    featmap_names=["0"],
    output_size=7,
    sampling_ratio=2,
)

Faster R-CNN loss functions

RPN classification loss:
- region contains object or not
- binary cross-entropy
- rpn_cls_criterion = nn.BCEWithLogitsLoss()

RPN box regression loss:
- bounding box coordinates
- mean squared error
- rpn_reg_criterion = nn.MSELoss()

R-CNN classification loss:
- multiple object classes
- cross-entropy
- rcnn_cls_criterion = nn.CrossEntropyLoss()

R-CNN box regression loss:
- bounding box coordinates
- mean squared error
- rcnn_reg_criterion = nn.MSELoss()
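The four criteria above can be combined as in the following sketch. The tensors are random placeholders standing in for real predictions and targets, and summing the four terms with equal weight is an assumption made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rpn_cls_criterion = nn.BCEWithLogitsLoss()
rpn_reg_criterion = nn.MSELoss()
rcnn_cls_criterion = nn.CrossEntropyLoss()
rcnn_reg_criterion = nn.MSELoss()

# Hypothetical mini-batch: 8 anchors scored by the RPN,
# 4 proposals scored by the R-CNN head, 3 object classes
num_anchors, num_proposals, num_classes = 8, 4, 3

rpn_logits = torch.randn(num_anchors)                      # raw objectness scores
rpn_labels = torch.randint(0, 2, (num_anchors,)).float()   # 1 = object, 0 = no object
rpn_boxes_pred = torch.randn(num_anchors, 4)
rpn_boxes_true = torch.randn(num_anchors, 4)

rcnn_logits = torch.randn(num_proposals, num_classes)      # one score per class
rcnn_labels = torch.randint(0, num_classes, (num_proposals,))
rcnn_boxes_pred = torch.randn(num_proposals, 4)
rcnn_boxes_true = torch.randn(num_proposals, 4)

# Assumed: an unweighted sum of the four terms
total_loss = (
    rpn_cls_criterion(rpn_logits, rpn_labels)
    + rpn_reg_criterion(rpn_boxes_pred, rpn_boxes_true)
    + rcnn_cls_criterion(rcnn_logits, rcnn_labels)
    + rcnn_reg_criterion(rcnn_boxes_pred, rcnn_boxes_true)
)
print(total_loss.shape)  # torch.Size([]), a single scalar to backpropagate
```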

Faster R-CNN in PyTorch

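import torchvision
from torchvision.models.detection import FasterRCNN

# Use MobileNetV2's convolutional feature extractor as the backbone
backbone = torchvision.models.mobilenet_v2(weights="DEFAULT").features
# FasterRCNN reads the backbone's number of output channels from this attribute
backbone.out_channels = 1280

model = FasterRCNN(
    backbone=backbone,
    num_classes=num_classes,
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=roi_pooler,
)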

Load pre-trained Faster R-CNN

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

Define number of classes and classifier input size

num_classes = 2

in_features = model.roi_heads.box_predictor.cls_score.in_features

Replace the model's classifier with one that has the desired number of classes

model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

Let's practice!