Deep Learning for Images with PyTorch
Michal Oleszak
Machine Learning Engineer
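import torch

# Boxes in [x1, y1, x2, y2] format
bbox1 = [50, 50, 150, 150]
bbox2 = [100, 100, 200, 200]

# box_iou expects tensors of shape [N, 4], so add a leading dimension
bbox1 = torch.tensor(bbox1).unsqueeze(0)
bbox2 = torch.tensor(bbox2).unsqueeze(0)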
from torchvision.ops import box_iou
iou = box_iou(bbox1, bbox2)
print(iou)
tensor([[0.1429]])
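To see where 0.1429 comes from, the calculation can be reproduced by hand: the two boxes overlap in a 50 x 50 square (area 2500), and their union is 10000 + 10000 - 2500 = 17500. The following is a minimal sketch in plain Python, assuming boxes in [x1, y1, x2, y2] format; the helper name iou_xyxy is hypothetical and is not part of torchvision.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


print(round(iou_xyxy([50, 50, 150, 150], [100, 100, 200, 200]), 4))  # 0.1429
```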
model.eval()
with torch.no_grad():
    output = model(input_image)
print(output)
[{'boxes': tensor([[ 42.8553, 271.9481, 180.6003, 346.7082],
[191.6016, 80.4759, 247.8009, 387.5475], ....),
'scores': tensor([1.0000, 1.0000, 0.9998, ... ]),
'labels': tensor([18, 1, 20, 18, 18, 18 ...])
}]
boxes = output[0]["boxes"]
scores = output[0]["scores"]
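Non-max suppression (NMS): a common technique for selecting the most relevant bounding boxes
Non-max: discarding boxes with a lower confidence score than the best box for the same object
Suppression: discarding boxes whose IoU with a higher-scoring box is high, since they most likely mark the same object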
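The selection procedure can be sketched as a greedy loop: visit boxes from highest to lowest score, and keep a box only if it does not overlap too much with any box already kept. This is an illustrative plain-Python version, not torchvision's actual implementation; the names iou_xyxy and greedy_nms and the toy boxes are made up for this example.

```python
def iou_xyxy(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, ordered by decreasing score."""
    # Visit candidates from the most to the least confident
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Suppress box i if it overlaps a kept (higher-scoring) box too much
        if all(iou_xyxy(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy data: boxes 0 and 1 overlap heavily, box 2 is far away
boxes = [[0, 0, 100, 100], [10, 10, 110, 110], [200, 200, 300, 300]]
scores = [0.8, 0.9, 0.7]

print(greedy_nms(boxes, scores, iou_threshold=0.5))  # [1, 2]
```

Box 0 is dropped because its IoU with the higher-scoring box 1 is about 0.68, which is above the 0.5 threshold.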
from torchvision.ops import nms
box_indices = nms(
    boxes=boxes,
    scores=scores,
    iou_threshold=0.5,
)
print(box_indices)
tensor([ 0, 1, 2, 8])
filtered_boxes = boxes[box_indices]
boxes: tensor of bounding box coordinates in (x1, y1, x2, y2) format, of shape [N, 4]
scores: tensor with the confidence score for each box, of shape [N]
iou_threshold: a value between 0.0 and 1.0; a box is discarded if its IoU with a higher-scoring box exceeds it
Output: indices of the boxes that are kept, sorted by decreasing score