Notes On YOLO

Published: 14 Dec 2015 Category: deep_learning

How does YOLO organise training data?

ground-truth (49 = 7 x 7 grid cells):

49 x (1 + 20 + 4) =>
49 x (1 x obj_gt + 20 x classes_gt + 4 x box_gt)
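
As a minimal sketch of how this layout could be indexed, assume the truth vector is stored cell by cell in the order written above (objectness flag, then 20 class values, then 4 box values). The constant and function names below are hypothetical, chosen for illustration only.

```c
#include <assert.h>

/* Assumed sizes, taken from the 49 x (1 + 20 + 4) layout above. */
enum { LOCATIONS = 49, CLASSES = 20, COORDS = 4 };
enum { TRUTH_PER_CELL = 1 + CLASSES + COORDS };

/* Offset of the objectness flag (obj_gt) for a grid cell. */
static int truth_obj_index(int cell)
{
    assert(cell >= 0 && cell < LOCATIONS);
    return cell * TRUTH_PER_CELL;
}

/* Offset of one class value (classes_gt) for a grid cell. */
static int truth_class_index(int cell, int cls)
{
    assert(cls >= 0 && cls < CLASSES);
    return truth_obj_index(cell) + 1 + cls;
}

/* Offset of the first of the 4 box values (box_gt) for a grid cell. */
static int truth_box_index(int cell)
{
    return truth_obj_index(cell) + 1 + CLASSES;
}

/* Total number of floats in one image's truth vector. */
static int truth_size(void)
{
    return LOCATIONS * TRUTH_PER_CELL;
}
```

Under these assumptions the truth vector holds 49 x 25 = 1225 values, and the class offset `truth_obj_index(cell) + 1 + cls` has the same shape as the `truth_index+1+j` expression in the darknet snippet further down.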

predicted data (2 boxes per grid cell):

49 x 20 + 49 x (1 x 2) + 49 x (4 x 2) =>
49 x (20 x classes) + 49 x (2 x obj_confidence) + 49 x (2 x 4 x predict_box)
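
The prediction layout can be sketched the same way. This assumes the output vector is stored as three consecutive blocks in the order written above: all class scores, then all confidences, then all boxes, with 2 boxes of 4 coordinates per cell. The names are hypothetical.

```c
#include <assert.h>

/* Assumed sizes: 49 cells, 20 classes, 2 boxes per cell, 4 coords per box. */
enum { LOCATIONS = 49, CLASSES = 20, BOXES = 2, COORDS = 4 };

/* Class scores come first: 49 x 20 values. */
static int pred_class_index(int cell, int cls)
{
    assert(cell >= 0 && cell < LOCATIONS && cls >= 0 && cls < CLASSES);
    return cell * CLASSES + cls;
}

/* Confidences follow the class block: 49 x 2 values. */
static int pred_conf_index(int cell, int box)
{
    assert(cell >= 0 && cell < LOCATIONS && box >= 0 && box < BOXES);
    return LOCATIONS * CLASSES + cell * BOXES + box;
}

/* Box coordinates come last: 49 x 2 x 4 values. */
static int pred_box_index(int cell, int box)
{
    assert(cell >= 0 && cell < LOCATIONS && box >= 0 && box < BOXES);
    return LOCATIONS * (CLASSES + BOXES) + (cell * BOXES + box) * COORDS;
}

/* Total number of floats in one image's prediction vector. */
static int pred_size(void)
{
    return LOCATIONS * (CLASSES + BOXES + BOXES * COORDS);
}
```

With these sizes the three blocks hold 980 + 98 + 392 = 1470 values, which equals 7 x 7 x 30.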

Some questions:

(1) The multi-part loss function in the paper differs from the code implementation.

In forward_detection_layer() in detection_layer.c, the loss is accumulated differently:

*(l.cost) += pow(1-iou, 2);

*(l.cost) -= l.noobject_scale * pow(l.output[p_index], 2);
*(l.cost) += l.object_scale * pow(1-l.output[p_index], 2);

*(l.cost) += l.noobject_scale*pow(l.output[p_index], 2);

*(l.cost) += l.class_scale * pow(state.truth[truth_index+1+j] - l.output[class_index+j], 2);
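
One way to read the `-=` / `+=` pair on the confidence term is sketched below. It assumes the no-object penalty is first added for every predicted box in a cell, and that the `-=` / `+=` pair then runs only for the box responsible for the object. Under that assumption, the accumulated cost equals a direct per-box formula. The function names and the `best` parameter are hypothetical and are not taken from darknet.

```c
#include <assert.h>

/* Confidence cost for one cell, accumulated in the style of the snippets:
 * penalise every box as "no object", then correct the responsible box.
 * best < 0 means the cell contains no object. */
static float conf_cost_accumulated(const float *conf, int n, int best,
                                   float object_scale, float noobject_scale)
{
    float cost = 0.0f;
    assert(best < n);
    for (int j = 0; j < n; ++j) {
        cost += noobject_scale * conf[j] * conf[j];
    }
    if (best >= 0) {
        /* Undo the no-object penalty for the responsible box... */
        cost -= noobject_scale * conf[best] * conf[best];
        /* ...and replace it with the object term (target confidence 1). */
        cost += object_scale * (1.0f - conf[best]) * (1.0f - conf[best]);
    }
    return cost;
}

/* The same cost written directly, one term per box. */
static float conf_cost_direct(const float *conf, int n, int best,
                              float object_scale, float noobject_scale)
{
    float cost = 0.0f;
    assert(best < n);
    for (int j = 0; j < n; ++j) {
        if (j == best) {
            cost += object_scale * (1.0f - conf[j]) * (1.0f - conf[j]);
        } else {
            cost += noobject_scale * conf[j] * conf[j];
        }
    }
    return cost;
}
```

If this reading is right, the subtraction is only bookkeeping: it removes a term that was added earlier for the same box, so the net confidence cost is still a sum of non-negative squared errors.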