When I initialize Faster R-CNN in the deployment phase, the number of samples per image (config parameter TEST.RPN_POST_NMS_TOP_N) is set to 300; that is the number of predicted bounding boxes to keep after non-max suppression. However, the network is initialized with that dimension set to 1:
    ('rpn/output', (1, 512, 14, 14))
    ('rpn/output_rpn_relu/3x3_0_split_0', (1, 512, 14, 14))
    ('rpn/output_rpn_relu/3x3_0_split_1', (1, 512, 14, 14))
    ('rpn_cls_score', (1, 18, 14, 14))
    ('rpn_bbox_pred', (1, 36, 14, 14))
    ('rpn_cls_score_reshape', (1, 2, 126, 14))
    ('rpn_cls_prob', (1, 2, 126, 14))
    ('rpn_cls_prob_reshape', (1, 18, 14, 14))
    ('rois', (1, 5))
    ('pool5', (1, 512, 7, 7))
    ('fc6', (1, 4096))
    ('fc7', (1, 4096))
    ('fc7_relu7_0_split_0', (1, 4096))
    ('fc7_relu7_0_split_1', (1, 4096))
    ('cls_score', (1, 21))
    ('bbox_pred', (1, 84))
    ('cls_prob', (1, 21))

The ones I'm particularly interested in are fc6/7, bbox_pred and cls_prob. After net.forward(**kwargs) is run, the first dimension of these layers changes to 300: (300, 4096), (300, 84) and (300, 21), matching the number of RoIs. The rois output is reshaped in the TargetLayer class, but the rest are a bit of a problem:
Caffe doesn't seem to implement this out of the box, so there should be some wrapper for it, but I can't find one. Any suggestions on where to look? I want to implement something similar for my own algorithm.
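To be concrete about what I mean by "something similar", here is a toy sketch of the behaviour I am after. It is not Caffe code: `ToyBlob` and `ToyScaleLayer` are hypothetical names, and the assumption is simply that a layer re-derives its output shape from its input every time forward is called.

```python
class ToyBlob:
    """Hypothetical stand-in for a blob: a shape plus flat data."""

    def __init__(self, shape):
        self.reshape(shape)

    def reshape(self, shape):
        self.shape = tuple(shape)
        count = 1
        for dim in self.shape:
            count *= dim
        self.data = [0.0] * count


class ToyScaleLayer:
    """Hypothetical layer that refreshes its output shape on every forward."""

    def __init__(self, factor):
        self.factor = factor

    def reshape(self, bottom, top):
        # The output shape simply follows the input shape.
        top.reshape(bottom.shape)

    def forward(self, bottom, top):
        self.reshape(bottom, top)  # shape is refreshed before computing
        top.data = [self.factor * x for x in bottom.data]


bottom, top = ToyBlob((1, 5)), ToyBlob((1, 5))
layer = ToyScaleLayer(2.0)

layer.forward(bottom, top)
shape_at_init = top.shape      # first dimension is 1, as at initialization

bottom.reshape((300, 5))       # e.g. 300 RoIs arrive at run time
layer.forward(bottom, top)
shape_after = top.shape        # first dimension now follows the input

print(shape_at_init, shape_after)
```

With this pattern, the shapes printed at initialization are only placeholders; the real first dimension is whatever the upstream blob has at forward time.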
It is a bit confusing because all four of these layers (fc6/7, bbox_pred, cls_prob) are just fully connected layers defined in the config, nothing fancy.
PS: Also, I don't think (100, 4096) means 100 times more weights; that would certainly be infeasible for a layer of this size, so the weights must be shared. But how?
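My working assumption for the PS, as a minimal pure-Python sketch: the first dimension is a batch axis, and a fully connected layer applies the same weight matrix to every row, so the weight count does not depend on the number of RoIs. The helper `inner_product` and the toy sizes `in_dim` and `out_dim` are made up for illustration (they stand in for 512 * 7 * 7 = 25088 inputs and 4096 outputs); this is not Caffe's implementation.

```python
import random


def inner_product(batch, weights, bias):
    """Apply y = W x + b to every row of `batch` with the same W and b."""
    return [
        [sum(w * x for w, x in zip(w_row, row)) + b
         for w_row, b in zip(weights, bias)]
        for row in batch
    ]


def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


random.seed(0)
in_dim, out_dim = 8, 4  # toy stand-ins for 25088 and 4096

# One weight matrix of shape (out_dim, in_dim), created once.
weights = [[random.uniform(-1, 1) for _ in range(in_dim)]
           for _ in range(out_dim)]
bias = [0.0] * out_dim

one_roi = [[random.random() for _ in range(in_dim)]]
many_rois = [[random.random() for _ in range(in_dim)] for _ in range(300)]

out_one = inner_product(one_roi, weights, bias)
out_many = inner_product(many_rois, weights, bias)

print(shape(weights))   # unchanged by the batch size
print(shape(out_one))   # one output row per input row
print(shape(out_many))
```

If that assumption holds, only the activations grow with the number of RoIs; each RoI's output row is the same computation it would get if it were passed through the layer alone.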