Face Detection | DSFD
1 THREE PROBLEMS
Feature learning: Feature Pyramid Network (FPN). Pros: aggregates hierarchical feature maps between high-level and low-level output layers. Cons: does not consider the current layer's information, and the context relationship between anchors is ignored.
Loss design: a regression loss for the face region plus a classification loss for identifying whether a face is detected. Class imbalance problem: Focal Loss / Hierarchical Loss. Cons: these losses do not consider the progressive learning ability of feature maps across different levels and shots.
Anchor matching: Pre-set anchors for each feature map are generated by regularly tiling a collection of boxes with different scales and aspect ratios on the image. Some works [27, 39] analyze a series of reasonable anchor scales and an anchor compensation strategy to increase positive anchors. However, such strategies ignore random sampling in data augmentation, so the imbalance between positive and negative anchors remains.
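To make the tiling step concrete, here is a minimal sketch of generating pre-set anchors on a regular grid. The function name tile_anchors, the (cx, cy, w, h) layout, and the example sizes are illustrative assumptions, not values taken from the paper.

```python
import itertools


def tile_anchors(image_size, stride, scales, aspect_ratios):
    """Return (cx, cy, w, h) anchors centred on every cell of a square grid.

    Hypothetical helper: one anchor per (cell, scale, aspect ratio) combination.
    """
    cells = range(image_size // stride)
    anchors = []
    for row, col in itertools.product(cells, cells):
        # Centre of the grid cell in image coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in itertools.product(scales, aspect_ratios):
            # ratio is height / width; the anchor area stays scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx, cy, w, h))
    return anchors


# Illustrative values only: a 64x64 image, stride 16, two scales, square anchors.
anchors = tile_anchors(image_size=64, stride=16, scales=[16, 32], aspect_ratios=[1.0])
```

With these example values the grid has 4 x 4 cells and two anchors per cell.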
2 THREE CONTRIBUTIONS
Feature Enhance Module (FEM): enhances the original feature maps to extend the single shot detector to a dual shot detector.
Feature Pyramid Network (FPN) in PyramidBox + Receptive Field Block (RFB) in RFBNet
Progressive Anchor Loss (PAL): computed from two different sets of anchors to effectively facilitate the features.
Uses smaller anchor sizes in the first shot and larger sizes in the second shot.
Improved Anchor Matching (IAM): integrates an anchor partition strategy and anchor-based data augmentation to better match anchors and ground-truth faces, and thus provides better initialization for the regressor.
Adaptively chooses different anchor sizes in different stages to facilitate the features.
3 DUAL SHOT FACE DETECTOR
3.1 Pipeline of DSFD
3.2 Feature Enhance Module
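FEM enhances the original features to make them more discriminable and robust. To enhance the original neuron cell oc_(i,j,l), FEM uses information from different dimensions: the upper-layer original neuron cell oc_(i,j,l+1) (the same position in the layer above) and the current-layer non-local neuron cells (its neighbors).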
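The two information sources can be sketched on a single-channel feature map as follows. This is only an illustration under stated assumptions: nearest-neighbor 2x upsampling, an element-wise product for the fusion, dilation rates (1, 2, 3), and a plain 3x3 mean standing in for the learned dilated convolutions. All function names are hypothetical.

```python
def upsample2x(feature):
    """Nearest-neighbor 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in feature:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def dilated_mean3x3(feature, rate):
    """3x3 neighborhood mean with the given dilation and zero padding.

    Assumption: stands in for a learned dilated convolution.
    """
    height, width = len(feature), len(feature[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-rate, 0, rate):
                for dx in (-rate, 0, rate):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += feature[yy][xx]
            out[y][x] = total / 9.0
    return out


def enhance(current, upper, rates=(1, 2, 3)):
    """Fuse the current layer with the upsampled upper layer, then gather
    neighbors at several dilation rates; the branches are returned as a list
    (the analogue of channel concatenation)."""
    upsampled = upsample2x(upper)
    fused = [[c * u for c, u in zip(c_row, u_row)]
             for c_row, u_row in zip(current, upsampled)]
    return [dilated_mean3x3(fused, rate) for rate in rates]


current = [[1.0] * 8 for _ in range(8)]   # current-layer map, 8x8
upper = [[2.0] * 4 for _ in range(4)]     # upper-layer map, half the resolution
enhanced = enhance(current, upper)
```

Each returned branch has the same spatial size as the current layer, so the enhanced maps can feed a second detection head at the same resolutions as the first.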
3.3 Progressive Anchor Loss
Second Shot Loss
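The first term indicates whether the anchor is a face: softmax loss over two classes
The second term measures whether the location is correct: the smooth L1 loss between the parameterizations of the predicted box t_i and the ground-truth box g_i, using the anchor a_i
First Shot Loss (responsible for small faces)
Total Loss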
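A minimal sketch of how the two shot losses and their combination could be computed from the terms described above. The balancing weights beta and lam, the normalization by the number of positive anchors, and the sample layout are assumptions made for illustration, not details confirmed by these notes.

```python
import math


def softmax_cross_entropy(logits, label):
    """Softmax loss over two classes (background = 0, face = 1)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]


def smooth_l1(pred, target):
    """Smooth L1 distance between two box parameterizations."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        total += 0.5 * diff * diff if diff < 1.0 else diff - 0.5
    return total


def shot_loss(samples, beta=1.0):
    """Loss of one shot. samples holds (logits, label, box_pred, box_target)
    per anchor; the localization term only counts positive anchors."""
    n_pos = max(sum(label for _, label, _, _ in samples), 1)
    conf = sum(softmax_cross_entropy(lg, lb) for lg, lb, _, _ in samples)
    loc = sum(lb * smooth_l1(bp, bt) for _, lb, bp, bt in samples)
    return (conf + beta * loc) / n_pos


def progressive_anchor_loss(first_shot, second_shot, lam=1.0):
    """Total loss: first shot (smaller anchors) plus weighted second shot."""
    return shot_loss(first_shot) + lam * shot_loss(second_shot)


# Toy data: one positive and one negative anchor with uninformative logits.
samples = [
    ([0.0, 0.0], 1, [0.0] * 4, [0.5, 0.0, 0.0, 2.0]),
    ([0.0, 0.0], 0, [0.0] * 4, [1.0, 1.0, 1.0, 1.0]),
]
total = progressive_anchor_loss(samples, samples)
```

In a real detector the two shots would receive different predictions and different anchor sets; the toy call reuses one sample list only to keep the example short.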
3.4 Improved Anchor Matching
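The goal is to resolve the conflict between discrete anchor scales and continuous face scales. With 40% probability, faces are augmented by S_input * S_face / S_anchor, which increases the number of positive anchors and stabilizes training.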
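One way to read the expression S_input * S_face / S_anchor is as the side length of a crop around a chosen face: resizing that crop back to S_input makes the face land exactly on the anchor scale. The sketch below follows that reading. The candidate anchor scales and all function names are assumptions for illustration.

```python
import random

# Assumed candidate anchor scales; not stated in these notes.
ANCHOR_SCALES = (16, 32, 64, 128, 256, 512)


def crop_side(input_size, face_size, anchor_size):
    """Side of the crop whose resize to input_size maps the face to anchor_size."""
    return input_size * face_size / anchor_size


def choose_augmentation(rng, anchor_prob=0.4):
    """Pick anchor-based sampling with 40% probability, SSD-style otherwise."""
    return "anchor_based" if rng.random() < anchor_prob else "ssd_style"


def sample_crop(rng, input_size, face_size):
    """Draw a target anchor scale and return it with the matching crop side."""
    anchor_size = rng.choice(ANCHOR_SCALES)
    return anchor_size, crop_side(input_size, face_size, anchor_size)


rng = random.Random(0)
anchor_size, side = sample_crop(rng, input_size=640, face_size=40)
# After resizing the crop to 640, the face size becomes face_size * 640 / side.
resized_face = 40 * 640 / side
```

For example, a 40-pixel face matched to a 16-pixel anchor with a 640-pixel input gives a 1600-pixel crop, and resizing by 640 / 1600 shrinks the face to 16 pixels.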
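Anchor design: each feature map cell is associated with a fixed-shape anchor.
Question: why does the anchor size get larger as the feature map size gets smaller?
With probability 2/5: anchor-based sampling (as in PyramidBox)
With probability 3/5: SSD data augmentation
IoU threshold for anchor matching: 0.4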
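The IoU-based assignment can be sketched as below. The (x1, y1, x2, y2) box layout, the function names, and the use of >= at the threshold are assumptions; only the 0.4 value comes from the notes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, faces, threshold=0.4):
    """Label an anchor 1 (positive) if its best IoU with any face reaches
    the threshold, else 0 (negative)."""
    labels = []
    for anchor in anchors:
        best = max((iou(anchor, face) for face in faces), default=0.0)
        labels.append(1 if best >= threshold else 0)
    return labels


anchors = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
faces = [(0, 0, 10, 10)]
labels = match_anchors(anchors, faces)
```

Here the second anchor overlaps the face with IoU 1/3, which falls below 0.4, so only the first anchor is positive.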
4 EXPERIMENTS
4.1 Implementation Details
Backbone networks: pretrained VGG/ResNet. The newly added conv layers are initialized with the Xavier method.
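For reference, a small sketch of Xavier initialization for a conv layer. It assumes the uniform variant and counts fan-in and fan-out as channels times kernel area; the notes do not say which variant was used, and the function name and layer shape are hypothetical.

```python
import math
import random


def xavier_uniform_conv(out_channels, in_channels, kernel_size, rng):
    """Return (weights, bound) for a conv layer, with weights drawn from
    U(-bound, bound) and bound = sqrt(6 / (fan_in + fan_out))."""
    fan_in = in_channels * kernel_size * kernel_size
    fan_out = out_channels * kernel_size * kernel_size
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    count = out_channels * in_channels * kernel_size * kernel_size
    # Flat list of weights; a framework would reshape this to
    # (out_channels, in_channels, kernel_size, kernel_size).
    weights = [rng.uniform(-bound, bound) for _ in range(count)]
    return weights, bound


# Illustrative layer shape only.
weights, bound = xavier_uniform_conv(16, 8, 3, random.Random(0))
```

The bound shrinks as the layer gets wider, which keeps the variance of activations roughly constant from layer to layer at the start of training.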
4.2 Analysis on DSFD
1) Feature enhancement is crucial. We use a more robust and discriminative feature enhance module to improve the feature representation ability, especially for hard faces.
2) Auxiliary loss based on progressive anchor is used to train all 12 different scale detection feature maps, and it improves the performance on easy, medium and hard faces simultaneously.
3) Our improved anchor matching provides better initial anchors and ground-truth faces for regressing anchors to faces, which yields improvements of 0.3%, 0.1%, and 0.3% on the three settings, respectively. Additionally, when we enlarge the training batch size (i.e., LargeBS), the result on the hard setting reaches 91.2% AP.