Fast RCNN

February 07, 2020

The idea is straightforward.
1) Instead of passing each region through the convolutional layers one by one, we pass the entire image through them once and produce a single feature map.

2) Then we take the region proposals as before (from an external method such as selective search) and project them onto the feature map.

3) Now we have the regions on the feature map instead of on the original image, and we can forward them through a few fully connected layers to output the classification decision and the bounding box correction.
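To make the projection step a bit more concrete, here is a minimal sketch of mapping a proposal box from image pixels to feature-map cells. The function name `project_roi`, the stride of 16 (what a VGG16-style backbone would give), and the floor/ceil rounding are my own assumptions for illustration, not necessarily what the original implementation does.

```python
import math

def project_roi(box, stride=16):
    """Map an (x1, y1, x2, y2) box in image pixels to feature-map cells.

    `stride` is the total downsampling factor of the conv layers
    (assumed to be 16 here). The returned end coordinates are exclusive.
    """
    x1, y1, x2, y2 = box
    # Floor the start and ceil the end so the projected region
    # always covers at least one feature-map cell (assumed rounding rule).
    fx1 = math.floor(x1 / stride)
    fy1 = math.floor(y1 / stride)
    fx2 = max(math.ceil(x2 / stride), fx1 + 1)
    fy2 = max(math.ceil(y2 / stride), fy1 + 1)
    return fx1, fy1, fx2, fy2

# A proposal given in image pixels, projected onto the feature map.
print(project_roi((32, 48, 160, 200)))
```

The point is only that the proposals themselves are still computed on the image; we just rescale their coordinates so they index into the shared feature map.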

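Note that the projection of the region proposals is implemented using a special layer (the ROI pooling layer), which is essentially a type of max-pooling whose pool size depends on the size of the input region, so that the output always has the same size. The fully connected layers that follow need that fixed size, since proposals come in many different shapes.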
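Here is a rough NumPy sketch of what such a layer does for a single region. The name `roi_max_pool`, the 2x2 output size, and the way the bins are split are assumptions I made for illustration; real implementations differ in the rounding details and work on batches.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of a (C, H, W) feature map to a fixed size.

    `roi` is (x1, y1, x2, y2) in feature-map cells, end exclusive,
    and must cover at least one cell in each direction.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[:, y1:y2, x1:x2]
    channels, h, w = region.shape
    out_h, out_w = output_size
    pooled = np.empty((channels, out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        # Bin edges depend on the region height: floor for start, ceil for end.
        ys = (i * h) // out_h
        ye = -(-((i + 1) * h) // out_h)
        for j in range(out_w):
            xs = (j * w) // out_w
            xe = -(-((j + 1) * w) // out_w)
            # Take the max over the bin, separately for every channel.
            pooled[:, i, j] = region[:, ys:ye, xs:xe].max(axis=(1, 2))
    return pooled

fm = np.arange(2 * 6 * 8, dtype=np.float32).reshape(2, 6, 8)
small = roi_max_pool(fm, (1, 1, 6, 5))   # a 4x5 region
large = roi_max_pool(fm, (0, 0, 8, 6))   # the whole 6x8 map
print(small.shape, large.shape)
```

Both regions come out with the same shape even though they have different sizes, which is exactly what lets us feed them into the same fully connected layers.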


Sophie's Daily Note
