In this post I will explain how to add a simple rotation invariance mechanism to the BRIEF descriptor, present evaluation results showing that the rotation invariant BRIEF significantly outperforms regular BRIEF where rotation changes are present, and finally share a C++ implementation integrated into OpenCV 3.
Just as a reminder, we had a general post on local image descriptors, an introductory post to binary descriptors and a post presenting the BRIEF descriptor. We also had posts on other binary descriptors: ORB, BRISK and FREAK.
We’ll start with a visual example, displaying the correct matches between a pair of images of the same scene, taken from different angles – once with the original version of BRIEF (first image pair) and once with the proposed rotation invariant version of BRIEF (second image pair):
It can be seen that there are many more correct matches when using the proposed rotation invariant version of the BRIEF descriptor.
The BRIEF descriptor
The BRIEF descriptor is one of the simplest of the binary descriptors and also the first to be published. BRIEF operates by comparing the same set of smoothed pixel pairs for each local patch that it describes. For each pair, if the first smoothed pixel’s intensity is larger than that of the second, BRIEF writes 1 in the descriptor’s bit string, and 0 otherwise. The sampling pairs are chosen randomly, initialized only once, and reused for every image and local patch. As usual, the distance between two binary descriptors is the number of bits in which they differ, which can be written formally as sum(XOR(descriptor1, descriptor2)).
Adding rotation invariance
Our method for adding rotation invariance is straightforward and uses the detector coupled with the descriptor. Many keypoint detectors can estimate the patch’s orientation (e.g. SIFT and SURF) and we can make use of that estimate to properly align the sampling pairs. For each patch, given the angle of the patch, we can rotate the sampling pairs according to the patch’s orientation and thus extract rotation invariant descriptors. The same principle is applied in the original implementation of the other rotation invariant binary descriptors (ORB, BRISK and FREAK), but as opposed to them we just take the orientation of the patch from the keypoint detector instead of devising some orientation measurement mechanism.
Now for the fun part – comparing rotation invariant BRIEF with BRIEF’s original version. I’ll also compare against SIFT to see how binary descriptors compete with some of the floating point descriptors.
For the evaluation, I’ll use the Mikolajczyk benchmark, a publicly available and standard benchmark for evaluating local descriptors. The benchmark consists of 8 image sets, each containing 6 images that depict an increasing degree of a certain image transformation. Each set depicts a different transformation:
- Bark – zoom + rotation changes.
- Bikes – blur changes.
- Boat – zoom + rotation changes.
- Graffiti – view point changes.
- Leuven – illumination changes.
- Trees – blur changes.
- UBC – JPEG compression changes.
- Wall – view point changes.
Below are the images of each set in the benchmark. In each set, the images are ordered from left to right and top to bottom (the first row contains images 1-3, the second row contains images 4-6).
Bark (zoom and rotation changes):
Bikes (blur): you can notice that image 6 is far more blurred than image 1.
Boat (zoom and rotation changes):
Graffiti (view point changes):
Leuven (illumination changes):
UBC (JPEG compression):
Wall (viewpoint changes):
The protocol for the benchmark is the following: in each set, we detect keypoints and extract descriptors from each of the images, compare the first image to each of the remaining five images and check for correspondences. The benchmark includes known ground truth transformations (homographies) between the images, so we can compute the percentage of correct matches and display the performance of each descriptor using recall vs. 1-precision curves.
I used the public OpenCV implementation for these experiments. SIFT is used as the keypoint detector, and I used the 512-bit versions of BRIEF and rotation invariant BRIEF.
Below are tables summarizing the area under the recall vs. 1-precision curve for each of the sets, averaged over the five image pairs – higher values mean the descriptor performs better. For clarity, I also specify the type of image transformation introduced by each set.
| Descriptor | Bark (zoom + rotation) | Bikes (blur) | Boat (zoom + rotation) | Graffiti (view point changes) |
|---|---|---|---|---|
| Rotation Invariant BRIEF | 0.055 | 0.353 | 0.05 | 0.103 |
| Descriptor | Leuven (illumination) | Trees (blur) | UBC (JPEG compression) | Wall (view point changes) |
|---|---|---|---|---|
| Rotation Invariant BRIEF | 0.228 | 0.061 | 0.178 | 0.146 |
Notice that for sets that depict orientation changes (Bark and Boat), the rotation invariant version of BRIEF performs much better than the original (non-invariant) version. However, in sets that depict photometric changes (blur, illumination and JPEG compression) and no orientation changes, the original version of BRIEF performs better than the rotation invariant one. It seems that when orientation changes are not present, trying to compensate for them introduces noise and reduces performance. Notice also that since the Graffiti set introduces some orientation changes (as can be seen from the images above), the rotation invariant version of BRIEF has an advantage over the original version. One can also see that although the Wall set exhibits view point changes, the images in the set have largely the same orientation, so the rotation invariant version of BRIEF performs worse than the original one. On a side note, it is also very interesting to see that in some of the sets, BRIEF and rotation invariant BRIEF even outperform the SIFT descriptor (keep in mind that BRIEF is much faster to extract and match and also takes much less storage space).
To further illustrate the difference in performance between the original and the rotation invariant version of BRIEF, below are recall vs. 1-precision curves for the sets Bikes, Graffiti and Boat, respectively.
Notice again that BRIEF outperforms its rotation invariant version in the Bikes set, which depicts photometric changes (specifically, blur), while the rotation invariant version of BRIEF outperforms the original version in the Graffiti and Boat sets, which depict rotation changes.
Adding Rotation Invariant BRIEF to OpenCV
I’m in the process of contributing an implementation of the rotation invariant version of BRIEF to OpenCV. I’ve forked the OpenCV 3.0 GitHub repository and implemented the changes in my fork.
The code has been further cleaned and is now available under the following pull request: https://github.com/Itseez/opencv_contrib/pull/207
I have presented a rotation invariant version of BRIEF that uses the detector’s estimate of the keypoint orientation to align the sampling pairs of the BRIEF descriptor, thus making it rotation invariant. I’ve demonstrated the advantage of the rotation invariant version of BRIEF in scenarios where orientation changes are present, and also its disadvantage in dealing with photometric changes (blur, illumination and JPEG compression). Finally, I’ve published a C++ implementation of the proposed descriptor, integrated into OpenCV 3.
Calonder, Michael, et al. “BRIEF: Binary robust independent elementary features.” Computer Vision – ECCV 2010. Springer Berlin Heidelberg, 2010. 778-792.
Rublee, Ethan, et al. “ORB: An efficient alternative to SIFT or SURF.” 2011 IEEE International Conference on Computer Vision (ICCV). IEEE, 2011.
Leutenegger, Stefan, Margarita Chli, and Roland Yves Siegwart. “BRISK: Binary robust invariant scalable keypoints.” 2011 IEEE International Conference on Computer Vision (ICCV). IEEE, 2011.
Alahi, Alexandre, Raphael Ortiz, and Pierre Vandergheynst. “FREAK: Fast retina keypoint.” 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2012.
Lowe, David G. “Distinctive image features from scale-invariant keypoints.” International Journal of Computer Vision 60.2 (2004): 91-110.
Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. “SURF: Speeded up robust features.” Computer Vision – ECCV 2006. Springer Berlin Heidelberg, 2006. 404-417.
Rosten, Edward, and Tom Drummond. “Machine learning for high-speed corner detection.” Computer Vision – ECCV 2006. Springer Berlin Heidelberg, 2006. 430-443.
Mikolajczyk, Krystian, and Cordelia Schmid. “A performance evaluation of local descriptors.” IEEE Transactions on Pattern Analysis and Machine Intelligence 27.10 (2005): 1615-1630.