In the last few posts we introduced the concept of local image descriptors and specifically binary image descriptors. We surveyed notable examples of binary descriptors, namely BRIEF, ORB, BRISK and FREAK. Here, we will both introduce a novel binary descriptor that we have developed and give a full evaluation of several binary and floating point descriptors. We will show that our proposed descriptor – the LATCH descriptor – outperforms the alternatives with similar running times. We will also demonstrate its performance in the real-world application of 3D reconstruction from multiple images.
Given an image patch centered around a keypoint, LATCH compares the intensities of three pixel patches in order to produce a single bit in the final binary string representing the patch. Example triplets are drawn over the patch in green and blue.
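To make the triplet comparison concrete, here is a minimal sketch in plain Python. It is not the OpenCV implementation: the function names (`patch`, `ssd`, `triplet_bit`, `describe`), the patch size and the bit convention (1 when the anchor patch is closer to the second companion than to the first, measured by the sum of squared differences) are assumptions made for illustration, and the learned arrangement of triplets is replaced by a list you pass in by hand.

```python
def patch(image, cy, cx, half):
    """Return the pixel values of the square patch centred at (cy, cx)."""
    return [image[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]


def ssd(p, q):
    """Sum of squared differences between two equally sized patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def triplet_bit(image, anchor, companion1, companion2, half=1):
    """One descriptor bit from three patch centres given as (row, col).

    Assumed convention: the bit is 1 when the anchor patch is more similar
    to companion2 than to companion1, and 0 otherwise.
    """
    a = patch(image, *anchor, half)
    c1 = patch(image, *companion1, half)
    c2 = patch(image, *companion2, half)
    return 1 if ssd(a, c1) > ssd(a, c2) else 0


def describe(image, triplets, half=1):
    """Concatenate one bit per triplet into a binary string (list of 0/1)."""
    return [triplet_bit(image, a, c1, c2, half) for a, c1, c2 in triplets]
```

Each triplet contributes exactly one bit, so the descriptor length equals the number of triplets. All patch centres must lie at least `half` pixels away from the image border.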
Our proposed LATCH descriptor was presented in the following paper:
Gil Levi and Tal Hassner, LATCH: Learned Arrangements of Three Patch Codes, IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, March, 2016
Here is a short video of me presenting LATCH at the WACV 16 conference (I apologize for the technical problems in the video).
Our LATCH descriptor has already been officially integrated into OpenCV 3.0 and has even won the CVPR 2015 OpenCV State of the Art Vision Challenge in the Image Registration category!
Also, see the CUDA (GPU) implementation of the LATCH descriptor and a cool visual odometry demo, both by Christopher Parker.
For more information, please see the LATCH project page.
In this post I will explain how to add a simple rotation invariance mechanism to the BRIEF descriptor, present evaluation results showing that the rotation invariant BRIEF significantly outperforms regular BRIEF where geometric changes between the images are present, and finally post a C++ implementation integrated into OpenCV 3.
Just as a reminder, we had a general post on local image descriptors, an introductory post to binary descriptors and a post presenting the BRIEF descriptor. We also had posts on other binary descriptors: ORB, BRISK and FREAK.
We’ll start with a visual example, displaying the correct matches between a pair of images of the same scene, taken from different angles – once with the original version of BRIEF (first image pair) and once with the proposed rotation invariant version of BRIEF (second image pair):
Correct matches when using the BRIEF descriptor
Correct matches when using the rotation invariant BRIEF descriptor
It can be seen that there are many more correct matches when using the proposed rotation invariant version of the BRIEF descriptor.
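This excerpt does not spell out the mechanism itself, so the following is only a sketch of one common way to make a BRIEF-style descriptor rotation invariant: rotate the sampling pairs by the keypoint's orientation before comparing intensities. The names `rotate_point` and `steered_brief`, the rounding to the nearest pixel and the "first pixel darker than the second gives 1" convention are all assumptions for illustration, not the C++ code from the post.

```python
import math


def rotate_point(dx, dy, angle_deg):
    """Rotate an offset (dx, dy) about the patch centre, rounded to a pixel."""
    t = math.radians(angle_deg)
    rx = dx * math.cos(t) - dy * math.sin(t)
    ry = dx * math.sin(t) + dy * math.cos(t)
    return round(rx), round(ry)


def steered_brief(image, cx, cy, pairs, angle_deg):
    """BRIEF-style bits with the sampling pattern rotated by angle_deg.

    image is a list of rows, (cx, cy) is the keypoint, and pairs holds
    ((x1, y1), (x2, y2)) offsets relative to the keypoint.
    """
    bits = []
    for (x1, y1), (x2, y2) in pairs:
        ax, ay = rotate_point(x1, y1, angle_deg)
        bx, by = rotate_point(x2, y2, angle_deg)
        first = image[cy + ay][cx + ax]
        second = image[cy + by][cx + bx]
        bits.append(1 if first < second else 0)
    return bits
```

With `angle_deg=0` this reduces to plain BRIEF-style sampling. When the angle follows the patch orientation, the same pixels are compared no matter how the patch is rotated, which is why the descriptor stays stable under in-plane rotation. The rotated offsets must still fall inside the image.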
We’ll start with the following figure, which shows an example of using BRISK to match real-world images with a viewpoint change. Green lines are valid matches, red circles are detected keypoints.
BRISK descriptor – example of matching points using BRISK
We’ll start with the following figure, which shows an example of using ORB to match real-world images with a viewpoint change. Green lines are valid matches, red circles indicate unmatched points.
ORB descriptor – An example of keypoints matching using ORB
Following the previous posts that provided both an introduction to patch descriptors in general and specifically to binary descriptors, it’s time to talk about the individual binary descriptors in more depth. This post will talk about the BRIEF descriptor and the following post will talk about ORB, BRISK and FREAK.
Why Binary Descriptors?
Following the previous post on descriptors, we’re now familiar with histogram of gradients (HOG) based patch descriptors. SIFT, SURF and GLOH have been around for years (SIFT dates back to 1999) and have been used successfully in various applications, including image alignment, 3D reconstruction and object recognition. On the practical side, OpenCV includes implementations of SIFT and SURF, and MATLAB packages are also available (check vlfeat for SIFT and extractFeatures in the MATLAB Computer Vision Toolbox for SURF).
BRISK descriptor – sampling pairs
So, if there is no question about SIFT and SURF performance, why not use them in every application?
Since the next few posts will talk about binary descriptors, I thought it would be a good idea to post a short introduction to the subject of patch descriptors. The following post will talk about the motivation for patch descriptors and their common usage, and will highlight the Histogram of Oriented Gradients (HOG) based descriptors.
I think the best way to start is to consider one application of patch descriptors and to explain the common pipeline in their usage. Consider, for example, the application of image alignment: we would like to align two images of the same scene taken from slightly different viewpoints. One way of doing so is by applying the following steps:
Compute distinctive keypoints in both images (for example, corners).
Compare the keypoints between the two images to find matches.
Use the matches to find a general mapping between the images (for example, a homography).
Apply the mapping on the first image to align it to the second image.
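The steps above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not a real alignment routine: step 1 is assumed to be done already (keypoints and descriptors are passed in), descriptors are assumed to be binary strings stored as integers and compared with the Hamming distance, and the mapping in step 3 is a pure translation estimated from the median displacement rather than a homography. All function names are hypothetical.

```python
from statistics import median


def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match(desc1, desc2):
    """Step 2: for each descriptor of image 1, the index of its nearest
    neighbour among the descriptors of image 2."""
    return [min(range(len(desc2)), key=lambda j: hamming(d, desc2[j]))
            for d in desc1]


def estimate_translation(kps1, kps2, matches):
    """Step 3 (simplified): median displacement between matched keypoints,
    used here in place of a full homography."""
    dx = median(kps2[j][0] - kps1[i][0] for i, j in enumerate(matches))
    dy = median(kps2[j][1] - kps1[i][1] for i, j in enumerate(matches))
    return dx, dy


def apply_translation(points, shift):
    """Step 4: map points of image 1 into the frame of image 2."""
    return [(x + shift[0], y + shift[1]) for x, y in points]
```

The median makes the estimate tolerant of a few wrong matches. A real pipeline would typically fit a homography with a robust estimator and warp the whole image instead of a list of points.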
Using descriptors to compare patches