Abstract
Image matching is a fundamental problem in Computer Vision. In the context of feature-based matching, SIFT and its variants have long excelled in a wide array of applications. However, for ultra-wide baselines, as in the case of aerial images captured under large camera rotations, the appearance variation goes beyond the reach of SIFT and RANSAC. In this paper we propose a data-driven, deep learning-based approach that sidesteps local correspondence by framing the problem as a classification task. Furthermore, we demonstrate that local correspondences can still be useful. To do so we incorporate an attention mechanism to produce a set of probable matches, which allows us to further increase performance. We train our models on a dataset of urban aerial imagery consisting of 'same' and 'different' pairs, collected for this purpose, and characterize the problem via a human study with annotations from Amazon Mechanical Turk. We demonstrate that our models outperform the state-of-the-art on ultra-wide baseline matching and approach human accuracy.
| Original language | English |
|---|---|
| Journal | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
| Pages (from-to) | 3539-3547 |
| Number of pages | 9 |
| ISSN | 1063-6919 |
| DOI | |
| Status | Published - 9 Dec 2016 |
| Published externally | Yes |
| Event | 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, USA. Duration: 26 Jun 2016 → 1 Jul 2016 |
Conference
| Conference | 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 |
|---|---|
| Country/Territory | USA |
| City | Las Vegas |
| Period | 26/06/2016 → 01/07/2016 |
Bibliographic note
Publisher Copyright: © 2016 IEEE.