Eirik Østmo / Torger Grytå

VI seminar #28 - Self-supervised Vision Transformers for Land-cover Segmentation and Classification

Presenters: Linus Scheibenreif and Joëlle Hanna, University of St. Gallen (Switzerland)

Transformer models have recently approached or even surpassed the performance of ConvNets on computer vision tasks such as classification and segmentation when given large-scale supervised pre-training. In this work, we bridge the gap between ConvNets and Transformers for Earth observation through self-supervised pre-training on large-scale unlabeled remote sensing data. The resulting representations can be used for both land-cover classification and segmentation, where they significantly outperform fully supervised baselines while requiring only a fraction of the labeled training data.

