Blog

Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings

March 6, 2023

In deep learning, labelling data is time-consuming and expensive. Designing data representations that yield satisfactory performance on classification (and other downstream) tasks without requiring large amounts of labelled data is therefore a crucial area of research. Few-shot learning (FSL) aims to find representations that enable classification when only a small number of labelled samples is available.

In our work, we approach the representation learning task by tackling the hubness problem: in high-dimensional spaces, data points tend to cluster around certain exemplar points called hubs, which adversely affects the performance of nearest-neighbour-based FSL approaches.
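To make the phenomenon concrete, hubness can be measured as the skewness of the k-occurrence distribution, i.e. how often each point appears in other points' k-nearest-neighbour lists. The snippet below is an illustrative sketch on synthetic Gaussian data (not code from our paper); the function name and parameters are our own:

```python
import numpy as np

rng = np.random.default_rng(0)

def k_occurrence_skewness(X, k=10):
    """Skewness of the k-occurrence distribution N_k: how often each
    point appears in other points' k-nearest-neighbour lists.
    Large positive skew indicates hubness."""
    # Squared pairwise Euclidean distances via the dot-product identity.
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbours
    # k nearest neighbours of each point.
    nn = np.argsort(d2, axis=1)[:, :k]
    # N_k(i) = number of k-NN lists in which point i appears.
    n_k = np.bincount(nn.ravel(), minlength=len(X))
    m, s = n_k.mean(), n_k.std()
    return ((n_k - m) ** 3).mean() / s ** 3  # sample skewness

low = rng.normal(size=(500, 3))     # low-dimensional data
high = rng.normal(size=(500, 300))  # high-dimensional data
print(k_occurrence_skewness(low), k_occurrence_skewness(high))
```

On data like this, the skewness is markedly larger in the high-dimensional case: a few points become hubs that dominate many nearest-neighbour lists.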

Our approach embeds data representations uniformly on the hypersphere while preserving local similarities between the input and embedding spaces. We show that it both reduces hubness and improves the performance of state-of-the-art classifiers used in transductive few-shot learning. The resulting embedding techniques, noHub and the related noHub-S, are classifier-agnostic and can therefore be combined with most off-the-shelf FSL classifiers in use today. Finally, the use of the hypersphere as a support for data representations points to promising directions for future research.
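The precise objectives of our methods are given in the paper; the sketch below only illustrates the two competing quantities being traded off, using generic stand-ins: the Gaussian-potential uniformity measure of Wang and Isola (2020) and a simple neighbour-overlap score for local similarity preservation. All function names here are our own:

```python
import numpy as np

rng = np.random.default_rng(1)

def to_sphere(Z):
    """Project embeddings onto the unit hypersphere."""
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

def uniformity(Z, t=2.0):
    """Log of the average pairwise Gaussian potential
    (lower = more uniformly spread on the sphere)."""
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    mask = ~np.eye(len(Z), dtype=bool)
    return np.log(np.exp(-t * d2[mask]).mean())

def local_similarity(Z, X, k=5):
    """Fraction of each point's k input-space neighbours that are also
    among its k embedding-space neighbours (higher = better preserved)."""
    def knn(A):
        sq = (A ** 2).sum(axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
        np.fill_diagonal(d2, np.inf)
        return np.argsort(d2, axis=1)[:, :k]
    nn_x, nn_z = knn(X), knn(Z)
    return np.mean([len(set(a) & set(b)) / k for a, b in zip(nn_x, nn_z)])

X = rng.normal(size=(200, 64))  # stand-in feature representations
Z = to_sphere(X)                # naive spherical embedding
print(uniformity(Z), local_similarity(Z, X))
```

An embedding method in this spirit would optimise a weighted combination of the two terms: pushing points apart on the sphere to reduce hubness, while keeping each point's input-space neighbours nearby in the embedding to retain class structure.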

Publication

Daniel J. Trosten*, Rwiddhi Chakraborty*, Sigurd Løkse, Kristoffer Knutsen Wickstrøm, Robert Jenssen, Michael Kampffmeyer (* indicates equal contribution)

Paper abstract

Distance-based classification is frequently used in transductive few-shot learning (FSL). However, due to the high dimensionality of image representations, FSL classifiers are prone to suffer from the hubness problem, where a few points (hubs) occur frequently in multiple nearest-neighbour lists of other points. Hubness negatively impacts distance-based classification when hubs from one class appear often among the nearest neighbours of points from another class, degrading the classifier's performance. To address the hubness problem in FSL, we first prove that hubness can be eliminated by distributing representations uniformly on the hypersphere. We then propose two new approaches to embed representations on the hypersphere, which we prove optimize a tradeoff between uniformity and local similarity preservation -- reducing hubness while retaining class structure. Our experiments show that the proposed methods reduce hubness, and significantly improve transductive FSL accuracy for a wide range of classifiers.