On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering

December 19, 2023

The objective in deep multi-view clustering (MVC) is to discover unknown groups in data originating from multiple views or multiple modalities, using methods from deep learning.

Many recent methods for deep MVC use self-supervised learning (SSL) to learn representations that are better suited for the clustering task. However, we find that current methods for deep MVC vary greatly in their motivation, justification, and implementation – specifically for the SSL-based components. These variations make it difficult to evaluate and compare methods, and to identify promising directions for future advancements in the field.

To this end, we propose DeepMVC – a unified framework which includes many recent methods as instances. Our framework provides a consistent implementation for current and new methods, and allows accurate and rigorous comparisons between methods and their components. We also provide key insights on drawbacks of contrastive alignment, which is a popular SSL component in deep MVC. Finally, we develop several new DeepMVC instances that perform well compared to current state-of-the-art methods.


March 6, 2023

Daniel J. Trosten, Sigurd Løkse, Robert Jenssen, Michael Kampffmeyer.

Paper abstract

Self-supervised learning is a central component in recent approaches to deep multi-view clustering (MVC). However, we find large variations in the development of self-supervision-based methods for deep MVC, potentially slowing the progress of the field. To address this, we present DeepMVC, a unified framework for deep MVC that includes many recent methods as instances. We leverage our framework to make key observations about the effect of self-supervision, and in particular, drawbacks of aligning representations with contrastive learning. Further, we prove that contrastive alignment can negatively influence cluster separability, and that this effect becomes worse when the number of views increases. Motivated by our findings, we develop several new DeepMVC instances with new forms of self-supervision. We conduct extensive experiments and find that (i) in line with our theoretical findings, contrastive alignment decreases performance on datasets with many views; (ii) all methods benefit from some form of self-supervision; and (iii) our new instances outperform previous methods on several datasets. Based on our results, we suggest several promising directions for future research. To enhance the openness of the field, we provide an open-source implementation of DeepMVC, including recent models and our new instances. Our implementation includes a consistent evaluation protocol, facilitating fair and accurate evaluation of methods and components.
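
For readers unfamiliar with the term, contrastive alignment can be illustrated with a generic InfoNCE-style loss that pulls together the representations of the same sample in different views and pushes apart those of different samples. The sketch below is an illustration only and is not taken from the DeepMVC implementation: the function names, the temperature value, the one-directional form of the loss, and the averaging over view pairs are all assumptions made for this example.

```python
from itertools import combinations

import numpy as np


def contrastive_alignment_loss(z1, z2, temperature=0.1):
    """Generic InfoNCE-style loss between two views (illustrative only).

    z1, z2: arrays of shape (n_samples, dim). Row i of z1 and row i of z2
    are assumed to be representations of the same sample (a positive pair);
    all other rows of z2 act as negatives for row i of z1.
    """
    # L2-normalize so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    # Pairwise similarities between all samples across the two views.
    logits = (z1 @ z2.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


def multi_view_alignment_loss(views, temperature=0.1):
    """Average the two-view loss over all unordered pairs of views.

    Averaging over pairs is an assumption of this sketch; other ways of
    combining views are possible.
    """
    pairs = list(combinations(range(len(views)), 2))
    losses = [
        contrastive_alignment_loss(views[a], views[b], temperature)
        for a, b in pairs
    ]
    return float(np.mean(losses))
```

Calling `contrastive_alignment_loss(z, z)` on identical views gives a lower value than calling it on views whose rows have been shuffled relative to each other, which is the sense in which minimizing such a loss "aligns" the views. Note that the number of view pairs grows quadratically with the number of views, so the alignment term touches more and more pairs as views are added.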