January 21, 2026
October 12, 2025
Teresa Dorszewski, Lenka Tětková, Robert Jenssen, Lars Kai Hansen, Kristoffer Knutsen Wickstrøm
Vision Transformers (ViTs) are increasingly utilized in various computer vision tasks due to their powerful representation capabilities. However, it remains understudied how ViTs process information layer by layer. Numerous studies have shown that convolutional neural networks (CNNs) extract features of increasing complexity throughout their layers, which is crucial for tasks like domain adaptation and transfer learning. ViTs, lacking the same inductive biases as CNNs, can potentially learn global dependencies from the first layers due to their attention mechanisms. Given the increasing importance of ViTs in computer vision, there is a need to improve the layer-wise understanding of ViTs. In this work, we present a novel, layer-wise analysis of concepts encoded in state-of-the-art ViTs using neuron labeling. Our findings reveal that ViTs encode concepts with increasing complexity throughout the network. Early layers primarily encode basic features such as colors and textures, while later layers represent more specific classes, including objects and animals. As the complexity of encoded concepts increases, the number of concepts represented in each layer also rises, reflecting a more diverse and specific set of features. Additionally, different pretraining strategies influence the quantity and category of encoded concepts, with finetuning to specific downstream tasks generally reducing the number of encoded concepts and shifting the concepts to more relevant categories.
From Colors to Classes: Emergence of Concepts in Vision Transformers
Communications in Computer and Information Science, vol 2576. Springer 2025



