VI seminar #45 – Data-informed distributions for sampling neural network weights

Presenter: Erik Bolager, PhD Candidate at Technical University of Munich

Abstract: Random feature methods construct the internal weights of a neural network by sampling them at random, typically from a Gaussian distribution. Thus, even in supervised learning problems, these methods do not use the information in the available data. Other approaches with more informed distributions, for example Bayesian neural networks, require substantial computational power. In this talk, we present a method to construct the weights and biases of the hidden layers strictly from the space X × X, where X is the domain of the underlying function, and then present a probability distribution over X × X that also uses information about the function we approximate. By sampling the weights and biases of the hidden layers in this way and then solving a linear system to obtain the parameters of the last layer, we can construct accurate neural networks in a gradient-free way. The construction is possible for both shallow and deep feedforward neural networks. We will present several theoretical results, including that even though we limit the space of weights and biases in the hidden layers, we do not limit the space of functions we can approximate, under mild assumptions on X. We also consider the converse, providing an example of input spaces that violate these assumptions and discussing why the networks then fail to approximate certain functions. We also show empirical results from applying the framework to different tasks, including transfer learning on images. We end the talk by discussing the framework's potential use for interpretability and in the field of visual intelligence.
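To make the construction in the abstract concrete, here is a minimal NumPy sketch for a single hidden layer. The abstract only says that weights and biases are built from pairs in X × X, that the sampling distribution uses information about the target function, and that the last layer comes from a linear solve. Everything more specific below is an illustrative assumption rather than the presenter's stated method: the weight formula w = scale · (x2 − x1) / ‖x2 − x1‖², the bias that centers the activation on the midpoint of the pair, the sampling density proportional to ‖f(x2) − f(x1)‖ / ‖x2 − x1‖, the tanh activation, and the hypothetical function names and parameters (`sample_hidden_layer`, `fit_sampled_network`, `predict`, `scale`, `n_candidates`).

```python
import numpy as np


def sample_hidden_layer(X, y, n_hidden, rng, scale=2.0, n_candidates=None):
    """Sample hidden weights and biases from pairs of data points (assumed scheme).

    X has shape (n, d) and y has shape (n, m).
    """
    n = X.shape[0]
    if n_candidates is None:
        n_candidates = 20 * n_hidden
    # Draw candidate index pairs (i, j) and drop pairs of coincident points.
    i = rng.integers(0, n, size=n_candidates)
    j = rng.integers(0, n, size=n_candidates)
    dx = X[j] - X[i]
    dist_x = np.linalg.norm(dx, axis=1)
    keep = dist_x > 1e-12
    i, j, dx, dist_x = i[keep], j[keep], dx[keep], dist_x[keep]
    # Assumed density: pairs across which the target changes quickly are favored.
    density = np.linalg.norm(y[j] - y[i], axis=1) / dist_x
    total = density.sum()
    probs = density / total if total > 0 else np.full(density.size, 1.0 / density.size)
    chosen = rng.choice(density.size, size=n_hidden, p=probs)
    # Weight points from x1 toward x2; the pre-activation is -scale/2 at x1
    # and +scale/2 at x2, so the unit is centered on the pair's midpoint.
    W = scale * dx[chosen] / dist_x[chosen, None] ** 2
    b = -np.einsum("kd,kd->k", W, X[i[chosen]]) - scale / 2.0
    return W, b


def hidden_features(X, W, b):
    """Apply the sampled hidden layer and append a constant column for the output bias."""
    H = np.tanh(X @ W.T + b)
    return np.hstack([H, np.ones((X.shape[0], 1))])


def fit_sampled_network(X, y, n_hidden, rng):
    """Gradient-free fit: sample the hidden layer, then solve least squares for the last layer."""
    W, b = sample_hidden_layer(X, y, n_hidden, rng)
    C, *_ = np.linalg.lstsq(hidden_features(X, W, b), y, rcond=None)
    return W, b, C


def predict(X, W, b, C):
    return hidden_features(X, W, b) @ C


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(X)
    W, b, C = fit_sampled_network(X, y, n_hidden=50, rng=rng)
    print("training MSE:", np.mean((predict(X, W, b, C) - y) ** 2))
```

No gradient steps are involved: the only fitting is the least-squares solve for the last layer. A deep variant would presumably repeat the sampling step layer by layer on the transformed data, but that extension is not sketched here.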

