VI Seminar #86: Optimizing Deep Learning: Automated Methods for Efficient Training and Architecture Design

Presented by Aaron Klein, Project Leader at OpenEuroLLM, Ellis Institute Tübingen

Abstract

Training deep neural networks still relies heavily on selecting the right hyperparameters and making manual architectural decisions. This often leads to an inefficient trial-and-error process that is both computationally expensive and time-consuming.

This talk explores how automated machine learning techniques can make this pipeline faster, more efficient, and more reliable. We begin by introducing core methods in model-based hyperparameter optimization that automatically configure the training process of deep learning models. By leveraging advanced early-stopping and multi-fidelity strategies, we can substantially accelerate the overall optimization procedure.

In the second part of the talk, we discuss recent developments in neural architecture search aimed at automating architectural design choices. We show that these methods can discover architectures that optimally trade off performance and efficiency, including metrics such as latency and energy consumption. Finally, we demonstrate how neural architecture search can be used to identify strong initializations for pre-training small language models.

This seminar is open to members of the consortium. If you would like to participate as a guest, please sign up.

Sign up here