Eirik Østmo / Torger Grytå

VI seminar 2022 #17 The 99% accuracy club


The 99% accuracy club

Presenter: Kajsa Møllersen, Associate professor, Biostatistics, UiT, Department of Community Medicine.


A huge problem in the scientific community is the abuse of test sets, where a method is modified after seeing the test results, then retested, and the better result is reported. This creates a serious problem: when the state of the art is achieved by reusing the test set, it is impossible to beat it when doing things properly.

Competitions like those on Kaggle make sure that the test set is truly independent, so that good performance is not a result of fitting the method to the test set. This solves the problem of test-set abuse but creates a new one, similar to multiple testing, that can be seen as an example of regression to the mean.

When Kaggle launches a competition with tens of thousands of dollars in prize money, the huge number of participating teams and the hours each team puts in result in high performance for the submitted methods, and a new state of the art is established.

For the 2020 Melanoma Classification competition hosted by Kaggle, 33,126 images were made available for training (of which 2% were melanomas), and an additional 10,982 were used for the final ranking of the 3,308 teams that entered the competition with their eyes on the $10,000 prize. With this as an example, I will demonstrate the problems that arise and suggest a solution.
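
The multiple-testing effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the presenter's analysis: every team is given the same hypothetical true accuracy of 0.90, plain accuracy is used as the score for simplicity (the abstract does not state the competition's metric), and the number of correct answers is drawn from a normal approximation to the binomial. Only the team count and the ranking-set size come from the text.

```python
import random
import statistics


def simulate_leaderboard(n_teams, n_test, true_accuracy, seed=0):
    """Observed test accuracies for teams that all share one true accuracy.

    The number of correct answers per team is drawn from a normal
    approximation to Binomial(n_test, true_accuracy).
    """
    rng = random.Random(seed)
    mean = n_test * true_accuracy
    sd = (n_test * true_accuracy * (1 - true_accuracy)) ** 0.5
    scores = []
    for _ in range(n_teams):
        correct = round(rng.gauss(mean, sd))
        correct = min(max(correct, 0), n_test)  # keep the count in a valid range
        scores.append(correct / n_test)
    return scores


N_TEAMS = 3308        # from the abstract
N_TEST = 10982        # from the abstract
TRUE_ACCURACY = 0.90  # hypothetical value, chosen for illustration only

scores = simulate_leaderboard(N_TEAMS, N_TEST, TRUE_ACCURACY)
winner_score = max(scores)

# Score the "winner" again on a fresh, equally sized independent test set.
retest_score = simulate_leaderboard(1, N_TEST, TRUE_ACCURACY, seed=1)[0]

print(f"mean leaderboard score: {statistics.mean(scores):.4f}")
print(f"winning score:          {winner_score:.4f}")
print(f"winner on fresh data:   {retest_score:.4f}")
```

Even though no team is better than any other in this setup, the top of the leaderboard sits above the shared true accuracy purely by chance, and the same method scored on fresh data tends to fall back toward it.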

This seminar is open to members of the consortium. If you want to participate as a guest, please sign up.

Sign up here