Inria Montpellier, St-Priest Campus, Building 5, Room 03.124
Machine Learning in Montpellier, Theory and Practice – Pierre-Alexandre Mattei, Inria (Maasai)
Ensemble methods combine predictions from several statistical learning models. Their best-known representatives are random forests and deep ensembles. This talk will center on the question: "How many models should I aggregate?"
We will see that the answer depends on the chosen performance metric. Specifically, for convex losses (such as cross-entropy in classification or mean squared error in regression), the error is a decreasing function of the number of models. For non-convex losses (such as the misclassification rate in classification or the Fréchet Inception distance in generative modelling), things are more nuanced, and the error can sometimes be non-monotonic.
These results will be illustrated with examples of neural network ensembles, both for classification and generative modelling. This work is notably based on the papers:
– Are Ensembles Getting Better All the Time? (with Damien Garreau), JMLR 2025
– When Are Two Scores Better Than One? Investigating Ensembles of Diffusion Models (with Raphaël Razafindralambo, Rémy Sun, Frédéric Precioso, and Damien Garreau), TMLR 2026
– Beyond Mixtures and Products for Ensemble Aggregation: A Likelihood Perspective on Generalized Means (with Raphaël Razafindralambo, Rémy Sun, Frédéric Precioso, and Damien Garreau), arXiv 2026
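
The contrast between convex and non-convex losses described in the abstract can be sketched on a toy example. The snippet below is only an illustration under stated assumptions, not the setup of the papers above: the five probabilities in `POOL` are invented, the names `POOL`, `cross_entropy`, `zero_one`, and `expected_loss` are hypothetical, and ensembles are formed by averaging every size-K subset of a fixed pool rather than by drawing models independently.

```python
from itertools import combinations
from math import log
from statistics import mean

# Hypothetical probabilities that five models assign to the true class
# of a single example (invented numbers, for illustration only).
POOL = [0.6, 0.6, 0.6, 0.6, 0.15]

def cross_entropy(p):
    """Convex loss: negative log-probability of the true class."""
    return -log(p)

def zero_one(p):
    """Non-convex loss: 1 if the averaged probability falls below 0.5."""
    return 1.0 if p < 0.5 else 0.0

def expected_loss(loss, k, pool=POOL):
    """Average loss of the mean prediction over all size-k subsets of the pool."""
    return mean(loss(mean(subset)) for subset in combinations(pool, k))

sizes = range(1, len(POOL) + 1)
ce_curve = [expected_loss(cross_entropy, k) for k in sizes]
err_curve = [expected_loss(zero_one, k) for k in sizes]

for k, ce, err in zip(sizes, ce_curve, err_curve):
    print(f"K={k}: cross-entropy={ce:.3f}, classification error={err:.1f}")
```

On this pool, the averaged cross-entropy never increases with K, as Jensen's inequality guarantees for a convex loss, whereas the classification error climbs from 0.2 at K=1 to 0.8 at K=4 before dropping to 0 at K=5.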

