Supervised learning is a fundamentally ill-posed problem. In practice, this indetermination is dealt with by imposing constraints on the solution; these are either implicit, as in neural networks, or explicit via the use of a regularization functional. In this talk, I present a unifying perspective that revolves around a new representer theorem that characterizes the solution of a broad class of functional optimization problems. I then use this theorem to derive the most prominent classical algorithms — e.g., kernel-based techniques and smoothing splines — as well as their “sparse” counterparts. This leads to the identification of sparse adaptive splines, which have some remarkable properties.
I then show how the latter can be integrated in conventional neural architectures to yield high-dimensional adaptive linear splines. Finally, I recover deep neural nets with ReLU activations as a particular case.