Organisers: Barry Koren (TU Eindhoven), Wil Schilders (TU Eindhoven)
Anastasia Borovykh (CWI)
Understanding generalisation in noisy time series forecasting
In this presentation we study the loss surface of neural networks for noisy time series forecasting. In extrapolation problems for noisy time series, overparametrised neural networks tend to overfit, and the behaviour of the model on the training data does not accurately reflect its behaviour on unseen data, due, for example, to changing underlying factors in the time series. Avoiding overfitting and finding a pattern in the data that persists over a longer period of time can thus be very challenging. In this talk we quantify what the neural network has learned using the structure of the loss surface of multi-layer neural networks. We discuss how the learning algorithm can be used to control the trade-off between the complexity of the learned function and its ability to fit the data. Furthermore, we gain insight into which minima are able to generalise well, based on the spectrum of the Hessian matrix and the smoothness of the learned function with respect to the input.
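As a minimal illustration of the Hessian-spectrum diagnostic mentioned above, the sketch below fits nothing; it simply evaluates the training loss of a small one-hidden-layer network on noisy sine data at an arbitrary parameter vector, estimates the Hessian of that loss by central finite differences, and inspects its eigenvalues. All names (the network width, the data, the parameter vector) are illustrative assumptions, not the setup used in the talk; the idea is only that a small leading eigenvalue indicates a flatter region of the loss surface, which the talk relates to generalisation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy time series: a sine wave plus Gaussian noise.
x = np.linspace(0.0, 2.0 * np.pi, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

H_UNITS = 5  # hidden width, an illustrative choice

def unpack(theta):
    """Split a flat parameter vector into the layers of a tiny MLP."""
    w1 = theta[:H_UNITS]
    b1 = theta[H_UNITS:2 * H_UNITS]
    w2 = theta[2 * H_UNITS:3 * H_UNITS]
    return w1, b1, w2

def loss(theta):
    """Mean squared training error of a one-hidden-layer tanh network."""
    w1, b1, w2 = unpack(theta)
    hidden = np.tanh(np.outer(x, w1) + b1)  # shape (40, H_UNITS)
    pred = hidden @ w2
    return np.mean((pred - y) ** 2)

def hessian(f, theta, eps=1e-4):
    """Estimate the Hessian of f at theta by central finite differences."""
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            t = theta.copy(); t[i] += eps; t[j] += eps; fpp = f(t)
            t = theta.copy(); t[i] += eps; t[j] -= eps; fpm = f(t)
            t = theta.copy(); t[i] -= eps; t[j] += eps; fmp = f(t)
            t = theta.copy(); t[i] -= eps; t[j] -= eps; fmm = f(t)
            H[i, j] = (fpp - fpm - fmp + fmm) / (4.0 * eps ** 2)
    return H

theta = 0.5 * rng.standard_normal(3 * H_UNITS)
H = hessian(loss, theta)
eigs = np.linalg.eigvalsh(H)  # sorted ascending
print("leading Hessian eigenvalue:", eigs[-1])
```

In practice one would compute the spectrum at a minimum found by the learning algorithm (and with exact automatic differentiation rather than finite differences); comparing the leading eigenvalues across minima is one way to make the "flat versus sharp" distinction quantitative.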