How can knowledge of asymptotics be translated into control over the extrapolation behaviour of NNs?
Applications of artificial neural networks in quant finance have met various challenges, one of which is our current inability to control the extrapolation behaviour of NNs beyond the range of training points. In the working paper "Neural Networks with Asymptotics Control", Alexandre Antonov, Michael Konikov and Vladimir Piterbarg demonstrate how knowledge of asymptotics can be translated into control over the extrapolation behaviour of NNs. In this article, Alexandre Antonov, Chief Analyst at Danske Bank, explores the key concepts behind the paper.
Significant advances in machine learning (ML), deep learning (DL) and artificial neural networks (ANN or NN) in image and speech recognition have fuelled a rush of investigations into how these techniques could be applied in finance in general, and in derivatives pricing in particular. Typical examples of this genre include [AK], [MG], [HMT] and [FG].
The main idea of these papers is to use NNs to speed up slow function calculations. A typical procedure involves training the NN offline on a sample of learning points calculated from the true model (or, in pre-NN language, fitting a functional form defined by an NN to a sample of function values over a collection of, often multi-dimensional, function arguments), and then using the NN as a fast approximation to the true model during online pricing and risk management calculations.
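To make this workflow concrete, here is a minimal sketch in PyTorch (our illustration, not code from the paper); `true_model` is a cheap stand-in for a slow pricer such as a PDE or Monte Carlo engine:

```python
import torch
import torch.nn as nn

# Stand-in for the slow "true model"; in practice a PDE or Monte Carlo pricer.
def true_model(x):
    return torch.exp(-x[:, :1] ** 2) * torch.sin(x[:, 1:2])

# Offline phase: sample learning points over the expected input range.
torch.manual_seed(0)
X = torch.rand(10_000, 2) * 4.0 - 2.0   # inputs in the box [-2, 2]^2
y = true_model(X)

# Standard feed-forward NN used as the fast surrogate.
net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(2_000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y)
    loss.backward()
    opt.step()

# Online phase: one cheap forward pass replaces the slow model call.
x_new = torch.tensor([[0.3, 1.2]])
price = net(x_new)
```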
It has generally been observed that NNs, once trained, do a good job of interpolating between the points they were trained on (fitted to). However, extrapolation behaviour beyond the range of training points is not controllable in a typical NN, owing to its complex non-parametric nature. In this paper, we provide a detailed description of this absence of extrapolation/asymptotics control. Starting with an extensive discussion of the intuition behind the Kolmogorov-Arnold theorem that underlies a standard feed-forward multi-layer NN, we demonstrate how information about asymptotics is lost.
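Continuing the sketch above, the loss of asymptotics is easy to exhibit: far outside the training box the tanh units saturate, so the surrogate flattens to a constant regardless of the true behaviour:

```python
# Far outside the training box [-2, 2]^2 the tanh activations saturate,
# so the surrogate tends to a constant and ignores the true decay to zero.
x_far = torch.tensor([[10.0, 10.0], [100.0, 100.0]])
print(net(x_far))         # roughly the same constant for both points
print(true_model(x_far))  # essentially zero; the NN has no way to know this
```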
The absence of extrapolation control is a significant limitation of the NN approximation approach in financial applications. An obvious example is stress testing: financial models often need to be evaluated at input values significantly different from current market conditions, and changes of regime are common in financial markets. Input values in stress scenarios, required for sound risk management, would routinely fall outside the range of the training set, with unpredictable extrapolation as a result. Needless to say, there are many other reasons why it is important to control the extrapolation of NNs in financial applications.
One possible solution to this problem is, of course, to sample the input variable space widely enough that any possible future value of the input variables falls within the sample range (interpolation) and never outside it (extrapolation). It is not hard to see that this is not a fully satisfactory solution, as one does not know a priori which future values will be required. Additionally, large ranges of input values require a large number of learning points to cover, slowing down learning. More importantly, using large ranges for the input variables would likely worsen the fit for moderate, i.e. non-extreme, values of the inputs, as the NN would try to balance the quality of fit across all the training points.
Fortunately, in many financial applications, in addition to being able to calculate function values for moderate values of the inputs, we often also know the asymptotics of these functions for large values of the inputs. This is true for, e.g., SABR fitting ([MG], [HMT]) and the values of many types of products in derivatives pricing ([FG]). The aim of this paper is to demonstrate how the knowledge of asymptotics can be effectively translated into control over the extrapolation behaviour of NNs.
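As a concrete example (ours, not from the paper): for a non-negative underlying, the undiscounted call price C(K) = E[(S_T - K)^+] with forward F = E[S_T] has wing behaviour known in closed form,

```latex
% Wing asymptotics of an undiscounted call price:
\[
  C(K) \to F - K \quad (K \to 0^+),
  \qquad
  C(K) \to 0 \quad (K \to \infty).
\]
```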
Specifically, to approximate a multi-dimensional function while preserving its asymptotics we proceed in two steps. In step one, we find a control variate function that has the same asymptotics as the initial function. In step two, we approximate the residual function with a special NN that has vanishing asymptotics in all, or some, directions.
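Schematically (in our notation, not the paper's), the two steps combine as

```latex
% Step one supplies g; step two supplies the residual NN with vanishing tails.
\[
  f(x) \;\approx\; g(x) + \mathrm{NN}(x),
  \qquad
  g(x) \sim f(x), \quad \mathrm{NN}(x) \to 0
  \quad \text{as } \|x\| \to \infty,
\]
```

so that in the wings the approximation inherits the asymptotics of g by construction, while the NN is free to correct the fit in the interior.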
The apparent simplicity of this plan hides a number of complications that we overcome in the paper. Specifically, we make two critical contributions (our main technical results) that make this programme work. For step one, we show how to construct a universal control variate: a multi-dimensional spline that has the same asymptotics as the initial function. For step two, we design a custom NN layer that guarantees zero asymptotics in all directions, with fine control over the regions where the NN interpolation is used and where the asymptotics kick in.
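To illustrate the flavour of step two, here is a hedged sketch of a layer with vanishing tails. The construction below, a smooth window multiplying a standard feed-forward body, is our own illustration and not the paper's layer:

```python
import torch
import torch.nn as nn

class VanishingTailLayer(nn.Module):
    """Illustrative only. A standard feed-forward body is multiplied by a
    smooth window that equals 1 inside a box and decays to zero outside it,
    so the whole output vanishes in all directions."""
    def __init__(self, dim, width=64, scale=2.0, sharpness=4.0):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, width), nn.Tanh(),
                                  nn.Linear(width, 1))
        self.scale = scale          # half-width of the interpolation box
        self.sharpness = sharpness  # how fast the window decays outside it

    def forward(self, x):
        z = x / self.scale
        # Per-coordinate window: 1 for |z| <= 1, exp(-sharpness*(|z|-1)^2) beyond.
        excess = torch.clamp(z.abs() - 1.0, min=0.0)
        window = torch.exp(-self.sharpness * excess.pow(2)).prod(dim=-1,
                                                                 keepdim=True)
        return self.body(x) * window
```

A full surrogate would then be control_variate(x) + layer(x), with the scale and sharpness parameters playing the role of the fine control over where interpolation is used and where the asymptotics kick in.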
In passing, we note that multi-dimensional interpolation is not the only application of NNs in quantitative finance; papers [KS], [BGTW], [GR], [HL] explore other applications that are beyond the scope of this paper. Still, they may benefit from some of the ideas presented here.
More details, as well as intuition and multiple illustrations, can be found in our SSRN paper: A. Antonov, M. Konikov and V. Piterbarg, "Neural Networks with Asymptotics Control", 2020, SSRN working paper.
References

[BGTW] H. Buehler, L. Gonon, J. Teichmann and B. Wood, "Deep Hedging", 2018, arXiv working paper
[FG] R. Ferguson and A. D. Green, "Deeply Learning Derivatives", 2018, SSRN working paper
[HL] P. Henry-Labordere, "CVA and IM: Welcome to the Machine", March 2019, Risk Magazine
[HMT] B. Horvath, A. Muguruza and M. Tomas, "Deep Learning Volatility", 2019, SSRN working paper
[AK] A. Kondratyev, "Curve Dynamics with Artificial Neural Networks", 2018, Risk
[KS] A. Kondratyev and C. Schwarz, "The Market Generator", 2019, SSRN working paper
[MG] W. McGhee, "An Artificial Neural Network Representation of the SABR Stochastic Volatility Model", 2018, SSRN working paper
[GR] G. Ritter, "Machine Learning for Trading", October 2017, Risk Magazine, pp. 84-89