Authors: A. Lendasse, J. Lee, E. de Bodt, V. Wertz, M. Verleysen
Source: http://www.dice.ucl.ac.be/~verleyse/papers/bookLesage03al.pdf
APPROXIMATION BY RADIAL BASIS FUNCTION NETWORKS
Application to Option Pricing
INTRODUCTION
The approximation of functions is one of the most general uses of artificial neural networks. The general framework of the approximation problem is the following. One supposes the existence of a relation between several input variables and one output variable. This relation being unknown, one tries to build an approximator (a black-box model) between these inputs and this output. The structure of this approximator must be chosen, and the approximator must be calibrated so as to best represent the input-output dependence. To carry out these different stages, one has at one's disposal a set of input-output pairs that constitute the learning data of the approximator.
The most common type of approximator is the linear approximator. It has the advantage of being simple and cheap in terms of computation load, but it is obviously not reliable if the true relation between the inputs and the output is nonlinear. One must then rely on nonlinear approximators such as artificial neural networks.
The most popular artificial neural networks are the multilayer perceptrons (MLP) developed by Werbos [1] and Rumelhart [2]. In this chapter, we will use another type of neural network: the radial basis function network (or RBFN) [3]. These networks have the advantage of being much simpler than the perceptrons while keeping the major property of universal approximation of functions [4]. Numerous techniques have been developed for RBFN learning. The technique that we have chosen was developed by Verleysen and Hlavackova [5]. This technique is undoubtedly one of the simplest ones, but it gives very good results. The RBFN and the chosen learning technique will be presented in section 1.
We will demonstrate that the results obtained with RBFN can be
improved by a specific pre-treatment of the inputs. This pre-treatment
technique is based on linear models. It does not complicate the RBFN
learning but yields very good results. The pre-treatment technique will be
presented in section 2.
These different techniques will be applied to option pricing. This problem has been successfully handled by, for instance, Hutchinson, Lo and Poggio in 1994 [6], a work that has surely contributed widely to giving credibility to the use of artificial neural networks in finance. The existence of a chapter dedicated to neural networks in the book of Campbell, Lo and MacKinlay [7] sufficiently attests to it. Hutchinson et al., using notably simulated data, have demonstrated that RBFN make it possible to price options, and also to form hedged portfolios. The authors' choice of the determination of a call option price as an application domain of neural networks in finance is certainly not an accident. Financial derivative assets are indeed characterized by the nonlinear relation that links their prices to the prices of the underlying assets. The results that we obtain are comparable to those of Hutchinson et al., but with a simplified learning process. We will demonstrate with this example the advantages of our data pre-treatment technique. This example will be handled in detail in section 3.
1. APPROXIMATION BY RBFN
We have at our disposal a set of inputs X_t and a set of outputs Y_t. The approximation of Y_t by an RBFN will be noted Ŷ_t. This approximation is the weighted sum of m Gaussian kernels Φ:

\hat{Y}_t = \sum_{i=1}^{m} \lambda_i \, \Phi(X_t, C_i, \sigma_i), \quad t = 1, \ldots, N,

with

\Phi(X_t, C_i, \sigma_i) = \exp\!\left(-\frac{\lVert X_t - C_i \rVert^2}{2\sigma_i^2}\right).
The RBFN is illustrated in figure 1.
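As an illustration of this weighted sum of Gaussian kernels, here is a minimal sketch in Python, assuming numpy; all names are ours, not the authors':

```python
import numpy as np

def rbfn_predict(X, centers, widths, lambdas):
    """Evaluate an RBFN: a weighted sum of m Gaussian kernels.

    X        : (N, n) array of inputs X_t
    centers  : (m, n) array of kernel positions C_i
    widths   : (m,)   array of kernel widths sigma_i
    lambdas  : (m,)   array of multiplicative factors lambda_i
    """
    # Squared Euclidean distances between each input and each centroid
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, m)
    # Gaussian kernel activations
    phi = np.exp(-d2 / (2.0 * widths[None, :] ** 2))                # (N, m)
    # Weighted sum over the m kernels
    return phi @ lambdas                                            # (N,)
```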
The complexity of a RBFN is determined by the number of Gaussian
kernels. The different parameters to specify are the position of the Gaussian
kernels (Ci), their variances (σi), and the multiplicative factors (λi). The
technique that allows determining them is developed in detail in [5]. We
will explain it briefly.
The position of the Gaussian kernels is chosen according to the distribution of the X_t in space. At locations where there are few inputs X_t, few nodes will be placed; conversely, many nodes will be placed where there are many input data.
The technique that performs this operation is called vector quantization, and the points that summarize the position of the nodes are called centroids. Vector quantization is composed of two stages. The centroids are first randomly initialized in the space. They are then placed in the following way: all points X_t are inspected, and for each of them the closest centroid is moved in the direction of X_t according to the following formula:

C_i \leftarrow C_i + \alpha \, (X_t - C_i),

with X_t the considered point, C_i the closest centroid to X_t, and α a parameter that decreases with time. Further details on vector quantization methods can be found in [8,9].
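A minimal sketch of this vector quantization stage, under the same assumptions as above; the linear decay schedule for α is illustrative, as the text only requires that α decrease with time:

```python
import numpy as np

def vector_quantization(X, m, epochs=20, alpha0=0.5, rng=None):
    """Place m centroids according to the distribution of the inputs X.

    X : (N, n) array of input points. Returns an (m, n) array of centroids.
    """
    rng = np.random.default_rng(rng)
    # Random initialization: pick m input points as starting centroids
    centers = X[rng.choice(len(X), size=m, replace=False)].copy()
    step, total = 0, epochs * len(X)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Find the centroid closest to the current point
            i = np.argmin(((centers - x) ** 2).sum(axis=1))
            # alpha decreases with time (linear decay is our assumption)
            alpha = alpha0 * (1.0 - step / total)
            # Move the winning centroid toward the point
            centers[i] += alpha * (x - centers[i])
            step += 1
    return centers
```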
The second parameter to be chosen is the standard deviation (or width) of the different Gaussian kernels (σ_i). We chose to work with a different width for each node. To estimate them, we define the Voronoï zone of a centroid as the region of space that is closer to this centroid than to any other centroid. In each of these Voronoï zones, the variance of the points belonging to the zone is calculated. The width of a Gaussian kernel is then the variance in the Voronoï zone where the node is located, multiplied by a factor k. We will explain in our application how to choose this parameter [10]. This method has several advantages, the most important being that the Gaussian kernels better cover the space of the RBFN inputs.
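A sketch of this width computation; treating the "variance in the Voronoï zone" as the mean squared distance of the zone's points to their centroid is our reading of the multivariate case:

```python
import numpy as np

def kernel_widths(X, centers, k):
    """Width of each Gaussian kernel: the variance of the points in the
    centroid's Voronoi zone, multiplied by a factor k (as in the text).

    The 'variance' of a zone is taken here as the mean squared distance of
    its points to the centroid -- our assumption for the multivariate case.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, m)
    zone = d2.argmin(axis=1)              # closest centroid for each point
    widths = np.empty(len(centers))
    for i in range(len(centers)):
        mask = zone == i
        # Empty zones fall back to the average over all points,
        # an arbitrary but safe choice
        widths[i] = k * (d2[mask, i].mean() if mask.any() else d2[:, i].mean())
    return widths
```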
The last parameters to determine are the multiplicative factors λi. When
all other parameters are defined, these are determined by the solution of a
system of linear equations.
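Since the kernel activations are fixed at this point, the λ_i follow from an ordinary least-squares solution of that linear system; a minimal sketch, assuming numpy (lstsq handles the over-determined case N > m):

```python
import numpy as np

def fit_lambdas(X, Y, centers, widths):
    """Solve for the multiplicative factors lambda_i once the centroids and
    widths are fixed: a linear (least-squares) problem in the lambdas."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    phi = np.exp(-d2 / (2.0 * widths[None, :] ** 2))   # (N, m) design matrix
    lambdas, *_ = np.linalg.lstsq(phi, Y, rcond=None)
    return lambdas
```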
The total number of parameters equals m*(n+1)+1 with n being the
dimension of the inputs space and m being the number of Gaussian kernels
used in the RBFN.
2. RBFN WITH WEIGHTED INPUTS
One of the disadvantages of the RBFN that we have presented is that it gives equal importance to all input variables. This is not the case with other function approximators such as the MLP. We will try to eliminate this disadvantage without penalizing the parameter estimation process of the RBFN.
Let us first suppose that all inputs are normalized. We understand by this that they all have zero mean and unit variance. If we build a linear model between the inputs and the output, the latter will be approximated by a weighted sum of the different inputs. The weight associated with each input determines the importance that this input has on the approximation of the output. Indeed, if one differentiates the linear model with respect to the different inputs, one finds back these very same weightings. This is illustrated in the following example:

\hat{y} = a_1 x_1 + a_2 x_2 + b,

which yields:

\frac{\partial \hat{y}}{\partial x_1} = a_1, \qquad \frac{\partial \hat{y}}{\partial x_2} = a_2.

We thus have at our disposal a very simple means to determine the relative importance that the different inputs have on the output.
We then multiply the different normalized inputs by the weighting factors obtained from the linear model. These new inputs will be used in an RBFN such as the one presented in the previous section. This new RBFN, which we will qualify as «weighted», will thus give a different importance to the different input variables.
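As an illustration, a minimal sketch of this pre-treatment, assuming numpy; the function name and the least-squares fit with an intercept are our assumptions, not a prescription from the text:

```python
import numpy as np

def weight_inputs(X, Y):
    """Pre-treatment for the 'weighted' RBFN: normalize each input to zero
    mean and unit variance, fit a linear model, and multiply each normalized
    input by its linear-model coefficient."""
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    # Linear model Y ~ Xn (with intercept), solved by least squares
    A = np.column_stack([Xn, np.ones(len(Xn))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    w = coef[:-1]            # weightings of the inputs (intercept dropped)
    return Xn * w, w         # weighted inputs fed to the RBFN of section 1
```

The weighted inputs returned by such a function are then used as the X_t of the RBFN learning procedure of section 1, which is otherwise unchanged.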
3. OPTION PRICING
The initial success of neural networks in finance has most surely been motivated by the numerous applications presented in the field of asset price prediction (Cottrell, de Bodt and Levasseur [11] present a wide synthesis of the results obtained in this field). The emergence of nonlinear forecasting tools with universal approximation properties, even if these were admittedly not always well understood, brought new hopes. Quickly though, it appeared that forecasting the price of assets remains an extremely complex problem, and that the concept of informational efficiency of financial markets introduced by Fama [12] is not an idle notion: outperforming financial markets, after having taken into account the transaction costs and the level of risk taken, is not simple.
The application studied in this section, the modeling of the behavior of the price of a call option, as developed by Hutchinson, Lo and Poggio [6], presents a typical case of application of neural networks in finance. The prices of derivatives depend nonlinearly on the price of the underlying assets. Major advances have been made in finance to set up analytical evaluation formulas for derivative assets. The most famous is undoubtedly the one established by Black and Scholes [13], used daily nowadays by the majority of financial operators. Evaluation formulas for option prices are based on very strict assumptions, among which, for example, the fact that stock prices follow a geometric Brownian motion. The fact that these assumptions are not strictly verified in practice explains why the prices observed on financial markets deviate more or less significantly from the theoretical prices. In this context, having at one's disposal a universal function approximator, capable of capturing the nonlinear relation that links an option price to the price of its underlying asset, but that does not rely on the assumptions necessary for deriving the analytic formulas, presents an obvious interest. It is however necessary that the proposed tool be reliable and robust for it to be adopted by the financial community. This is indeed our major concern.
3.1 Generating data
The RBFN with weighted inputs has been tested on an example of determination of a call option price. This example was handled by Hutchinson, Lo and Poggio in 1994 [6], and we will use the same method of data generation.
To generate their data, the authors use in their article the Black and Scholes formula [13] in order to simulate the call option prices. This formula is the following:

C(t) = S(t)\,\Phi(d_1) - X e^{-r(T-t)}\,\Phi(d_2),

with

d_1 = \frac{\ln\!\left(S(t)/X\right) + \left(r + \sigma^2/2\right)(T-t)}{\sigma\sqrt{T-t}}

and

d_2 = d_1 - \sigma\sqrt{T-t}.

In the above formulas, C(t) is the option price, S(t) the stock price, X the strike price, r the risk-free interest rate, T-t the time-to-maturity, σ the volatility and Φ the standard normal distribution function. If r and σ are stable, which is the case in our simulations, the price of the call option will only be a function of S(t), X and T-t. The approximation form that has been chosen, as in [6], normalizes the prices by the strike:

\frac{C(t)}{X} = f\!\left(\frac{S(t)}{X},\; T-t\right).
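For reference, a direct transcription of the Black and Scholes formula above, using scipy's standard normal distribution function; the function name is ours:

```python
import numpy as np
from scipy.stats import norm

def black_scholes_call(S, X, r, sigma, tau):
    """Black and Scholes price of a European call.

    S: stock price, X: strike price, r: risk-free rate,
    sigma: volatility, tau: time-to-maturity T - t (in years).
    """
    d1 = (np.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - X * np.exp(-r * tau) * norm.cdf(d2)
```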
For our simulation, the prices of the stock during a period of two years will be generated, in a classical way, by the following formula:

S(t+1) = S(t)\, e^{Z_t},

taking the number of working days per year equal to 253, and Z_t a random variable drawn from a normal distribution with mean µ = 0.10/253 and variance σ² = 0.04/253. The initial value S(0) equals US$ 50.
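A minimal sketch of this price-generation scheme, assuming numpy; the vectorized cumulative-sum form is an implementation choice:

```python
import numpy as np

def simulate_stock_path(S0=50.0, years=2, days_per_year=253,
                        mu=0.10, sigma2=0.04, rng=None):
    """Simulate a daily stock price path as in the text:
    S(t+1) = S(t) * exp(Z_t), with Z_t ~ N(mu/253, sigma^2/253)."""
    rng = np.random.default_rng(rng)
    n = years * days_per_year
    Z = rng.normal(mu / days_per_year,
                   np.sqrt(sigma2 / days_per_year), size=n)
    # Prices: S(0) followed by the cumulative product of the exp(Z_t)
    return S0 * np.exp(np.concatenate([[0.0], np.cumsum(Z)]))
```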
The strike price X and the time-to-maturity T-t are determined by the rules of the «Chicago Board Options Exchange» (CBOE) [14]. In short, the rules are the following (a sketch of the strike-selection logic is given after the list):
1. The strike price is a multiple of $5 for stock prices between $25 and $200;
2. The two strike prices closest to the stock price are used at each expiration of options;
3. A third strike price is used when the stock price is too close to a strike price (less than one dollar);
4. Four expiration dates are used: the end of the current month, the end of the next month and the end of the next two semesters.
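A small sketch of the strike-selection logic of rules 1 to 3, as we read them; the handling of boundary cases is an assumption:

```python
import numpy as np

def active_strikes(S):
    """Strike prices implied by the CBOE rules sketched above, for a stock
    price S between $25 and $200 (an illustrative reading of rules 1-3)."""
    below = 5 * np.floor(S / 5)       # closest multiple of $5 below S
    strikes = [below, below + 5]      # rule 2: the two closest strikes
    # Rule 3: add a third strike when S is within $1 of an existing strike
    if min(abs(S - x) for x in strikes) < 1.0:
        strikes.append(below - 5 if S - below < 1.0 else below + 10)
    return sorted(strikes)
```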
3.2 Performance measures
Three performance measures will be used, as in Hutchinson et al. [6]. The first measure is the determination coefficient R² between C and its approximation Ĉ. The two other performance measures are the tracking error ξ and the prediction error η. These errors are defined as follows:

\xi = e^{-rT} \left| E[V(T)] \right|

and

\eta = e^{-rT} \sqrt{E\!\left[V(T)^2\right]},

with

V(t) = V_S(t) + V_B(t) + V_C(t)

being the portfolio value at time t, V_S the stock value, V_B the bond value, and V_C the option value. If the option price is correctly evaluated, V(t) should at any time be equal to zero, given that it is a fully hedged portfolio. The more the tracking error ξ deviates from 0, the more the option price thus deviates from its theoretical value. The prediction error is based on the classical formula of variance decomposition (the variance is equal to the difference between the expectation of the squared variable and its squared expectation). The expected value of the squared V(T), in other words the average quadratic prediction error, thus equals the sum of its squared expectation and its variance. The e^{-rT} factors are discounting terms in continuous time, allowing results obtained at different moments in time to be added up. A more detailed explanation of these criteria can be found in [6].
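A minimal sketch of these two error measures, computed from a sample of terminal hedged-portfolio values V(T); the discounting convention is the one reconstructed above:

```python
import numpy as np

def performance_measures(V_T, r, T):
    """Tracking error xi and prediction error eta from a sample of terminal
    portfolio values V(T), following the definitions given in the text."""
    disc = np.exp(-r * T)                     # continuous-time discounting
    xi = disc * abs(np.mean(V_T))             # tracking error
    eta = disc * np.sqrt(np.mean(V_T**2))     # prediction (RMS) error
    return xi, eta
```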
3.3 Results
In order to measure the quality of the results obtained by the classical and the weighted RBFN, we have simulated a price sample with a duration of 6 months (using the Black and Scholes formula and the price-generation formula above). Two RBFN are calibrated on these data: a classical RBFN and a weighted RBFN. The number of Gaussian kernels is 6. This corresponds in fact to 19 parameters per RBFN, which is roughly equivalent to the 20-parameter RBFN used in [6].
Then, one hundred test sets are generated (using the same formulas), and for each of the two RBFN the coefficient R² is calculated. The values of ξ and η obtained for the two RBFN and for the exact Black and Scholes formula are also calculated.
The results obtained for R² (averaged over the one hundred test sets) are presented in Figure 4, as a function of k, the coefficient used to compute the width of the nodes. The value of k to be used is chosen as the smallest value giving a result (in terms of R²) close to the asymptote, that is to say a value that can be found in the elbow of the curves in Figure 4. The value k = 4 has been chosen in this case.
The benefit of weighting is obvious. The R² obtained exceeds 97%, which is equivalent to the results in [6], while using an RBFN with a much simpler learning process.
The results obtained for ξ and η are also in favor of the weighted RBFN. Table 1 presents the average values and the standard deviations of R², ξ and η for both types of RBFN. As for the performance measures for the exact Black and Scholes formula, we have ξ = 0.57 and η = 0.85.
4. CONCLUSIONS
In this chapter, we have presented a simple method to parameterize an RBFN. We have then proposed an improvement to this classical RBFN. This improvement consists in weighting the inputs by the coefficients obtained through a linear model. These methods have then been tested on the determination of the price of a call option. The results that we have obtained show a clear advantage of the weighted RBFN, whatever the performance measure used. In addition, in the example used, the results are comparable to the best RBFN or multilayer perceptrons that can be found in the literature. The advantages of this weighted RBFN are thus simplicity of parameterization and quality of approximation.
ACKNOWLEDGMENTS
Michel Verleysen is Senior research associate at the Belgian Fonds
National de la Recherche Scientifique (FNRS). The work of John Lee has
been realized with the support of the Ministère de la Région wallonne, in the
framework of the Programme de Formation et d’Impulsion à la Recherche
Scientifique et Technologique. The work of A. Lendasse and V. Wertz is
supported by the Interuniversity Attraction Poles (IAP), initiated by the
Belgian Federal State, Ministry of Sciences, Technologies and Culture. The
scientific responsibility rests with the authors.
REFERENCES
[1] Werbos P. (1974), “Beyond regression: new tools for prediction and analysis in the
behavioral sciences”, PhD thesis, Harvard University.
[2] Rumelhart D., Hinton G., Williams R. (1986), “Learning representations by back-propagating errors”, Nature 323, pp. 533-536.
[3] Powell M. (1987), “Radial basis functions for multivariable interpolation: A review”, J.C. Mason and M.G. Cox, eds, Algorithms for Approximation, pp. 143-167.
[4] Poggio T., Girosi F. (1990), “Networks for approximation and learning”, Proceedings of the IEEE 78, pp. 1481-1497.
[5] Verleysen M., Hlavackova K. (1994), “An Optimised RBF Network for Approximation of
Functions”, ESANN 1994, European Symposium on Artificial Neural Networks, Brussels
(Belgium), pp. 175-180.
[6] Hutchinson J., Lo A., Poggio T. (1994), “A Nonparametric Approach to Pricing and Hedging Derivative Securities Via Learning Networks”, The Journal of Finance, Vol. XLIX, No. 3.
[7] Campbell J., Lo A., MacKinlay A. (1997), The Econometrics of Financial Markets, Princeton University Press, Princeton.
[8] Kohonen T. (1995), Self-Organizing Maps, Springer Series in Information Sciences, Vol. 30, Springer, Berlin.