NEURO-ADAPTIVE MODEL FOR FINANCIAL FORECASTING


Romanian Journal of Economic Forecasting – 3/2007
Institute of Economic Forecasting


Iulian NASTAC
Emilian DOBRESCU
Elena PELINESCU


Source of information: http://www.ipe.ro/rjef/rjef3_07/rjef3_07_2.pdf




ABSTRACT

    The paper advances an original artificial intelligence-based mechanism for specific economic predictions. The aim is to forecast the exchange rate of the euro versus the Romanian currency using a large set of financial data. The possible influence of specific forecasting indicators (such as the Sibiu Futures Stock Exchange market) on the evolution of the exchange rate in Romania is also analyzed. The time series under discussion are inherently non-stationary, which implies that their distribution changes over time. Recent data points may therefore provide more important information than data points from the distant past. We propose a new adaptive retraining mechanism to take this characteristic into account. The algorithm establishes how a viable structure of an artificial neural network (ANN) at a previous moment of time can be retrained in an efficient manner, in order to support modifications of a complex input-output function of a financial forecasting system in which all the inputs and outputs vary dynamically and different time delays might occur. A “remembering process” for the knowledge acquired in previous learning phases is used to enhance the accuracy of the predictions.
    The results show that the first training (which includes the search for the optimal architecture) always takes a relatively long time, but that the system can then be retrained very easily, since there are no changes in the structure. The advantage of the retraining procedure is that some relevant aspects are preserved (“remembered”) not only from the immediately previous training phase, but also from the one before it, and so on. A kind of “slow forgetting process” also occurs; thus it is much easier for the ANN to remember specific aspects of the previous training than of the first training.
    The experiments reveal the high importance of the retraining phase as an upgrading/updating process, as well as the effect of ignoring it. The test error decreased as successive retraining phases were performed and the neural system accumulated experience.

    Keywords: Neural networks, exchange rate, adaptive retraining, delay vectors, iterative simulation

2. THE MODEL ARCHITECTURE

    Time delay, or dead time, is frequently encountered in financial systems. It is well known that feedback control in the presence of time delay raises particular difficulties, since a delay limits how quickly new information can influence the system.
    Figure 1 shows our idea of training a feed-forward ANN so that it becomes a predictor. We use delayed values of more than 30 input variables (see the final part of this section) to simulate the current state of the EUR/ROL exchange rate. For learning purposes, the network inputs comprise many blocks with several time-delayed values of the financial system inputs, and fewer blocks with delayed values of the system output. The ANN target output is the current value of the corresponding EUR/ROL exchange rate. Therefore, the system tries to match the current value of the output by properly adjusting a function of the past values of the inputs and of the output (Figure 1).
    At the current moment, t, the output (see Figure 1) is affected by the P inputs at different previous time steps
(t - i_d1, …, t - i_dn), and also by the outputs at other previous time steps (t - o_d1, …, t - o_dm), respectively. We denote by In_Del and Out_Del two delay vectors that include the delays that we take into account:

In_Del = [i_d1, i_d2, ..., i_dn]          (1)

Out_Del = [o_d1, o_d2, ..., o_dm]          (2)


     where n > m.

    For In_Del, we use various delay vectors with n = 7, 8 or 9 elements, whose values lie within a range of twenty days. For Out_Del, we employ different combinations with m = 3, 4 or 5 elements, covering about one week. The distribution of the vector elements is preferably (but not compulsorily) chosen to resemble a Gamma distribution. The elements of each vector are in ascending order; consequently, the maximum values of the delay vectors are i_dn and o_dm, respectively. The recurrent relation performed by the model is as follows:

y(t + 1) = F(X(t + 1 - In_Del(i)), y(t - Out_Del(j)))      (3)

    where X is the input vector, i = 1, ..., n and j = 1, ..., m.
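
    As an illustration of how the delayed input-output pairs described in this section can be assembled, the following Python sketch builds training samples from a multivariate series; the concrete delay values, array names and synthetic data are illustrative only and are not those of the actual model.

import numpy as np

# Illustrative delay vectors (ascending, roughly Gamma-like spacing):
# n = 7 input delays within about twenty days, m = 3 output delays within a week.
In_Del  = np.array([1, 2, 3, 5, 8, 13, 20])
Out_Del = np.array([1, 2, 5])

def build_training_pairs(X, y, in_del=In_Del, out_del=Out_Del):
    """X: (T, P) matrix of P input indicators; y: (T,) exchange-rate series.
    Each sample concatenates the delayed inputs X[t - i_d] and the delayed
    outputs y[t - o_d] used to predict the target y[t]."""
    start = int(max(in_del.max(), out_del.max()))    # earliest usable time step
    samples, targets = [], []
    for t in range(start, len(y)):
        delayed_inputs  = X[t - in_del, :].ravel()   # n * P delayed input values
        delayed_outputs = y[t - out_del]             # m delayed output values
        samples.append(np.concatenate([delayed_inputs, delayed_outputs]))
        targets.append(y[t])
    return np.asarray(samples), np.asarray(targets)

# Example with synthetic data: 300 days of 5 indicators and a synthetic rate series.
rng = np.random.default_rng(0)
S, T = build_training_pairs(rng.normal(size=(300, 5)), rng.normal(size=300))
print(S.shape, T.shape)   # (280, 38) and (280,): 7*5 delayed inputs + 3 delayed outputs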
    We use feed-forward ANNs with two hidden layers in order to achieve a good approximation function, based on our preliminary research, where we obtained better results with two hidden layers than with one, while maintaining a similar ratio (approx. 5/1) between the number of training samples and the total number of weights. The ANN models, depicted in Figure 1, use training sets of V - i_dn input-output pairs for model adaptation (see the next section), where V = 2240 is the initial time-step interval employed for training. Once we have established all the influences on the output at moment t, we apply Principal Component Analysis (PCA) (Jackson, 1991) to reduce the dimensionality of the input space and to de-correlate the inputs. Before applying PCA, we preprocess the inputs and outputs by replacing the missing data with the previously available values and then applying normalization. Data preprocessing prepares the raw data for the forecasting model and turns it into a format that can be processed more easily and effectively. Finally, we apply the reverse of the normalization, in order to de-normalize the simulated outputs. Data preprocessing and postprocessing are essential steps of the knowledge discovery process in real-world applications and, if correctly carried out, they greatly improve the network's ability to capture valuable information (Hagan et al., 1996; Basheer et al., 2000).
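
    A minimal Python sketch of this preprocessing chain (forward-filling of missing data, normalization, PCA reduction and later de-normalization) is given below; the zero-mean/unit-variance scaling, the retained-variance threshold and the synthetic data are assumptions for illustration, not the settings used in the paper.

import numpy as np
from sklearn.decomposition import PCA

def forward_fill(data):
    """Replace missing values (NaN) with the previously available value."""
    data = data.copy()
    for col in range(data.shape[1]):
        for t in range(1, data.shape[0]):
            if np.isnan(data[t, col]):
                data[t, col] = data[t - 1, col]
    return data

def normalize(data):
    mean, std = data.mean(axis=0), data.std(axis=0)
    return (data - mean) / std, mean, std

def denormalize(scaled, mean, std):
    """Reverse of the normalization, applied to the simulated outputs."""
    return scaled * std + mean

# Synthetic stand-in for the indicator matrix, with a few missing points.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 5))
raw[rng.random(raw.shape) < 0.02] = np.nan
raw[0] = 0.0                                   # ensure the first row is observed
inputs, in_mean, in_std = normalize(forward_fill(raw))
pca = PCA(n_components=0.99)                   # retained variance is an assumed threshold
reduced_inputs = pca.fit_transform(inputs)     # lower-dimensional, de-correlated inputs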
    Our approach involves a number P of variables (more than 30). The statistical data have different frequencies, such as:
  • daily frequency (forex exchange rate, future exchange rate for one month and the BET index);
  • quarterly frequency (GDP, the share of consolidated budget in GDP);
  • monthly frequency (CPI, interest rate, exports and imports of goods and services, etc).
    In order to use all these data with different frequencies, we decided to transform them into higher-frequency (daily) data, on the basis of a natural mechanism of the market operators' behavior, which implies keeping the information unchanged between two updating time steps. For example, if we have only annual data, we keep the value unchanged during the 365 or 366 days of the year. For the days without transactions, we keep the previous transaction figure, in order to have a complete data series.
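
    The following pandas sketch illustrates this frequency alignment: a lower-frequency series is held constant until its next update, and non-trading days inherit the previous transaction value. The series names and values are purely illustrative.

import pandas as pd

# Quarterly indicator (e.g., GDP-type data) held constant over each day of the quarter.
quarterly = pd.Series([100.0, 102.5, 104.1],
                      index=pd.to_datetime(["2005-03-31", "2005-06-30", "2005-09-30"]))
quarterly_daily = quarterly.resample("D").ffill()

# Daily market series with gaps on non-trading days, forward-filled with the
# previous transaction figure to obtain a complete data series.
fx = pd.Series([3.61, 3.63, 3.65],
               index=pd.to_datetime(["2005-06-01", "2005-06-02", "2005-06-06"]))
fx_daily = fx.asfreq("D").ffill()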
    In our model we used two different kinds of data:
  • Statistical data that were classified as:
    • General data that characterize the macroeconomic development of Romania (9 indicators);
    • Specific data that are directly linked to the exchange rate evolution (11 indicators);
    • External data that refer to significant indicators of external market evolution, focused on the European Union market and the US market (6 indicators).
  • Forecasting data that were also classified as:
    • General data that characterize the macroeconomic development of Romania (10 indicators);
    • Specific data that are directly related to the exchange rate evolution (4 indicators);
    • External data could also be used, but we decided to leave them for a further investigation that is beyond the scope of this paper.
    All the previous indicators used in the model are presented in Table 1.
    • Additionally, we introduced a Month indicator L (the days of January are denoted by 1, the days of February by 2 and so on).
  • We also tested the influence of three other supplementary inputs, which represent the “Sibiu Futures exchange rate of one month” for the EUR/ROL, USD/ROL and EUR/USD exchange rates.
    The period covered by the analysis runs from the beginning of 2000 to the end of 2006. The possible connection between the exchange rate and the other mentioned variables has been checked using Granger causality tests (Granger, 1969; Granger, 1988), computed for different numbers of lags (starting from 26). As expected for a daily analysis, such interdependence is clearly revealed in the case of 1-2 lags, so that an artificial neural network based on the previously mentioned variables can be considered economically consistent.
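
    For reference, a Granger causality check of this kind can be reproduced with statsmodels as in the sketch below; the synthetic series and the lag count are placeholders, not the data or lag range of this study.

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "eur_rol":   rng.normal(size=500).cumsum(),   # stand-in for the exchange rate
    "bet_index": rng.normal(size=500).cumsum(),   # stand-in for a candidate indicator
})

# Tests whether lagged values of the second column help predict the first column.
results = grangercausalitytests(df[["eur_rol", "bet_index"]], maxlag=2)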

5. CONCLUSIONS

        The ability of ANNs to extract significant information from their training data provides a valuable framework for representing the relationships present in the structure of the data. This allows both interpolation among the a priori defined points and extrapolation outside the range bounded by the extreme points of the training set.
        The evaluation of the test error shows that the adaptive retraining technique can gradually improve, on average, the achieved results. Our practical experience reveals that the first training (which includes the search for the optimal architecture) always takes a relatively long time, but that the system can then be retrained very easily, as there are no changes in the structure. The great advantage of the retraining technique is that some relevant aspects are preserved (remembered) not only from the immediately previous training phase, but also from the one before it, and so on. A kind of slow forgetting process also occurs; thus it is much easier for the ANN to remember specific aspects of the previous training than of the first training. This means that information accumulated during earlier trainings is slowly forgotten, and the learning process adapts to the newest evolution of the financial process.
        In the presented applications, the optimum shifting time for the next retraining is one day. In this way, the model can be quickly updated using the retraining procedure. Nevertheless, the graphs of the predictions show that the system can still provide correct values without retraining for several days, but there is a major risk of missing unexpected changes in the financial environment.
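
        A schematic Python sketch of this daily retraining loop is given below: the network fitted at one step is reused, with its weights preserved, as the starting point for the next step after the training window is shifted by one day. scikit-learn's MLPRegressor with warm_start=True merely stands in for the SCG-trained ANN of the paper, the architecture search of the first training is not shown, and all window sizes, hyper-parameters and data are illustrative.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
samples = rng.normal(size=(2300, 38))          # delayed-input feature vectors (synthetic)
targets = rng.normal(size=2300)                # exchange-rate values to be predicted (synthetic)

window = 2240                                  # initial training interval (V in Section 2)
net = MLPRegressor(hidden_layer_sizes=(20, 10), warm_start=True,
                   early_stopping=True, max_iter=500, random_state=0)

predictions = []
for shift in range(30):                        # retrain once per day for 30 days
    X_train = samples[shift: shift + window]
    y_train = targets[shift: shift + window]
    net.fit(X_train, y_train)                  # continues from the previously learned weights
    next_day = samples[shift + window].reshape(1, -1)
    predictions.append(net.predict(next_day)[0])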
        We remark that some supplementary parameters (like the Sibiu Futures exchange rate of one month) did not improve the values of the ERR as we initially expected. The reason for this outcome is very probably the inconsistent quality of these three supplementary indicators, which sometimes exceeds the limits of acceptable precision; moreover, the real market did not take their influence into account. Our next approach will involve the one-month prediction, in order to allow a direct comparison with the Sibiu Futures Stock Exchange market. The results of forecasting the exchange rate one day ahead suggest that this technique could be extended to longer forecasting horizons (a week, a month, 3 months, 6 months or more) without difficulty, and we intend to present these simulations in another paper.
        The final remark refers to the basic training algorithm. Although the SCG algorithm is not the fastest one, its great advantage is that it works very efficiently for networks with a large number of weights. The SCG is something of a compromise: it does not require large computational memory, and yet it still has good convergence and is very robust. Furthermore, we always apply the early stopping method (validation stop) during the training process, in order to avoid the over-fitting phenomenon. It is well known that, for early stopping, one must be careful not to use an algorithm that converges too rapidly; the SCG is well suited for the validation-stop method. Nevertheless, it is quite easy to replace the SCG algorithm with another one, since the adaptive retraining technique is flexible and independent of the basic training algorithm.
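
        As a closing illustration, the validation-stop rule can be written independently of the underlying optimizer, as in the sketch below; train_one_epoch and validation_error are hypothetical callbacks (for instance, one SCG iteration and the error on a held-out validation set), and the patience value is an assumption.

import copy

def train_with_validation_stop(model, train_one_epoch, validation_error,
                               max_epochs=1000, patience=10):
    """Stop when the validation error has not improved for `patience` epochs
    and return the best model observed so far."""
    best_err = float("inf")
    best_model = copy.deepcopy(model)
    stalled = 0
    for epoch in range(max_epochs):
        train_one_epoch(model)                 # one optimization pass (e.g., an SCG step)
        err = validation_error(model)
        if err < best_err:
            best_err, best_model, stalled = err, copy.deepcopy(model), 0
        else:
            stalled += 1
            if stalled >= patience:
                break                          # validation stop
    return best_model, best_err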