4.0 CHAPTER
4.1 DESCRIPTIVE STATISTICS
Descriptive statistics shed light on the basic features of the data in a study and provide simple summaries. They help us to simplify large amounts of data into an understandable form. The following are the descriptive statistics for Safaricom share prices:
STATISTIC VALUE
Mean 15.26305
Minimum 5.3
1st quartile 12.2
Median 15.55
3rd quartile 18.4
Maximum 28.0
Variance 28.2893
Standard deviation (volatility) 5.318769
Range 22.7
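As an illustration of how the statistics in the table can be obtained, the following is a minimal Python sketch. The function name describe and the sample prices are hypothetical; they are not the study's code or data, and the quartile convention ("inclusive") is an assumption.

```python
import statistics

def describe(prices):
    """Return the summary statistics listed in the descriptive table."""
    # Quartiles by linear interpolation ("inclusive" convention, an assumption).
    q1, median, q3 = statistics.quantiles(prices, n=4, method="inclusive")
    variance = statistics.variance(prices)  # sample variance (n - 1 denominator)
    return {
        "mean": statistics.mean(prices),
        "minimum": min(prices),
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": max(prices),
        "variance": variance,
        "std_dev": variance ** 0.5,  # volatility is the square root of the variance
        "range": max(prices) - min(prices),
    }

# Hypothetical prices, for illustration only (not the study's data set).
sample_prices = [5.3, 12.2, 15.55, 18.4, 28.0]
summary = describe(sample_prices)
```

The same relationships hold in the table: the standard deviation is the square root of the variance, and the range is the maximum minus the minimum.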
4.2 BOX PLOTS OF SAFARICOM SHARE PRICES
The study drew quarterly box plots for each of the five years and compared them. A box plot displays the five-number summary (minimum, 1st quartile, median, 3rd quartile and maximum) and is used to identify outliers in the dataset. Box plots are also used to assess skewness and the degree of dispersion. The following are the box plots for our share prices.
BOXPLOT 2013 QUARTERLY
BOXPLOT 2014 QUARTERLY
BOXPLOT 2015 QUARTERLY
BOXPLOT 2016 QUARTERLY
BOXPLOT 2017 QUARTERLY
The above figures are the box plots for the years 2013, 2014, 2015, 2016 and 2017.
We first considered the first quarters of each year and compared them. Clearly, 2017 had the highest mean, followed by 2016, 2015, 2014 and 2013 respectively.
On skewness, 2013 and 2014 were positively skewed while 2015, 2016 and 2017 were negatively skewed, as confirmed by the density plot of the first quarter.
For the second quarter, the study considered the second quarters of each year and compared them. Clearly, 2017 had the highest mean, followed by 2016, 2015, 2014 and 2013 respectively.
On skewness, 2013 and 2014 were negatively skewed, 2015 and 2016 were positively skewed and 2017 was negatively skewed, as confirmed by the density plot of the second quarter.
The study also considered the third quarters of each year and compared them. Clearly, 2017 had the highest mean, followed by 2016, 2015, 2014 and 2013 respectively.
On skewness, 2013 was positively skewed, 2014, 2015 and 2016 were negatively skewed and 2017 was positively skewed, as confirmed by the density plot of the third quarter.
The study also considered the fourth quarters of each year and compared them. Clearly, 2017 had the highest mean, followed by 2016, 2015, 2014 and 2013 respectively.
On skewness, 2013 was positively skewed while 2014, 2015, 2016 and 2017 were negatively skewed, as confirmed by the density plot of the fourth quarter.
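The quantities read off a box plot can also be computed directly. The sketch below is a minimal Python illustration, assuming the conventional 1.5 x IQR rule for flagging outliers and a moment-based skewness measure; neither choice is stated in the study, and the function names and sample values are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Five-number summary plus the points a box plot would flag as outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr  # assumed 1.5 x IQR convention
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return min(values), q1, median, q3, max(values), outliers

def sample_skewness(values):
    """Moment-based skewness: positive means a longer right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical prices for one quarter, for illustration only.
quarter = [10.0, 10.5, 11.0, 11.5, 12.0, 20.0]
low, q1, median, q3, high, outliers = box_plot_summary(quarter)
skew = sample_skewness(quarter)
```

In this made-up quarter the single large value lies above the upper fence, so it is flagged as an outlier and pulls the skewness positive.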
4.3 TIME PLOT OF SAFARICOM SHARE PRICES
We drew a time plot of Safaricom share prices to establish whether or not the prices were stationary. The figure below shows our time plot.
From the figure, the time plot displays a significant trend, i.e. non-stationarity, which agrees with the empirical finding that financial price series are non-stationary. It may therefore be interpreted that Safaricom share prices have risen significantly since 2013. This could be attributed to the significant role that Safaricom has played in the Kenyan economy through innovation, which may have attracted more investors to the company and thereby increased the demand for its shares. Having found that the prices were non-stationary, we examined the ACF and PACF plots to confirm this finding.
4.4 ACF AND PACF PLOT OF SHARE PRICES
The idea behind this check is that if the data points are serially correlated, the ACF will take large positive values over many lags. In other words, if there is significant serial correlation in the data, the autocorrelations at those lags will fall outside the confidence lines. The following is our ACF and PACF plot output.
The ACF plot above clearly indicates that the data contain significant serial correlation, since the autocorrelations fall outside the confidence lines, and the series is therefore non-stationary. In addition, the Augmented Dickey-Fuller (ADF) unit root test was conducted to further affirm that the share prices were non-stationary.
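To make the reading of the ACF plot concrete, the following minimal Python sketch computes sample autocorrelations and compares them with approximate 95% confidence lines at plus or minus 1.96 divided by the square root of n. The bound formula is the usual large-sample approximation and is an assumption here, as are the function names and the made-up trending series.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    result = []
    for k in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        result.append(num / denom)
    return result

def significant_lags(series, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% bounds."""
    bound = 1.96 / math.sqrt(len(series))
    return [k for k, r in enumerate(acf(series, max_lag), start=1) if abs(r) > bound]

# Hypothetical trending series, for illustration only.
trend = [10 + 0.1 * t for t in range(100)]
flagged = significant_lags(trend, 5)
```

For a trending series such as this one, every lag checked falls outside the bounds, which is the pattern the study describes for the share prices.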
4.5 AUGMENTED DICKEY FULLER TEST
This is a unit root test for stationarity.
The decision rule is that if the p-value from the test is greater than 0.05, the data are non-stationary, and if the p-value is less than 0.05, the data are stationary. The test has the following hypotheses:
H_0: Data has a unit root (non-stationary)
H_1: Data has no unit root (stationary)
The test statistic of the ADF test is the t-statistic shown below:
t = α / s.e.(α)
where α is the estimated coefficient on the lagged level of the series and s.e.(α) is its standard error.
ADF summary
Data Returns. Price
Dickey-Fuller statistic = -2.0451
Lag order = 10
p-value = 0.5592
Alternative hypothesis: stationary
Since the p-value is greater than 0.05, we fail to reject the null hypothesis and conclude that the data are non-stationary, i.e. they have a unit root.
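The t-statistic above can be illustrated with a simplified sketch. The Python code below runs the plain Dickey-Fuller regression with a constant and no lagged differences, which is a simpler variant than the augmented test with lag order 10 reported in the study; the function name and the made-up series are hypothetical. The resulting statistic is compared with Dickey-Fuller critical values rather than the usual t tables.

```python
import math

def dickey_fuller_t(series):
    """t-statistic for alpha in: diff(y)_t = c + alpha * y_(t-1) + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    m = len(diffs)
    mean_x = sum(lagged) / m
    mean_d = sum(diffs) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    alpha = sxy / sxx                      # OLS slope on the lagged level
    intercept = mean_d - alpha * mean_x    # OLS constant
    ssr = sum((d - intercept - alpha * x) ** 2 for x, d in zip(lagged, diffs))
    std_error = math.sqrt(ssr / (m - 2) / sxx)
    return alpha / std_error               # t = alpha / s.e.(alpha)

# Hypothetical mean-reverting series, for illustration only.
series = [math.sin(t) for t in range(100)]
t_stat = dickey_fuller_t(series)
```

A strongly negative statistic, as for this mean-reverting example, is evidence against a unit root; a statistic close to zero is not.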
Therefore, from the three checks above, it is clear that the Safaricom share prices were non-stationary, which called for a transformation to remove the non-stationarity. One way to remove it is to take the log returns of the price data, because it is a stylized fact that returns are stationary.
Computation of returns:
r_t = ln(x_t / x_(t-1)),
where x_t is the share price at time t.
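The return formula can be sketched in a few lines of Python. The function name and the sample prices below are hypothetical and serve only to illustrate the computation.

```python
import math

def log_returns(prices):
    """r_t = ln(x_t / x_(t-1)) for each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical prices, for illustration only.
prices = [20.00, 20.25, 20.00, 19.90]
returns = log_returns(prices)
```

A series of n prices yields n - 1 returns, and the log returns add up to the log of the last price over the first.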
We then conducted the following tests to confirm the earlier claim that returns are stationary:
4.6 TEST OF AUTOCORRELATION OF RETURNS
We first plotted the ACF of the returns, which gave the results shown below.
From the ACF plot above, it was clear that the returns were stationary, since the autocorrelations lie within the confidence lines, indicating no significant serial correlation.
We conducted an ADF test to confirm this conclusion.
4.7 ADF TEST OF RETURNS
This is the unit root test for stationarity.
The decision rule is that if the p-value from the test is greater than 0.05, the data are non-stationary, and if the p-value is less than 0.05, the data are stationary. The test has the following hypotheses:
H_0: Returns have a unit root (non-stationary)
H_1: Returns have no unit root (stationary)
The test statistic of the ADF test is the t-statistic shown below:
t = α / s.e.(α)
Our ADF results for the returns were obtained as shown below:
Data Returns
Dickey-Fuller statistic = -12.231
Lag order = 10
p-value = 0.01
Alternative hypothesis: stationary
Warning message: In adf.test(RETURNS): p-value smaller than printed p-value
The p-value of the test was less than 0.05; hence we rejected the null hypothesis and concluded that the returns were indeed stationary, the same conclusion we drew from the ACF plot.
We also drew a time plot of the returns to observe their behaviour over time.
4.8 TIME PLOT OF RETURNS
To further confirm that the returns were stationary, we drew a time plot of the returns and looked for a trend. The plot is displayed below.
From the plot, no upward or downward movement can be discerned in the returns. There is therefore no trend in the returns, which is consistent with stationarity.
Having confirmed that the returns were stationary, we carried out a normality test.
4.9 NORMALITY TEST
In financial time series, it is a stylized fact that returns are non-normal, with heavy-tailed distributions whose tail weight is measured by kurtosis. To test the normality of the returns, we used quantile-quantile plots (QQ-plots). For the data to be normal, the points should lie along the straight line. The following is our QQ-plot for the returns:
As expected from the stylized fact above, the QQ-plot shows that the returns were non-normal and hence suitable for use in the modelling.
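Since tail weight is measured by kurtosis, a numerical check can accompany the QQ-plot. The following minimal Python sketch computes the excess kurtosis, which is zero for a normal distribution and positive for heavy tails. The function name and the made-up returns are hypothetical, and this check is an illustration rather than part of the study's procedure.

```python
def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

# Hypothetical returns: mostly zero with two extreme values, for illustration only.
heavy_tailed = [0.0] * 18 + [0.1, -0.1]
kurt = excess_kurtosis(heavy_tailed)
```

A clearly positive value, as in this example, points to heavier tails than the normal distribution.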
5.0 MODEL ESTIMATION FOR THE SHARE PRICE RETURNS
To identify the ARIMA model order, the study split the data points into trial data and forecasting data. Following common practice, 75% of the data points were used as trial data, i.e. the study used 75% of the data to find the model order. For this 75% of the data, the ACF plot was used to find the order of the moving average (MA) component and the PACF plot was used to find the order of the autoregressive (AR) component.
Using the ACF and PACF plots above, various ARIMA(p,d,q) combinations and their respective AIC values were recorded as shown below:
ARIMA(p,d,q) AIC
(1,1,1) -5301.81
(1,1,4) -5300.65
(1,1,18) -5292.12
(2,1,1) -5304.67
(2,1,4) -5301.19
(2,1,18) -5299.11
(3,1,4) -5302.7
(3,1,18) -5295.8
(3,1,1) -5306.57
The study chose the model with the least AIC. From the table above, this is ARIMA(3,1,1), with an AIC of -5306.57.
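The selection rule can be sketched in Python using the AIC values from the table above. The variable names are illustrative only.

```python
# AIC values for each candidate (p, d, q), copied from the table above.
aic_by_order = {
    (1, 1, 1): -5301.81,
    (1, 1, 4): -5300.65,
    (1, 1, 18): -5292.12,
    (2, 1, 1): -5304.67,
    (2, 1, 4): -5301.19,
    (2, 1, 18): -5299.11,
    (3, 1, 4): -5302.7,
    (3, 1, 18): -5295.8,
    (3, 1, 1): -5306.57,
}

# The preferred model is the one with the smallest (most negative) AIC.
best_order = min(aic_by_order, key=aic_by_order.get)
best_aic = aic_by_order[best_order]
```

Because all the AIC values are negative, the least AIC is the one furthest below zero.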
The fitted ARIMA(3,1,1) model is displayed in the following output.
DATA Prices (log return price); 937 data points
MODEL ARIMA(3,1,1)
TERM COEFFICIENT STANDARD ERROR
AR1 1.1200 0.0519
AR2 -0.2399 0.0497
AR3 0.0018 0.0351
MA1 -0.9194 0.0403
Drift 0.0153 0.0043
sigma^2 estimated 0.03702
log likelihood 217
AIC -421.99
AICc -421.9
BIC -392.94
The fitted ARIMA(3,1,1) model becomes:
R_t = 1.12R_(t-1) - 0.2399R_(t-2) + 0.0018R_(t-3) - 0.9194e_(t-1) + e_t,
where e_t is the error term at time t.
5.1 VALIDATION OF THE MODEL
To test the fit of the model, we employed the following tests:
5.1.1 LJUNG-BOX STATISTIC
The model fit was evaluated using the Ljung-Box test as follows.
The hypotheses:
H_0: The model fits the data well.
H_a: The model does not fit the data well.
TEST LJUNG BOX
DATA RESID
X-Squared(chi-square) 1.1307e-06
Degrees of Freedom 1
p-value 0.9992
Since the p-value is greater than 0.05, the study failed to reject the null hypothesis; hence our model fits the data well. The adequacy of the model was also checked using the plot of the ACF of the residuals.
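The link between the reported chi-square statistic and its p-value can be checked with a short Python sketch. The Ljung-Box formula below is the standard one; the p-value helper relies on the fact that, for one degree of freedom, the upper-tail chi-square probability equals erfc(sqrt(Q/2)). The function names are hypothetical, and this is not the study's own code.

```python
import math

def ljung_box(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        num = sum((residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n))
        rho = num / denom  # residual autocorrelation at lag k
        total += rho ** 2 / (n - k)
    return n * (n + 2) * total

def chi_square_p_value_df1(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))

# Statistic reported in the table above (1 degree of freedom).
p_reported = chi_square_p_value_df1(1.1307e-06)
```

Applied to the reported statistic, the helper gives a p-value that rounds to the 0.9992 shown in the table.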
5.1.2 ACF OF THE RESIDUALS
The adequacy of the model was also tested by plotting the ACF of the residuals.
For our model to be adequate, there should be no significant serial correlation among the residuals. In addition, the residuals of the model should be white noise. The following was the plot:
The residuals are largely uncorrelated, with only a minor serial correlation at lag 1. The study also sought to validate the distribution of the error terms by first plotting their density, as displayed in the following figure:
Clearly, the distribution of the error term is symmetric about a mean of zero, which is a feature of a white noise error term. The normality of the error term was then checked, with the following results.
Clearly, the error term is normally distributed, consistent with Gaussian white noise. Since the error terms of the model are uncorrelated and normally distributed, i.e. they behave as white noise, our model is fit for forecasting.
5.2 FORECAST.
The study used the remaining 25% (312) of the data points for forecasting. The following is a sample of the first 30 forecasted values:
FORECASTED VALUES
19.94978, 19.92717, 19.90268, 19.88162, 19.86474, 19.85155, 19.84136, 19.83352,
19.82749, 19.82286, 19.81931, 19.81658, 19.81449, 19.81288, 19.81165, 19.81070,
19.80997, 19.80941, 19.80899, 19.80866, 19.80841, 19.80821, 19.80806, 19.80795,
19.80786, 19.80780, 19.80774, 19.80770, 19.80767, 19.80765
The first 30 actual prices and forecasted prices were compared as shown below:
ACTUAL PRICES FORECASTED PRICES
20.00 19.94978
20.00 19.92717
20.00 19.90268
19.95 19.88162
20.25 19.86474
20.25 19.85155
20.00 19.84136
19.90 19.83352
19.95 19.82749
19.95 19.82286
19.85 19.81931
19.75 19.81658
19.70 19.81449
19.70 19.81288
20.00 19.81165
20.00 19.81070
20.00 19.80997
19.95 19.80941
19.90 19.80899
19.85 19.80866
20.25 19.80841
20.75 19.80821
21.25 19.80806
21.25 19.80795
21.00 19.80786
20.75 19.80780
20.25 19.80774
20.75 19.80770
20.75 19.80767
20.25 19.80765
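The comparison in the table can also be summarised numerically. The Python sketch below computes the mean absolute error (MAE) and root mean squared error (RMSE) for the 30 pairs listed above. These two measures are an illustrative choice and are not reported in the study; the variable names are hypothetical.

```python
import math

# First 30 actual prices, copied from the table above.
actual = [
    20.00, 20.00, 20.00, 19.95, 20.25, 20.25, 20.00, 19.90, 19.95, 19.95,
    19.85, 19.75, 19.70, 19.70, 20.00, 20.00, 20.00, 19.95, 19.90, 19.85,
    20.25, 20.75, 21.25, 21.25, 21.00, 20.75, 20.25, 20.75, 20.75, 20.25,
]

# First 30 forecasted prices, copied from the list of forecasted values.
forecast = [
    19.94978, 19.92717, 19.90268, 19.88162, 19.86474, 19.85155, 19.84136,
    19.83352, 19.82749, 19.82286, 19.81931, 19.81658, 19.81449, 19.81288,
    19.81165, 19.81070, 19.80997, 19.80941, 19.80899, 19.80866, 19.80841,
    19.80821, 19.80806, 19.80795, 19.80786, 19.80780, 19.80774, 19.80770,
    19.80767, 19.80765,
]

errors = [a - f for a, f in zip(actual, forecast)]
mae = sum(abs(e) for e in errors) / len(errors)             # mean absolute error
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))  # root mean squared error
```

Smaller values of either measure indicate forecasts that track the actual prices more closely.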
We also plotted the forecasted prices against the actual prices, as displayed in the following graphs: