Presented by: Richa Handa, Asst. Professor
Contents:
• Objective
• Introduction
• Intelligent Techniques for Prediction
• ANN Techniques
• Stock Data
• Feature Extraction
• Technical Indicators
• Feature Selection
• Framework for Stock Market Prediction
• Result and Analysis
• Wavelet Techniques
• De-noising Stock Data using SWT
• Proposed Model: Hybridization of SWT and ANN
• Result and Analysis
• Conclusion
• References
Objective In this research work a framework is designed for an optimal stock data prediction to develop an intelligent decision support system. This developed system remove the non linearity that exist in financial time series data using some feature extraction and selection. For De-noising the data of extracted features SWT is used. These extracted de-noised features are apply to model of ANN and data mining techniques is used to get the accurate prediction of stock price.
Introduction The stock market is a complex and dynamic system with noisy, non-stationary and chaotic data series.  Prediction of a financial market is more challenging due to chaos and uncertainty of the system. Soft computing techniques are progressively gaining presence in the financial world. This research work describes the application of Artificial Neural Network (ANN) for the prediction of Stock Market using some technical indicators.. A new model with ANN and SWT is purposed with ranking based feature selection technique. Stationary Wavelet Transform(SWT) is used for de noising the data. Hybrid of SWT and ANN is used for stock market prediction for better accuracy.
Intelligent Techniques for Prediction:
Intelligent techniques for stock market prediction
• Artificial Intelligence techniques (ANN)
• Wavelet techniques
  • Continuous Wavelet Transform (CWT)
  • Discrete Wavelet Transform (DWT)
  • Stationary Wavelet Transform (SWT)
• Hybrid technology
  • ANN
  • Wavelet Transform
ANN Techniques:
• Supervised learning: EBPN, RBFN
• Unsupervised learning: KSOM
Error Back Propagation Network (EBPN):
• Error back-propagation is a supervised learning method and a generalization of the delta rule.
• It requires a training set that pairs many inputs with their desired outputs.
• It is most useful for feed-forward networks.
[Figure: Architecture of EBPN]
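For reference, the delta rule and its generalization can be written as below. This is the standard textbook form, not notation taken from the slides: the symbols \eta (learning rate), t (target), o (unit output), net_j (weighted input to unit j) and f (activation function) are assumptions introduced here for illustration.

```latex
% Delta rule for a single linear unit with output y = \sum_i w_i x_i
\Delta w_i = \eta \,(t - y)\, x_i

% Generalized delta rule (error back-propagation) for unit j,
% where o_i is the output of unit i feeding unit j and net_j = \sum_i w_{ji} o_i
\Delta w_{ji} = \eta \,\delta_j\, o_i, \qquad
\delta_j =
\begin{cases}
(t_j - o_j)\, f'(\mathrm{net}_j) & \text{if } j \text{ is an output unit},\\[4pt]
f'(\mathrm{net}_j) \sum_k \delta_k\, w_{kj} & \text{if } j \text{ is a hidden unit}.
\end{cases}
```

With a linear activation (f' = 1) and no hidden layer, the second rule reduces to the first, which is the sense in which back-propagation generalizes the delta rule.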
Radial Basis Function (RBF) Neural Network:
• Radial basis functions are powerful techniques for interpolation in multidimensional space.
• An RBF is a function whose value depends on the distance of its input from a center.
[Figure: Architecture of RBFN]
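A minimal sketch of the distance criterion described above, assuming a Gaussian basis function (the slides do not say which basis function was used). The centers, width and output weights below are hypothetical values chosen only for illustration.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian radial basis function: depends only on the distance ||x - center||."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbfn_output(x, centers, width, weights, bias=0.0):
    """Output of a single-output RBF network: weighted sum of hidden-unit activations."""
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    return bias + sum(w * h for w, h in zip(weights, hidden))

centers = [(0.0, 0.0), (1.0, 1.0)]  # hypothetical hidden-unit centers
weights = [0.5, -0.25]              # hypothetical output-layer weights
y = rbfn_output((0.0, 0.0), centers, width=1.0, weights=weights)
```

Each hidden unit responds most strongly when the input sits on its center, and the response falls off as the distance grows.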
Stock Data
• The data used in this study consist of BSE30 data collected from the historical data available on the Yahoo Finance website.
• The raw data contain 6 features:
  • Date: the date of the stock market data.
  • Open: the opening value of the stock on a particular date.
  • Close: the closing value of the stock on a particular date.
  • Low: the lowest value of the stock on a particular date.
  • High: the highest value of the stock on a particular date.
  • Volume: the total number of units traded on a particular date.
• The dataset covers five years of data. The collected data are non-linear by nature, so preprocessing is applied to make the data smoother. The preprocessing uses technical indicators suggested by earlier researchers.
Sample of Data
Columns: MACD histogram, 10-day EMA, RSI, D%, ROC, MFI, %R, Close, OBV, AD, CHO, ATR, ADX, CCI, PPO, CMF
Values: 0.830 0.904 0.020 0.074 0.021 0.209 0.197 0.460 -0.469 0.041 -0.008 0.211 0.001 -0.008 0.000 -0.069 0.827 0.906 0.020 0.079 0.019 0.187 0.148 0.457 -0.710 0.057 0.009 0.207 0.001 -0.008 -0.001 -0.072 0.828 0.905 0.020 0.071 0.029 0.193 0.120 0.450 -0.471 0.074 0.000 0.248 0.001 -0.007 -0.001 -0.071 0.831 0.905 0.018 0.062 0.025 0.198 0.098 0.446 0.477 0.076 -0.004 0.284 0.001 -0.006 -0.001 -0.071 0.833 0.907 0.017 0.057 0.006 0.206 0.110 0.443 0.716 0.104 0.009 0.299 0.001 -0.004 -0.001 -0.072 0.833 0.916 0.018 0.062 0.023 0.202 0.183 0.445 0.472 0.212 0.027 0.329 0.001 -0.006 -0.002 -0.069 0.831 0.920 0.018 0.069 0.030 0.197 0.179 0.455 0.227 0.255 0.017 0.366 0.001 -0.007 -0.006 -0.067 0.831 0.925 0.015 0.080 0.051 0.228 0.211 0.455 0.474 0.155 -0.009 0.424 0.001 -0.007 0.004 -0.068 0.833 0.928 0.011 0.085 0.039 0.267 0.217 0.459 0.493 0.071 -0.010 0.467 0.001 -0.002 0.001 -0.065 0.838 0.928 0.013 0.100 0.037 0.271 0.238 0.459 0.246 0.447 0.074 0.496 0.001 -0.007 0.001 -0.070 0.841 0.932 0.012 0.104 0.035 0.263 0.438 0.459 0.496 0.801 0.081 0.559 0.001 -0.007 0.001 -0.070 0.841 0.936 0.012 0.119 0.013 0.259 0.296 0.465 0.748 0.533 -0.021 0.594 0.001 -0.006 0.001 -0.065 0.836
Feature Extraction: Feature extraction method is transformative: that is we are applying transformation to our data to project it into new feature space with lower dimension. It’s main task is to select or combine the features that preserve most of the information and remove the redundant components in order to improve the efficiency of the subsequent classifiers without degrading their performances.
Technical Indicators:
1. Exponential Moving Average (EMA)
2. Moving Average Convergence-Divergence (MACD)
3. Relative Strength Index (RSI)
4. Stochastic Oscillator (SO)
5. Rate of Change (ROC)
6. Money Flow Index (MFI)
7. Williams %R
8. Accumulation Distribution Line (A/D)
9. On Balance Volume (OBV)
10. Chaikin Oscillator (CHO)
11. Average True Range (ATR)
12. Average Directional Index (ADX)
13. Commodity Channel Index (CCI)
14. Chaikin Money Flow (CMF)
15. Percentage Price Oscillator (PPO)
16. Force Index (FI)
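As an illustration of how such indicators are derived from the raw price columns, the sketch below computes two of them, EMA and ROC, from a list of closing prices. The smoothing factor 2 / (period + 1) is the common convention; seeding the EMA with the first price and the sample prices themselves are assumptions, and the slides do not state the exact formulas or periods the author used.

```python
def ema(prices, period):
    """Exponential moving average with smoothing factor 2 / (period + 1).

    Seeding with the first price is an assumption; other seedings exist.
    """
    alpha = 2.0 / (period + 1)
    out = [float(prices[0])]
    for price in prices[1:]:
        out.append(alpha * price + (1.0 - alpha) * out[-1])
    return out

def roc(prices, period):
    """Rate of change in percent, defined from index `period` onward."""
    return [
        100.0 * (prices[i] - prices[i - period]) / prices[i - period]
        for i in range(period, len(prices))
    ]

closes = [100.0, 102.0, 101.0, 105.0, 107.0]  # made-up closing prices
ema_3 = ema(closes, 3)  # one value per input price
roc_1 = roc(closes, 1)  # one value fewer than the input
```

The remaining indicators in the list follow the same pattern: each is a deterministic function of the Open, High, Low, Close and Volume columns over a rolling window.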
Feature Selection Technique:
Feature selection is one of the essential steps in data mining. It relies mostly on machine learning to select a subset of features that improves the efficiency of prediction. Feature selection techniques automatically discover the most useful features and help to address the problem of having too much data.
Rank based FST
• Initial feature space: DATE, OPEN, CLOSE, LOW, VOLUME, HIGH
• Feature extraction: extracted features based on technical indicators: EMA, RSI, SO, ROC, MFI, %R, A/D, OBV, CHO, ATR, ADX, CCI, CMF, PPO, FI, MACD
• New feature space after applying FST: EMA, RSI, ROC, %R, CHO, ATR, CLOSE
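The slides do not state which score the rank-based method uses. The sketch below assumes one simple possibility, ranking each feature by the absolute Pearson correlation between its column and the target, and keeping the top-ranked ones. The function names and the toy data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0
    return cov / (std_x * std_y)

def rank_features(features, target, keep):
    """Rank features by |correlation with target| (an assumed criterion), keep the best."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]

# Toy columns, not real indicator values.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "EMA": [2.0, 4.0, 6.0, 8.0, 10.0],
    "RSI": [5.0, 4.0, 3.0, 2.0, 1.5],
    "FI": [1.0, -1.0, 1.0, -1.0, 1.0],
}
selected = rank_features(features, target, keep=2)
```

Any other per-feature score, such as information gain, could be substituted for the correlation without changing the ranking and truncation steps.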
Framework for Stock Market Prediction
Feature Extraction and Selection:
1. Stock Data
2. Feature Extraction
3. New Stock Data with Extracted Features
4. Data Normalization
5. Feature Selection Technique (Rank Based Method)
6. Stock Data with Reduced Feature Subset
Prediction:
7. Data Partitioning (Training, Testing)
8. ANN Model (EBPN, RBFN)
9. Stock Prediction, evaluated with MAE, RMSE and MAPE
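The normalization and evaluation steps of the framework can be sketched as follows. Min-max scaling is an assumption (the slides say only "Data Normalization"); MAE, RMSE and MAPE use their standard definitions, with MAPE expressed in percent.

```python
import math

def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0]     # made-up actual prices
predicted = [110.0, 180.0]  # made-up predictions
errors = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Because MAPE divides by the actual value, it is sensitive to the scale and offset of the data, so values computed on normalized data are not directly comparable with values computed on raw prices.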
Result and Analysis
ANN Technique   No. of Features Selected   MAPE    RMSE    MAE
EBPN            16                         5.514   0.036   0.026
EBPN            13                         6.085   0.036   0.026
EBPN            11                         6.110   0.036   0.028
EBPN            10                         5.979   0.034   0.025
EBPN            7                          5.485   0.343   0.025
RBFN            16                         8.260   0.012   0.008
RBFN            12                         9.037   0.014   0.009
RBFN            11                         7.663   0.014   0.008
RBFN            10                         6.750   0.137   0.007
RBFN            7                          5.902   0.013   0.007
[Figure: Comparative MAPE of EBPN and RBFN with reduced feature subset. x-axis: Features (16, 13, 11, 10, 7); y-axis: MAPE]
[Figure: Comparison of actual and predicted stock market values for EBPN]
[Figure: Comparison of actual and predicted stock market values for RBFN]
Wavelet Technique
• A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases, and then decreases back to zero.
• Wavelet analysis represents a signal in terms of shifted and scaled versions of a wavelet.
• The wavelet transform can provide information about both the time and frequency domains.
Types of Wavelet Transform
• CWT (Continuous Wavelet Transform)
• DWT (Discrete Wavelet Transform)
• SWT (Stationary Wavelet Transform)
De-noising of Stock Data using SWT
• Time series data are highly non-linear and noisy by nature, and this noise might degrade the quality of the discovered patterns.
• A MATLAB GUI tool is used to apply SWT to de-noise the stock data.
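The study itself used a MATLAB GUI tool, and the slides do not name the wavelet, decomposition level or threshold. Purely to illustrate the idea, the sketch below implements a one-level undecimated (stationary) Haar transform with periodic extension and soft thresholding of the detail coefficients. The wavelet, the single level, the threshold value and all function names are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_swt_level1(signal):
    """One level of an undecimated Haar transform with periodic extension.

    Returns approximation and detail coefficients, each as long as the input.
    """
    n = len(signal)
    approx = [(signal[i] + signal[(i + 1) % n]) / SQRT2 for i in range(n)]
    detail = [(signal[i] - signal[(i + 1) % n]) / SQRT2 for i in range(n)]
    return approx, detail

def haar_iswt_level1(approx, detail):
    """Inverse transform: average the two redundant reconstructions of each sample."""
    out = []
    for i in range(len(approx)):
        from_pair_i = (approx[i] + detail[i]) / SQRT2             # pair (i, i+1)
        from_pair_prev = (approx[i - 1] - detail[i - 1]) / SQRT2  # pair (i-1, i), wraps at 0
        out.append(0.5 * (from_pair_i + from_pair_prev))
    return out

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def swt_denoise(signal, threshold):
    """De-noise by thresholding the detail coefficients and reconstructing."""
    approx, detail = haar_swt_level1(signal)
    return haar_iswt_level1(approx, soft_threshold(detail, threshold))

noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # toy signal, not stock data
smoothed = swt_denoise(noisy, threshold=1.0)
```

Unlike the DWT, this transform does not downsample, so the coefficients stay aligned with the original time axis, which is convenient when the de-noised series is fed to a predictor sample by sample.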
Signal before preprocessing. Signal after preprocessing using SWT.
GUI for De-noising
Hybrid Model
Data Pre-Processing Stage:
1. Stock Data (Normalized)
2. Feature extraction
3. Selection of type of WT: SWT
4. Choosing level of decomposition
5. Select thresholding method
6. De-noised signals
Prediction Stage:
7. Feature selection
8. New feature subset
9. ANN Model
10. Stock Prediction, evaluated with MAPE, RMSE and MAE
Result Analysis
ANN Technique   No. of Features Selected   MAPE    RMSE    MAE
EBPN            16                         2.737   0.030   0.008
EBPN            14                         2.742   0.030   0.016
EBPN            13                         2.618   0.034   0.015
EBPN            11                         4.355   0.042   0.025
EBPN            10                         2.644   0.035   0.016
EBPN            9                          4.154   0.040   0.024
EBPN            8                          4.058   0.036   0.022
RBFN            16                         2.851   0.037   0.011
RBFN            14                         2.841   0.042   0.018
RBFN            13                         3.001   0.038   0.0175
RBFN            11                         2.640   0.034   0.015
RBFN            10                         3.257   0.042   0.019
RBFN            9                          3.993   0.048   0.024
RBFN            8                          2.796   0.039   0.017
[Figure: MAPE comparison of EBPN and RBFN on selected features of the original data and the de-noised data. Series: Normalized data + ANN, SWT + ANN; x-axis: number of features (13, 11, 10, 7)]
Conclusion
• ANN-based techniques learn patterns by mapping inputs to their corresponding outputs. If the input-output patterns contain noisy variations, the ANN may not map the pairs well.
• To overcome this problem, the input patterns need to be de-noised (noise removed from the pattern).
• A wavelet transform such as SWT may be the best alternative for this. SWT is used to de-noise the data with 16 extracted features, the data are applied to ANN with a ranking-based feature selection technique, and the proposed hybrid of SWT and ANN produces comparatively better results.
• The outcome of the research work is the hybridization of SWT and ANN, where SWT is used for data smoothing and ANN is used for prediction of stock data.
Thank you!

DEVELOPMENT OF INTELLIGENT PREDICTIVE MODEL FOR STOCK DATA PREDICTION WITH FEATURE EXTRACTION AND SELECTION
