Using support vector machine with a hybrid feature selection method to the stock trend prediction
Ming-Chi Lee, Expert Systems with Applications, 2009
Presenter: Yu Hsiang Huang
Date: 2012-05-17
Outline
• Introduction
• Feature selection
• Research design
• Experimental results and analysis
• Conclusion
Introduction
• Stock market
  – A highly nonlinear dynamic system
• Applications of AI
  – Expert systems, fuzzy systems, neural networks
  – Back-propagation neural network (BPNN)
    • Its predictive power is better than that of the other methods
    • Requires a large amount of training data to estimate the distribution of the input patterns
    • Prone to over-fitting by nature
    • Fully depends on the researcher's experience and knowledge to preprocess the data: relevant input variables, hidden layer size, learning rate, momentum, etc.
Introduction
• In this paper
  – Support vector machine (SVM)
    • Captures geometric characteristics of the feature space without deriving network weights from the training data
    • Extracts the optimal solution even with a small training set
    • Reaches a global optimum rather than a local one
    • Avoids over-fitting
    • Classification performance is influenced by the dimension, i.e. the number of feature variables
  – Feature selection
    • Addresses the dimensionality-reduction problem by determining the subset of available features that is most essential for classification
    • Hybrid feature selection: filter method + wrapper method → F_SSFS
    • F_SSFS: F-score + supported sequential forward search
    • Optimal parameter search
  – Compare the performance of BPNN and SVM
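The "optimal parameter search" step can be sketched as a standard cross-validated grid search over the RBF kernel's C and gamma. This is a minimal sketch: the grid values below are illustrative assumptions, not the search range the paper actually used.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def tune_svm(X, y, cv=5):
    """Pick (C, gamma) for an RBF-kernel SVM by cross-validated accuracy.

    The grid below is a hypothetical example; the paper searches the SVM
    parameters but does not prescribe this exact grid.
    """
    param_grid = {"C": [1.0, 10.0, 100.0], "gamma": [0.01, 0.1, 1.0]}
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=cv)
    search.fit(X, y)
    # best_estimator_ is refit on all of (X, y) with the winning parameters
    return search.best_estimator_, search.best_score_
```

Because SVM training solves a convex optimization problem, each grid point yields the global optimum for that parameter pair, which is the "global vs. local optimum" advantage the slide contrasts with BPNN.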
SVM-based model with F_SSFS
• Original feature variables
• Hybrid feature selection
  – Filter part: feature pruning using the F-score → pre-selected features
  – Wrapper part: the SSFS algorithm finds the best feature variables → best feature variables
• Data + SVM: training, testing, and evaluating the classification accuracy
Feature selection
• Filter method
  – No feedback from the classifier
  – Estimates the classification performance by indirect assessments
    • Distance: reflects how well the classes separate from each other
Feature selection
• F-score and supported sequential forward search (F_SSFS)
  – F-score (filter part): original feature variables → calculate the F-score of each feature → sort by F-score → select the top-K features
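The filter step above can be sketched directly: the F-score of a feature compares the between-class scatter of its class means against its within-class variance, so a higher score means better class separation. A minimal sketch, assuming binary +1/−1 labels as in the paper's setup:

```python
import numpy as np

def f_score(X, y):
    """F-score of each feature for a binary-labelled sample matrix.

    X: (n_samples, n_features); y: array of +1 / -1 labels.
    Higher F-score means the feature separates the two classes better.
    """
    pos, neg = X[y == 1], X[y == -1]
    mean_all = X.mean(axis=0)
    # numerator: how far each class mean sits from the overall mean
    num = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    # denominator: within-class variance (unbiased) summed over both classes
    den = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return num / den

def top_k_features(X, y, k):
    """Indices of the K features with the largest F-score (the filter output)."""
    return np.argsort(f_score(X, y))[::-1][:k]
```

The selected indices become the "pre-selected features" handed to the wrapper part.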
Feature selection
• Wrapper method
  – Classifier-dependent: feedback from the classifier
    • Evaluates the "goodness" of the selected feature subset directly from the classifier
    • Should intuitively yield better performance
  – Has limited applications
    • Due to the high computational complexity involved
Feature selection
• F-score and supported sequential forward search (F_SSFS)
  – Supported sequential forward search (SSFS)
    • Plays the role of the wrapper
    • A variation of the sequential forward search (SFS) algorithm, specially tailored to SVM to expedite the feature search
    • Support vectors: training samples other than the support vectors contribute nothing to determining the decision boundary
    • Dynamically maintains an active subset as the candidate support vectors
    • Trains the SVM on this reduced subset rather than on the entire training set, at less computational cost
Feature selection
• F-score and supported sequential forward search (F_SSFS)
  – Supported sequential forward search (SSFS)
    [Schematic of the training matrix: samples r1…rN over features f1…fk, each with a +/− label]
Feature selection
• F-score and supported sequential forward search (F_SSFS)
  – Supported sequential forward search (SSFS)
    [Flowchart of the SSFS loop: iteration 1 → iteration n+1 → termination]
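The wrapper loop can be sketched as a plain sequential forward search: greedily add the feature that most improves the classifier's cross-validated accuracy, and terminate when no remaining feature helps. Note this sketch omits the "supported" part of SSFS, i.e. the paper's trick of training only on a maintained candidate support-vector subset; the function and parameter names are my own.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def sequential_forward_search(X, y, candidate_idx, cv=5):
    """Greedy SFS skeleton: grow the feature subset one feature at a time,
    scoring each candidate by cross-validated SVM accuracy."""
    selected, best_acc = [], 0.0
    remaining = list(candidate_idx)
    while remaining:
        # score every remaining candidate when added to the current subset
        trial = {f: cross_val_score(SVC(kernel="rbf"),
                                    X[:, selected + [f]], y, cv=cv).mean()
                 for f in remaining}
        best_f = max(trial, key=trial.get)
        if trial[best_f] <= best_acc:
            break  # termination: no remaining feature improves accuracy
        selected.append(best_f)
        best_acc = trial[best_f]
        remaining.remove(best_f)
    return selected, best_acc
```

In F_SSFS, `candidate_idx` would be the K features surviving the F-score filter, which is what keeps this loop affordable.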
Feature selection
• F-score and supported sequential forward search (F_SSFS)
  – F_SSFS
    • Uses the F-score measure to pre-select candidate feature subsets
    • Uses the SSFS algorithm to select the final best feature subset
    • Reduces the number of features that have to be tested through SVM training
    • Reduces the unnecessary computation the wrapper would otherwise spend testing non-informative features
Research design
• Data collection and preprocessing
  – Prediction target: the direction of change of the daily NASDAQ index
  – Index futures lead the spot index
  – 30 technical indices serve as the full feature set: 20 futures contracts, 9 spot indexes, and the 1-day-lagged NASDAQ index
  – Labels "1" and "-1" denote whether the next day's index is higher or lower than today's
  – From Nov 8, 2001 to Nov 8, 2007, with 1065 observations per feature
  – The original data are scaled into the range (0, 1)
    [Schematic of the data matrix: 1065 samples over features f1…f30, each with a 1 / -1 label]
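The two preprocessing steps above (direction labels and min-max scaling) can be sketched as follows; function names are my own, not the paper's.

```python
import numpy as np

def make_labels(close):
    """+1 if the next day's index closes higher than today's, else -1.

    The last day has no next-day value, so it yields no label and the
    feature row for that day would be dropped.
    """
    diff = np.diff(close)
    return np.where(diff > 0, 1, -1)

def minmax_scale(X):
    """Scale each feature column into [0, 1], as in the paper's preprocessing."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)
```

For example, closes of 100, 101, 100.5, 102 yield labels 1, -1, 1. In practice the scaling bounds should be taken from the training period only, to avoid leaking test-set information.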
SVM-based model with F_SSFS
• Original feature variables
• Hybrid feature selection
  – Filter part: feature pruning using the F-score → pre-selected K features
  – Wrapper part: the SSFS algorithm finds the best feature variables → best feature variables
• Data + SVM: training, testing, and evaluating the classification accuracy
Experimental results and analysis
• Experimental result of F_SSFS
  – The threshold K determines how many features are kept after filtering
    • If K equals the number of original features → the filter part does not contribute at all
    • If K equals 1 → the wrapper method is unnecessary
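The trade-off governed by K can be probed empirically: keep the top-K features by a univariate score, then measure the cross-validated SVM accuracy on that subset. As a stand-in for the paper's F-score, this sketch uses scikit-learn's ANOVA F-statistic (`f_classif`), which ranks features in the same spirit; the helper name is my own.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def filter_then_evaluate(X, y, k, cv=5):
    """Keep the top-K features by a univariate F statistic (a stand-in for
    the paper's F-score), then report cross-validated SVM accuracy on them."""
    keep = SelectKBest(f_classif, k=k).fit(X, y).get_support(indices=True)
    acc = cross_val_score(SVC(kernel="rbf"), X[:, keep], y, cv=cv).mean()
    return keep, acc
```

Sweeping `k` over 1 … n_features and plotting the accuracies is one way to pick a threshold such as the paper's K = 22 before handing the survivors to the wrapper.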
Experimental results and analysis
• Experimental result of F_SSFS – wrapper part
  – With K = 22, after the wrapper stage 17 features remain, with an average accuracy of 81.7%
Experimental results and analysis
• Experimental result of SVM
• Experimental result of BPNN
Experimental results and analysis
• Experimental result of feature selection
  – Key deficiency of neural-network models for stock trend prediction
    • Difficulty in selecting the discriminative features and explaining the rationale for the prediction
  – Relative importance of each feature
Conclusion
• Stock trend prediction with a support vector machine and a hybrid feature selection method (F_SSFS)
• Reduces the high computational cost and the risk of over-fitting
• Future work
  – Investigate how to determine the optimal SVM parameter values for the best prediction performance
  – Study the generalization of SVM with respect to training set size and give a guideline for measuring generalization performance
