SVMC – SVM Classification models 

[Figure: Dialog window for SVM – Classification]
[Figure: Example of graphical output]
SVM models minimize a suitably defined error (the misclassification rate in classification, or the deviation in some metric in regression). For example, in a linearly separable two-dimensional classification task (two independent numerical variables and one two-level factor response variable assigning each case to one of two classes, such as “A” and “B”), we look for a line which separates (discriminates) the two classes while maximizing the distance of both classes from the separating line, thus generally minimizing the risk of misclassifying new data, see next figure. The fitted SVM model can then be used to predict the class from a given set of independent variable values, including the probability of each class.
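The idea of fitting a classifier and predicting classes with their probabilities can be sketched with scikit-learn's SVC (this library is an assumption for illustration; the original text describes the software's own SVM dialog, not this API):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, well-separated toy data: class "A" near (0, 0), class "B" near (2, 2).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (8, 2)),
               rng.normal(2.0, 0.3, (8, 2))])
y = np.array(["A"] * 8 + ["B"] * 8)

# probability=True enables class-probability estimates (Platt scaling).
model = SVC(kernel="linear", probability=True, random_state=0).fit(X, y)

print(model.predict([[0.0, 0.0], [2.0, 2.0]]))   # predicted classes
print(model.predict_proba([[0.0, 0.0]]))         # probability for each class
```

With separable data like this, the linear kernel recovers a separating line and new points are assigned to the nearer class; the probabilities for each point sum to one.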
In a nonseparable case (like that on the next figure), a line is sought that minimizes the misclassification “distance” of the incorrectly classified data. On the next figure the separating line minimizes the sum of the distances of one incorrectly classified point “A” and one incorrectly classified point “B” from the separating line, while maximizing the distance of the correctly classified data from the separating line.

In the case of separable data with a binary response (y_{i} = –1 or 1), the length of the normal vector w of the separating line (or, generally, the separating hyperplane) is minimized,

minimize (1/2) ||w||^2

subject to y_{i}(w·x_{i} + b) ≥ 1 for all i,

which maximizes the width of the gap between the two classes (lines H_{1} and H_{2} in the next figure). In the case of nonseparable data, a misclassification penalty term with a user-defined tuning “cost” parameter C is added,

minimize (1/2) ||w||^2 + C Σ_{i} ξ_{i}

subject to y_{i}(w·x_{i} + b) ≥ 1 – ξ_{i}, ξ_{i} ≥ 0,

where the slack variables ξ_{i} measure how far each point lies on the wrong side of its margin. The geometrical interpretation of the nonseparable case is illustrated on the next figure. The points that lie on (or “support”) the margin lines H_{1} and H_{2} are called “support vectors”, hence the name of the whole method. The support vectors are circled on the next figure.

Alternatively, instead of the cost coefficient C, a ratio ν (0 < ν < 1) may be employed,

minimize (1/2) ||w||^2 – νρ + (1/n) Σ_{i} ξ_{i}

subject to y_{i}(w·x_{i} + b) ≥ ρ – ξ_{i}, ξ_{i} ≥ 0, ρ ≥ 0,

where ν corresponds to an expected ratio of misclassified cases: it is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors.
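The roles of the cost parameter C and its ν alternative can be sketched with scikit-learn's SVC and NuSVC (again an assumption for illustration, not the software described; the data are hypothetical overlapping classes):

```python
import numpy as np
from sklearn.svm import SVC, NuSVC

# Hypothetical nonseparable data: two overlapping Gaussian clouds.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)),
               rng.normal(1.5, 1.0, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)   # binary response y_i = -1 or 1

# Small C: misclassification is cheap, the margin is wide,
# so many points end up on or inside the margin (many support vectors).
small_C = SVC(kernel="linear", C=0.01).fit(X, y)

# Large C: misclassification is heavily penalized, the margin narrows.
large_C = SVC(kernel="linear", C=100.0).fit(X, y)

# nu-formulation: nu bounds the fraction of margin errors from above
# and the fraction of support vectors from below.
nu_model = NuSVC(kernel="linear", nu=0.5).fit(X, y)

print(len(small_C.support_vectors_), len(large_C.support_vectors_))
print(len(nu_model.support_vectors_))
```

The fitted models expose the circled points of the figure directly as `support_vectors_`; decreasing C (or increasing ν) trades training accuracy for a wider, more robust separation zone.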

Last Updated ( 31.05.2013 ) 
