SVMLight - University of Illinois at Chicago

SVMLight

SVMLight is an implementation of the Support Vector Machine (SVM) in C.
Download the source from: http://svmlight.joachims.org/
The site gives a detailed description of:
  What are the features of SVMLight?
  How to install it?
  How to use it?

Training Step

svm_learn [-option] train_file model_file

train_file contains the training data. Its filename and extension can be chosen arbitrarily by the user.
model_file contains the model that SVM builds from the training data.

Format of input file (training data)

For text classification, the training data is a collection of documents.
Each line represents a document: a label followed by feature:value pairs.
Each feature represents a term (word) in the document.
The label and each of the feature:value pairs are separated by a space character.
Feature:value pairs MUST be ordered by increasing feature number.
The feature value is a term weight, e.g., tf-idf.

Testing Step

svm_classify test_file model_file predictions

The format of test_file is exactly the same as that of train_file.
The feature values of the test data need to be scaled into the same range as those of the training data.
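The input format described above can be sketched in a few lines of Python. This is only an illustration: the helper name format_example and the dictionary representation of a document are assumptions, not part of SVMLight itself.

```python
def format_example(label, features):
    """Return one SVMLight input line for a single document.

    label:    the class label, +1 or -1
    features: dict mapping feature number -> feature value (e.g. tf-idf)
    """
    # Pairs must be ordered by increasing feature number.
    pairs = sorted(features.items())
    # The label and each feature:value pair are separated by one space.
    parts = [str(label)] + [f"{fid}:{value}" for fid, value in pairs]
    return " ".join(parts)


# Features may be given in any order; the helper sorts them.
print(format_example(1, {209: 0.2, 101: 0.2, 304: 0.2, 205: 4}))
# prints: 1 101:0.2 205:4 209:0.2 304:0.2
```

Writing one such line per document produces a file that can serve as train_file or test_file.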

We use the model built from the training data to classify the test data, and compare the predictions with the original label of each test document.

Example

In test_file, we have:
1 101:0.2 205:4 209:0.2 304:0.2
-1 202:0.1 203:0.1 208:0.1 209:0.3

After running svm_classify, the predictions may be:
1.045
-0.987

This means the classifier classifies both documents correctly: the sign of each prediction matches the label of the document.

Or the predictions may be:
1.045
0.987

This means the first document is classified correctly but the second one is classified incorrectly.

Confusion Matrix

a is the number of correct predictions that an instance is negative;
b is the number of incorrect predictions that an instance is positive;
c is the number of incorrect predictions that an instance is negative;
d is the number of correct predictions that an instance is positive.

                     Predicted
                     negative    positive
Actual    negative   a           b
          positive   c           d

Evaluations of Performance

Accuracy (AC) is the proportion of the total number of predictions that were correct.
AC = (a + d) / (a + b + c + d)

Recall (R) is the proportion of positive cases that were correctly identified.
R = d / (c + d), where c + d is the number of actual positive cases.

Precision (P) is the proportion of the predicted positive cases that were correct.

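P = d / (b + d), where b + d is the number of predicted positive cases.

Example

Actual test cases: 550 "+" and 450 "-".
Predicted:
  of the 550 "+" cases, 530 are predicted "+" and 20 are predicted "-";
  of the 450 "-" cases, 50 are predicted "+" and 400 are predicted "-".

For this classifier:
a = 400, b = 50, c = 20, d = 530

Accuracy = (a + d) / (a + b + c + d) = (400 + 530) / 1000 = 93%
Precision = d / (b + d) = 530 / 580 = 91.4%
Recall = d / (c + d) = 530 / 550 = 96.4%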
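The evaluation above can be reproduced with a short Python sketch. The function names confusion_counts and evaluate are hypothetical, and the sketch assumes that labels are +1 or -1 and that a positive prediction value means the document is classified as positive.

```python
def confusion_counts(labels, scores):
    """Count a, b, c, d from true labels (+1/-1) and prediction values."""
    a = b = c = d = 0
    for label, score in zip(labels, scores):
        # Assumption: the sign of the prediction value decides the class.
        predicted_positive = score > 0
        if label < 0 and not predicted_positive:
            a += 1  # actual negative, predicted negative
        elif label < 0 and predicted_positive:
            b += 1  # actual negative, predicted positive
        elif label > 0 and not predicted_positive:
            c += 1  # actual positive, predicted negative
        elif label > 0 and predicted_positive:
            d += 1  # actual positive, predicted positive
    return a, b, c, d


def evaluate(a, b, c, d):
    """Return (accuracy, precision, recall) for the counts a, b, c, d.

    Assumes b + d > 0 and c + d > 0, otherwise the ratios are undefined.
    """
    accuracy = (a + d) / (a + b + c + d)
    precision = d / (b + d)
    recall = d / (c + d)
    return accuracy, precision, recall


# The two-document example: both documents classified correctly.
print(confusion_counts([1, -1], [1.045, -0.987]))  # prints: (1, 0, 0, 1)

# The 1000-document example.
acc, prec, rec = evaluate(400, 50, 20, 530)
print(f"{acc:.1%} {prec:.1%} {rec:.1%}")  # prints: 93.0% 91.4% 96.4%
```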
