| Title | : | Text Categorization for Intellectual Property: Comparing Balanced Winnow with SVM on Different Document Representations |
| Researcher | : | Beuls, Katrien M.B. |
| Keywords | : | Text categorization, SVM, Winnow, Patent documents |
| Institution | : | Edinburgh Research Archive, United Kingdom |
| Collaborators | : | Hanbury, Allan, Clark, Rob |
| Year of publication | : | 2009 (B.E. 2552) |
| Reference | : | http://hdl.handle.net/1842/3612 |
| Source | : | - |
| Expertise | : | - |
| Relation | : | Joachims, T. (2002). Learning to Classify Text using Support Vector Machines. Kluwer., Koster, C. and Beney, J. (2007). On the importance of parameter tuning in text categorization. Perspectives of Systems Informatics, pages 270-283., Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34:1-47. |
| Scope of content | : | - |
| Abstract/Description | : | This study investigates the effect of training different categorization algorithms on various patent document representations. The automation of knowledge and content management in the intellectual property domain has attracted growing interest over the last decade [Cai and Hofmann, 2004, Fall et al., 2003, Koster et al., 2003, Krier and Zacca, 2002], since the first patent classification system was presented in 1999 by Larkey [Larkey, 1999]. Typical applications of patent classification systems are: (1) the automatic assignment of a new patent to the group of patent examiners concerned with the topic, (2) the search for prior art in fields similar to the incoming patent application and (3) the reclassification of patent specifications. Using machine learning techniques, a collection of 1,270,185 patents is used to build a classifier that can classify documents with feature spaces of varying size. The two algorithms compared are Balanced Winnow and Support Vector Machines (SVMs). A previous study [Zhang, 2000] found that Winnow achieves accuracy similar to that of SVM but is much faster, as the execution time for Winnow is linear in the number of terms and the number of classes. This earlier finding is tested on a feature space 100 times the size, using patent documents instead of newspaper articles. Results show that SVM outperforms Winnow considerably on all considered measures. Moreover, SVM is found to be a much more robust classifier than Winnow. The parameter tuning that was carried out for both algorithms confirms this result. |
| Bibliography | : |
Beuls, Katrien M.B. (2009). Text Categorization for Intellectual Property: Comparing Balanced Winnow with SVM on Different Document Representations. Bangkok: Edinburgh Research Archive, United Kingdom.
Beuls, Katrien M.B. 2009. "Text Categorization for Intellectual Property: Comparing Balanced Winnow with SVM on Different Document Representations". Bangkok: Edinburgh Research Archive, United Kingdom.
Beuls, Katrien M.B. "Text Categorization for Intellectual Property: Comparing Balanced Winnow with SVM on Different Document Representations." Bangkok: Edinburgh Research Archive, United Kingdom, 2009. Print.
Beuls, Katrien M.B. Text Categorization for Intellectual Property: Comparing Balanced Winnow with SVM on Different Document Representations. Bangkok: Edinburgh Research Archive, United Kingdom; 2009.
|
