Application of the Polyhedral Conic Functions Method in the Text Classification and Comparative Analysis
Abstract
As the volume of online information has grown rapidly, interest in text categorization (classification) has grown with it. In text categorization, the goal is to assign documents to predefined classes (categories or labels). Many data mining methods have been applied to text classification in the literature, but polyhedral conic function (PCF) methods have not. In this paper, PCFs are used to classify documents. Separation algorithms based on PCFs, which involve linear programming subproblems with inequality constraints, are presented. Numerical experiments are carried out on real-world text datasets. The PCF methods are compared with state-of-the-art methods, and the tenfold cross-validation results, accuracy values, and running times are reported in tables. The results show that, in text classification, PCF methods are as effective as state-of-the-art methods in terms of accuracy.