ANALYSIS OF MACHINE LEARNING MODELS BY SOLVING THE TEXT DATA CLASSIFICATION PROBLEM

A. V. Pchelin, N. A. Kononov, V. S. Serova, E. V. Bunova, A. D. Marchenko, A. E. Shevchenko

Abstract


The article studies the use of machine learning models for text classification, using the example of classifying technical support requests submitted through a mobile application's chat bot. The following methods were considered: the Naive Bayes classifier, the K-Nearest Neighbors (KNN) algorithm, Decision Tree, Random Forest, Support Vector Machines (SVM), and Logistic Regression, along with 21 models based on these methods. The best-performing models for classifying text requests to the technical support chat bot were the one based on Logistic Regression and the one based on the Random Forest Classifier. Among the trained models, the Complement Naive Bayes model from the Naive Bayes group had the shortest tuning time while retaining acceptable accuracy. The proposed methodology can be used to analyze and classify text data.
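To make the kind of model compared here more concrete, the following is a minimal sketch of a Complement Naive Bayes text classifier over bag-of-words counts, with the fitting time measured. It is an illustration only, not the authors' implementation: the toy requests, the labels (`login`, `billing`, `crash`), the whitespace tokenizer, and the class name `ComplementNBSketch` are all assumptions introduced for this example.

```python
import math
import time
from collections import Counter, defaultdict

# Hypothetical toy support requests and labels; not data from the study.
TRAIN = [
    ("cannot log in to my account", "login"),
    ("forgot my password and cannot log in", "login"),
    ("password reset link does not work", "login"),
    ("payment failed with my card", "billing"),
    ("card was charged twice for one payment", "billing"),
    ("refund for a duplicate charge", "billing"),
    ("app crashes when i open the camera", "crash"),
    ("the app freezes and crashes on start", "crash"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class ComplementNBSketch:
    """Complement Naive Bayes on word counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        # Word counts per class and over the whole training set.
        per_class = defaultdict(Counter)
        for text, label in zip(texts, labels):
            per_class[label].update(tokenize(text))
        total = Counter()
        for counts in per_class.values():
            total.update(counts)

        self.classes_ = sorted(per_class)
        self.vocab_ = set(total)
        self.weights_ = {}
        for cls in self.classes_:
            # Count of each word in every class except cls (the complement).
            comp = {w: total[w] - per_class[cls][w] for w in self.vocab_}
            denom = sum(comp.values()) + self.alpha * len(self.vocab_)
            self.weights_[cls] = {
                w: math.log((comp[w] + self.alpha) / denom) for w in self.vocab_
            }
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        counts = Counter(w for w in tokenize(text) if w in self.vocab_)

        def complement_score(cls):
            return sum(n * self.weights_[cls][w] for w, n in counts.items())

        # The predicted class is the one whose complement fits the text worst.
        return min(self.classes_, key=complement_score)


texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]

start = time.perf_counter()
model = ComplementNBSketch().fit(texts, labels)
fit_seconds = time.perf_counter() - start

print(model.predict("i forgot my password"))
print(f"fit time: {fit_seconds:.6f} s")
```

Fitting this kind of model is a single counting pass over the training texts, which is one plausible reason a Naive Bayes variant can be quick to tune; the measured time on a toy set like this says nothing about the timings reported in the study.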

Keywords


text classification; machine learning methods; regression; natural language; text data analysis.
