Sentiment Analysis of Texts in the Kazakh Language Using Machine Learning and Rule-based Methods
Date
2024
Authors
Publisher
Faculty of Engineering and Natural Science
Abstract
This thesis compares rule-based and machine learning (ML) approaches to sentiment analysis of Kazakh-language texts from several domains. Because Kazakh is linguistically complex and has limited natural language processing (NLP) resources, the rule-based methods are refined with edit distance algorithms to improve their accuracy and robustness. Both approaches are evaluated for performance, effectiveness, and adaptability on diverse datasets, including daily news, Kazakh literary texts, and Amazon product reviews, using accuracy, precision, recall, and computational efficiency as metrics. The rule-based methods effectively capture subtle emotional nuances in literary texts, while the ML models adapt well to the varied linguistic elements commonly found in news and consumer reviews. The rule-based approach nevertheless shows significant limitations, and the ML models prove more scalable and adaptable when applied across different datasets. Future research directions include extending the edit distance application to full-sentence analysis and integrating it with ML techniques to develop hybrid models. These advances are expected to improve both the precision and the applicability of sentiment analysis tools, benefiting Kazakh as well as other languages with complex linguistic structures. The work advances NLP capabilities for underrepresented languages and promotes the development of more inclusive language processing tools.
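As a rough illustration of how edit distance can refine a rule-based sentiment lexicon, the sketch below matches misspelled or variant tokens to the nearest lexicon entry using Levenshtein distance. It is not the thesis's implementation: the four-word lexicon, the function names, and the `max_distance` threshold are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical toy lexicon (assumption): +1 positive, -1 negative.
LEXICON = {
    "жақсы": 1,    # good
    "керемет": 1,  # wonderful
    "жаман": -1,   # bad
    "нашар": -1,   # poor
}


def lookup_polarity(token: str, max_distance: int = 1) -> int:
    """Return the polarity of the closest lexicon word, or 0 if none is close enough."""
    token = token.lower()
    if token in LEXICON:
        return LEXICON[token]
    best_word, best_distance = None, max_distance + 1
    for word in LEXICON:
        distance = levenshtein(token, word)
        if distance < best_distance:
            best_word, best_distance = word, distance
    return LEXICON[best_word] if best_word is not None else 0


def score_sentence(sentence: str) -> int:
    """Sum token polarities; whitespace tokenization is a simplification."""
    return sum(lookup_polarity(token) for token in sentence.split())
```

For example, `lookup_polarity("жаксы")` resolves the common misspelling (к typed for қ) to the positive entry "жақсы", while a neutral word such as "кітап" (book) stays at 0 because no lexicon entry lies within one edit.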
Citation
Amirkumar M / Sentiment Analysis of Texts in the Kazakh Language Using Machine Learning and Rule-based Methods / 2024 /