FAST AND RELATIVELY ACCURATE SENTIMENT ANALYSIS FOR THE KAZAKH LANGUAGE

Date

2022

Publisher

2022 International Young Scholars' Conference

Abstract

This paper constructs a fast and accurate sentiment analysis model for the Kazakh language. Text classification relies on TF-IDF token features fed to a Logistic Regression classifier. Both the processing and the modeling stages are implemented entirely in the PySpark framework. The proposed method reaches an accuracy of 82% on an evenly distributed test dataset. As a byproduct of the work, we collected a list of Kazakh words that can signal whether a given review is negative or positive.
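
The approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the paper reports a PySpark implementation (where the comparable building blocks would be `Tokenizer`, `HashingTF`, `IDF`, and `LogisticRegression` from `pyspark.ml`), whereas the sketch below uses only the Python standard library so that it runs without a Spark installation. The four toy reviews, the whitespace tokenizer, the smoothed IDF formula, and the training hyperparameters are all assumptions made for illustration and do not come from the paper's dataset.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split; real Kazakh text needs more care.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed IDF (an assumed variant): log((N + 1) / (df + 1)) + 1
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((n + 1) / (count + 1)) + 1.0 for tok, count in df.items()}

def tfidf(doc, idf):
    # Term count times IDF; tokens unseen while fitting the IDF are dropped.
    tf = Counter(tok for tok in doc if tok in idf)
    return {tok: count * idf[tok] for tok, count in tf.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, weights, bias):
    # Linear score of a sparse feature dict.
    return bias + sum(weights.get(tok, 0.0) * val for tok, val in x.items())

def train_logreg(features, labels, epochs=200, lr=0.1):
    # Plain stochastic gradient descent on the logistic loss, no regularization.
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = sigmoid(score(x, weights, bias)) - y
            bias -= lr * err
            for tok, val in x.items():
                weights[tok] = weights.get(tok, 0.0) - lr * err * val
    return weights, bias

def predict(text, idf, weights, bias):
    # Returns 1 for positive, 0 for negative.
    x = tfidf(tokenize(text), idf)
    return 1 if sigmoid(score(x, weights, bias)) >= 0.5 else 0

# Hypothetical toy reviews (1 = positive, 0 = negative), invented for this sketch.
train = [
    ("өте жақсы тауар", 1),       # "very good product"
    ("керемет қызмет ұнады", 1),  # "wonderful service, liked it"
    ("өте жаман тауар", 0),       # "very bad product"
    ("нашар қызмет ұнамады", 0),  # "poor service, did not like it"
]

docs = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
weights, bias = train_logreg([tfidf(d, idf) for d in docs], labels)

# Words ranked from most negative to most positive learned weight.
ranked_words = sorted(weights, key=weights.get)
```

Ranking the vocabulary by learned weight, as `ranked_words` does, is one plausible way to obtain a list of words that signal negativity or positivity; the abstract does not state how the authors compiled theirs.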

Keywords

sentiment analysis, natural language processing, Kazakh language, 2022 International Young Scholars' Conference, №11

Citation

N. Manteyeva / FAST AND RELATIVELY ACCURATE SENTIMENT ANALYSIS FOR THE KAZAKH LANGUAGE / 2022 International Young Scholars' Conference