STEMMING OF KAZAKH LANGUAGE
Date
2021
Publisher
Bulletin of Abai KazNPU, "Physical and Mathematical Sciences" series (Абай атындағы ҚазҰПУ-нің ХАБАРШЫСЫ, «Физика-математика ғылымдары» сериясы), No. 1(73)
Abstract
Natural language processing is now widely used, for instance in machine translation, search engines, and text topic identification. Such applications require text preprocessing, because the quality of preprocessing can influence system accuracy. Text can be preprocessed in several ways; one approach is to identify the root of each word. An advantage of this approach is that it saves memory, because a root shared by many word forms is stored only once. This paper describes stemming systems, which identify the root of a word. In the literature review, the authors survey stemming algorithms for the Russian, Uzbek, and Turkish languages. The authors then propose a stemming system that identifies the roots of Kazakh words and describe how it works. To test the system, words from various parts of speech were entered. The proposed system can identify the roots of nouns, verbs, adjectives, and numerals. The system's responses are shown in Table 1. The figures in the paper show which kinds of suffixes and endings can be attached to the root of a Kazakh word; however, not all combinations are shown. The conclusion offers advice on how to develop a stemming system.
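To make the idea of root identification concrete, here is a minimal suffix-stripping sketch in Python. It is not the authors' system: the suffix list (a few plural and case endings), the `MIN_STEM_LENGTH` threshold, and the function names `stem` and `group_by_root` are all assumptions chosen for illustration, and a naive approach like this will over- or under-stem many real Kazakh words.

```python
from collections import defaultdict

# Illustrative, incomplete suffix list (an assumption, not taken from the paper):
# plural endings plus a few locative, dative, and ablative case endings.
SUFFIXES = [
    "лар", "лер", "дар", "дер", "тар", "тер",   # plural
    "дан", "ден", "тан", "тен", "нан", "нен",   # ablative
    "да", "де", "та", "те",                     # locative
    "ға", "ге", "қа", "ке",                     # dative
]
# Try longer suffixes first so that a 3-letter ending wins over a 2-letter one.
SUFFIXES.sort(key=len, reverse=True)

MIN_STEM_LENGTH = 2  # hypothetical guard against stripping a word down to nothing


def stem(word: str) -> str:
    """Repeatedly strip a known suffix from the end of the word."""
    word = word.lower()
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
                word = word[: -len(suffix)]
                stripped = True
                break
    return word


def group_by_root(words):
    """Store each root once and attach the word forms that share it."""
    index = defaultdict(list)
    for w in words:
        index[stem(w)].append(w)
    return dict(index)


forms = ["кітап", "кітаптар", "кітаптарда", "мектептерде"]
roots = group_by_root(forms)
```

In this sketch the three forms of "кітап" collapse under a single key, which illustrates the memory argument made in the abstract: the root is stored once, however many inflected forms appear in the text.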
Keywords
stemming, morphology, parts of speech
Citation
Bogdanchikov A, Baimuratov O.A, Ayazbayev D.A / STEMMING OF KAZAKH LANGUAGE / Абай атындағы ҚазҰПУ-нің ХАБАРШЫСЫ, «Физика-математика ғылымдары» сериясы, №1(73) / 2021