SDU Repository :: Browsing by Author "Bogdanchikov A."



Open Access
A novel recommender system for adapting single machine problems to distributed systems within MapReduce
(Bulletin of Electrical Engineering and Informatics, 2024) Orynbekova K.; Kadyrov Sh.; Bogdanchikov A.; Oktamov S.
This research introduces a novel recommender system for adapting single-machine problems to distributed systems within the MapReduce (MR) framework, integrating knowledge and text-based approaches. Categorizing common problems by five MR categories, the study develops and tests a tutorial with promising results. Expanding the dataset, machine learning models recommend solutions for distributed systems. Results demonstrate the logistic regression model's effectiveness, with a hybrid approach showing adaptability. The study contributes to advancing the adaptation of single-machine problems to distributed systems within MR, presenting a novel framework for tailored recommendations, thereby enhancing scalability and efficiency in data processing workflows. Additionally, it fosters innovation in distributed computing paradigms.
Open Access
Classification of Scientific Documents in the Kazakh Language Using Deep Neural Networks and a Fusion of Images and Text
(MDPI, 2022) Bogdanchikov A.; Ayazbayev D.; Varlamis I.
The rapid development of natural language processing and deep learning techniques has boosted the performance of related algorithms in several linguistic and text mining tasks. Consequently, applications such as opinion mining, fake news detection or document classification that assign documents to predefined categories have significantly benefited from pre-trained language models, word or sentence embeddings, linguistic corpora, knowledge graphs and other resources that are in abundance for the more popular languages (e.g., English, Chinese, etc.). Less represented languages, such as the Kazakh language, Balkan languages, etc., still lack the necessary linguistic resources and thus the performance of the respective methods is still low. In this work, we develop a model that classifies scientific papers written in the Kazakh language using both text and image information and demonstrate that this fusion of information can be beneficial for languages that have limited resources for training machine learning models. With this fusion, we improve the classification accuracy by 4.4499% compared to the models that use only text or only image information. The successful use of the proposed method in scientific documents’ classification paves the way for more complex classification models and more applications in other domains such as news classification, sentiment analysis, etc., in the Kazakh language.
Open Access
Defining Semantically Close Words of Kazakh Language with Distributed System Apache Spark
(MDPI, 2023) Ayazbayev D.; Bogdanchikov A.; Orynbekova K.; Varlamis I.
This work focuses on determining semantically close words and using semantic similarity in general in order to improve performance in information retrieval tasks. The semantic similarity of words is an important task with many applications from information retrieval to spell checking or even document clustering and classification. Although, in languages with rich linguistic resources, the methods and tools for this task are well established, some languages do not have such tools. The first step in our experiment is to represent the words in a collection in a vector form and then define the semantic similarity of the terms using a vector similarity method. In order to tame the complexity of the task, which relies on the number of word (and, consequently, of the vector) pairs that have to be combined in order to define the semantically closest word pairs, a distributed method that runs on Apache Spark is designed to reduce the calculation time by running comparison tasks in parallel. Three alternative implementations are proposed and tested using a list of target words and seeking the most semantically similar words from a lexicon for each one of them. In a second step, we employ pre-trained multilingual sentence transformers to capture the content semantics at a sentence level and a vector-based semantic index to accelerate the searches. The code is written in MapReduce, and the experiments and results show that the proposed methods can provide an interesting solution for finding similar words or texts in the Kazakh language.
Open Access
Face extraction and recognition from public images using HIPI
(ResearchGate, 2018) Bogdanchikov A.; Kariboz D.; Meraliyev M.
Social networking services with public data are widely used nowadays. Billions of images are uploaded to the internet each day around the world. This paper proposes the idea of a system which is currently being developed. The system collects images from public sources according to specific criteria, applies face detection and recognition algorithms to the collected images, and provides the results in readable form. The Apache Hadoop library is used to increase the performance of the system, and images are downloaded with the help of the HIPI library. For testing purposes, the Labeled Faces in the Wild benchmark, containing over 13233 images of 5749 identities, is used as the image database. Each image is in JPG format at 250*250 resolution. Results were more than good after testing the system with this database for face detection and recognition.
Open Access
FORECASTING OIL PRODUCTION USING LSTM NETWORKS CONFINED TO DECLINE
(СДУ хабаршысы - 2020, 2020) Zhumekeshov A.; Bogdanchikov A.
Abstract. Natural resources are limited and very important in our industrial life and development. Oil is considered the black gold and is used in hundreds of industrial fields. Therefore, forecasting future oil production performance is an important aspect for the oil industry. In this study, we propose improvements to the existing deep learning model in order to overcome limitations associated with the original model. For evaluation purposes, the proposed and original deep learning models were applied to real-case oil production data. The empirical results show that the proposed adjustments to the existing deep learning model achieve better forecasting accuracy.
Open Access
INTRODUCE PREDICTIVE ANALYTICS USING THE NEXT BEST ACTION (NBA) MODELS INTO THE BANKING SYSTEM
(СДУ хабаршысы - 2020, 2020) Dauylov S.; Bogdanchikov A.
Abstract. NBA is an approach in which each client is initially considered purely individually. It is closely related to predictive analysis. Predictive or prognostic analytics is a set of techniques and methods for analyzing data to build a forecast of future events. The banking system is currently using the method to obtain certain business results from its customers and has increased loyalty, increased income, found new growth points, etc. The classical model of marketing was rather different: it started from the existing product line and its parameters, whereas new models start from the customer’s inclination to purchase a particular product. The aim of this project is to investigate the field of deposit accounts of the banking system using the NBA approach and to show the benefits and possible outcomes. This approach was tested on various aspects of the banking system and yielded a number of solutions which can predict the probability of a customer opening a term deposit account.
Open Access
KAZAKH NAMES GENERATOR USING DEEP LEARNING
(ВЕСТНИК КАЗАХСТАНСКО-БРИТАНСКОГО ТЕХНИЧЕСКОГО УНИВЕРСИТЕТА, №4, 2020) Nurmambetov D.; Dauylov S.; Bogdanchikov A.
In recent years, sentiment analysis of e-mail messages and social media posts has become very popular. It can help people determine whether they are reading something positive or negative. At the same time, there are services on the Internet that can help you find or create a new name. When creating a name, they check it in other popular languages, so that the name does not mean something inappropriate in those languages. They charge 25 thousand US dollars for this. If there are such services, then there is a demand. In this study, sentiment analysis of e-mails was implemented using the StanfordNLP [1] lemmatizer and classic machine learning algorithms as a classifier. It is applied to real e-mails from a Russian-speaking mailbox, which means there are both English and Russian messages. Thus, language identification is also added as a preprocessing step. In this study only binary sentiment analysis was performed, but it can be improved by adding several emotions to be detected. Then another model generates Kazakh names using neural networks, where all Kazakh names data has been collected from various websites. The sentiment analysis model gives 81% accuracy, and the joint use of the two models allows us to generate new Kazakh names, which are checked against the Russian language to see whether they mean something inappropriate. The result can be improved by checking against other languages.
Open Access
MapReduce Solutions Classification by Their Implementation
(The International Journal of Engineering Pedagogy (iJEP), 2023) Orynbekova K.; Bogdanchikov A.; Cankurt S.; Adamov A.; Kadyrov Sh.
Distributed Systems are widely used in industrial projects and scientific research. The Apache Hadoop environment, which works on the MapReduce paradigm, lost popularity because new, modern tools were developed. For example, Apache Spark is preferred in some cases since it uses RAM resources to hold intermediate calculations; therefore, it works faster and is easier to use. In order to take full advantage of it, users must think about the MapReduce concept. In this paper, a usual solution and MapReduce solution of ten problems were compared by their pseudocodes and categorized into five groups. According to these groups’ descriptions and pseudocodes, readers can get a concept of MapReduce without taking specific courses. This paper proposes a five-category classification methodology to help distributed-system users learn the MapReduce paradigm fast. The proposed methodology is illustrated with ten tasks. Furthermore, statistical analysis is carried out to test if the proposed classification methodology affects learner performance. The results of this study indicate that the proposed model outperforms the traditional approach with statistical significance, as evidenced by a p-value of less than 0.05. The policy implication is that educational institutions and organizations could adopt the proposed classification methodology to help learners and employees acquire the necessary knowledge and skills to use distributed systems effectively.
Open Access
Python to learn programming
(ScieTech IOP Publishing, 2013) Bogdanchikov A.; Zhaparov M.; Suliyev R.
Today we have a lot of programming languages that can realize our needs, but the most important question is how to teach programming to beginner students. In this paper we suggest using Python for this purpose, because it is a programming language that has neatly organized syntax and powerful tools to solve any task. Moreover, it is very close to simple math thinking. Python is chosen as a primary programming language for freshmen in most leading universities. Writing code in Python is easy. In this paper we give some examples of program code written in Java, C++ and Python, and we make a comparison between them. Firstly, this paper presents the advantages of Python in relation to C++ and Java. Then it shows the results of a comparison of short programs written in the three languages, followed by a discussion on how students understand programming. Finally, experimental results of students’ success in programming courses are shown.
Open Access
SEGMENTATION OF OUTLETS USING MACHINE LEARNING CLUSTERING AND CLASSIFICATION ALGORITHMS
(СДУ хабаршысы - 2019, 2019) Nurmambetov D.; Bogdanchikov A.
Abstract. Segmentation of retail outlets in terms of manufacturing companies’ strategy applied in sales amount and trade activities for each of them is very important. Directed investments into outlets help companies to make more profit and decrease expenses. This study presents a method which can be used for outlets clustering using unsupervised and supervised machine learning algorithms comprising 2 steps – data partition using unsupervised Gaussian Mixture (GM) Model clustering algorithm based on outlets sales amount and further partition for one of them using Logistic Regression (LR) and Neural Networks (NN) classification algorithms, which predict whether outlets will achieve monthly sales plan. Previously, clustering was made without any special methods and clusters were formed using some agreed threshold values for outlets sales amount. The proposed algorithm was tested on real sales data and formed 3 clusters according to business needs. Sales plan achievement prediction gave up to 74% accuracy.
Open Access
STEMMING OF KAZAKH LANGUAGE
(Абай атындағы ҚазҰПУ-нің ХАБАРШЫСЫ, «Физика-математика ғылымдары» сериясы, №1(73), 2021) Bogdanchikov A.; Baimuratov O.A.; Ayazbayev D.A.
Nowadays natural language processing is widely used, for instance in text translation, search engines, and text topic identification. Such applications require text preprocessing, because preprocessing can influence system accuracy. Text preprocessing can be done in several ways. One approach is identifying the root of a word. An advantage of identifying word roots is that it can save computer memory, because repeated roots are stored only once. This paper describes stemming systems, which identify the root of a word. In the literature review, the authors review stemming algorithms that identify the roots of words in the Russian, Uzbek, and Turkish languages. The authors then propose a stemming system that identifies the roots of words in the Kazakh language and describe how the system works. To test the system, words from various parts of speech were entered. The proposed system can identify the roots of nouns, verbs, adjectives, and numerals. The system response can be seen in table 1. Pictures below show what kinds of suffixes and endings can be concatenated with the root of a word in the Kazakh language. However, not all combinations are shown in the pictures. The conclusion gives advice on how to develop a stemming system.
Open Access
TEACHING BIG DATA ANALYTICAL PLATFORMS IN HIGHER EDUCATION FOR GRADUATE DEGREE STUDENTS
(СДУ хабаршысы - 2018, 2018) Nurkey U.T.; Bogdanchikov A.
Abstract. An extensive archive of petabytes of data has been generated from modern information systems and digital technologies such as scientific data analysis, social data analysis, reference systems and Internet services journals. Much effort is needed to investigate and extract knowledge from this enormous amount of data. Due to this, Big Data Management Systems need to be integrated as part of the computing curriculum. In this article, we present examples of analysed tasks that can be processed as large data projects using Apache Hadoop and its MapReduce, Apache Spark, Hive and Pig, demonstrating how each type of system can be integrated via sample datasets and data analysis tasks. The aim of this paper is to show how Big Data analysis tools can be taught through sample tasks, their solutions and implementation.
Open Access
The effects of demographic factors on learners’ flow experience in gamified educational quizzes
(Smart Learning Environments, 2025) Issabek A.; Oliveira W.; Hamari J.; Bogdanchikov A.
In recent years, gamification has gained widespread adoption in education, aiming to increase students’ positive experiences (e.g., motivation, engagement, and flow state). However, the results of using gamification in education are still contradictory, challenging the community to comprehend the influence of individual factors on learners’ experiences within gamified educational systems. To tackle this challenge, this study explored how various demographic factors (i.e., gender, degree, individualism/collectivism, and masculinity/femininity) impact the flow experience of learners in a gamified educational quiz. A quantitative cross-cultural study involving 205 participants was conducted, utilizing partial least squares structural equation modeling to explore the influence of demographic factors on learners’ flow experience in the gamified educational quiz. The analysis revealed that age has a significant positive association with learners’ flow experience, while individualism has a negative association. These findings provide insights into educational technologies and gamification, offering a deeper understanding of how demographic factors shape learners’ flow experience in gamified educational environments.
