Search Results
Item Open Access
Analysis of data to improve system of an educational organization (2020) Serek A.
In this thesis, a set of computational experiments was carried out on datasets from educational organizations. In the first part of the work, many experiments were run with the ID3 decision tree algorithm on the educational camp dataset ("Educon") to automatically predict a participant's feedback using the Scikit-learn library; exploratory data analysis of the dataset was done using the Pandas library and the Python programming language. The experimental results showed that the optimal maximum depth of the decision tree (the number of edges from the root to a leaf) is 3 and the optimal minimum split size (the minimum number of samples of the dataset required to split an internal node of the decision tree) is 192. With these settings, the precision, recall, and F1 score vary between 75 and 98, depending on the tunable parameters of the ID3 algorithm. In the subsequent parts of the thesis, an information extraction system was built on the educational camp dataset and recommendations for hackathon improvement were derived. The datasets are not open-source and were collected manually through surveys.

Item Open Access
Predictive Analytics for Student Engagement in E-Learning Systems (SDU University, 2025) Murattaly Y.; Serek A.
To improve students' educational outcomes, it is important to be able to predict their level of engagement in the online educational environment. This study uses the open Open University Learning Analytics Dataset (OULAD) to develop a systematic and reproducible approach to classifying student engagement. By contrast, many other studies depend on specific datasets or limited definitions of engagement.
A full cycle of data preprocessing and feature extraction was implemented to obtain informative behavioral indicators from click data and assessment results. We trained and tested two traditional supervised machine learning models, Random Forest and Logistic Regression, and evaluated them with weighted- and macro-averaged metrics. The Random Forest model performed well across all interaction classes and achieved higher accuracy (0.926) than Logistic Regression (0.896). The results emphasize the importance of high-quality data preprocessing and thoughtful feature design. They also confirm that such features provide valuable information for building early warning systems and for advancing educational analytics in higher education institutions.
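
The abstract above reports both weighted- and macro-averaged metrics. As a rough illustration of how the two averages differ on imbalanced classes, the following minimal sketch computes both for the F1 score in plain Python. It is not the authors' code: the function names, the class labels "high" and "low", and the toy predictions are all hypothetical, and the data is not taken from OULAD.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label present in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true, y_pred):
    """Macro average: every class counts equally.
    Weighted average: every class counts in proportion to its support in y_true."""
    f1 = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    macro = sum(f1.values()) / len(f1)
    weighted = sum(f1[c] * support[c] for c in f1) / len(y_true)
    return macro, weighted


# Hypothetical toy labels (assumption): 8 "high" and 2 "low" engagement students,
# with one "low" student misclassified as "high".
y_true = ["high"] * 8 + ["low"] * 2
y_pred = ["high"] * 8 + ["high", "low"]

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

On this toy input the macro average is lower than the weighted average, because the single error falls in the minority class and the macro average gives that class the same weight as the majority class. Reporting both, as the abstract does, shows whether a model's performance holds up on less frequent classes.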