Text Classification for AI Generated Content with Machine Learning and Deep Learning Models

dc.contributor.author: Batyr Sharimbayev
dc.contributor.author: Shirali Kadyrov
dc.date.accessioned: 2025-11-13T06:20:02Z
dc.date.available: 2025-11-13T06:20:02Z
dc.date.issued: 2025
dc.description.abstract: The rapid development of generative AI models such as GPT-4, LLaMA, and Gemini is producing an explosion of AI-generated text that can closely resemble human writing. This makes it difficult to distinguish AI-generated content from human-authored text in areas such as academic integrity, misinformation detection, and content moderation. This paper compares machine learning and deep learning models for classifying AI-generated text. We evaluate Logistic Regression with TF-IDF features, a Bi-LSTM model, and a fine-tuned DistilBERT model on data from the COLING Workshop on MGT Detection Task 1, involving text samples from five AI models and human authors. In our experiments, the Bi-LSTM outperforms the other models, achieving the best accuracy (90.09%) and F1-score (90.02%). We also report binary classification performance, distinguishing AI-generated text from human-written content, with an accuracy of 95.9%. The results suggest that deep learning methods are effective at detecting AI-generated text, although limitations remain, including adversarial attacks and the changing styles of AI-generated writing. Future work will focus on improving model robustness through adversarial training and hybrid architectures.
dc.identifier.citation: Batyr Sharimbayev, Shirali Kadyrov / Text Classification for AI Generated Content with Machine Learning and Deep Learning Models / IEEE 5th International Conference on Smart Information Systems and Technologies (SIST) / 2025
dc.identifier.uri: https://repository.sdu.edu.kz/handle/123456789/2186
dc.language.iso: en
dc.publisher: 5th International Conference on Smart Information Systems and Technologies (SIST)
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject: AI-generated text
dc.subject: text classification
dc.subject: deep learning
dc.subject: Bi-LSTM
dc.subject: DistilBERT
dc.subject: machine learning
dc.title: Text Classification for AI Generated Content with Machine Learning and Deep Learning Models
dc.type: Article

Files

Original bundle
Name: TextClassificationforAIGeneratedContentwithMachineLearningandDeepLearningModels_.pdf
Size: 474.77 KB
Format: Adobe Portable Document Format