Text Classification for AI Generated Content with Machine Learning and Deep Learning Models

Batyr Sharimbayev; Shirali Kadyrov

Files

TextClassificationforAIGeneratedContentwithMachineLearningandDeepLearningModels_.pdf (474.77 KB)

Date

2025

Authors

Batyr Sharimbayev

Shirali Kadyrov

Publisher

5th International Conference on Smart Information Systems and Technologies (SIST)

Abstract

The rapid development of generative AI models such as GPT-4, LLaMA, and Gemini has led to an explosion of AI-generated text that can closely resemble human writing. This makes it difficult to distinguish AI-generated content from human-authored text in areas such as academic integrity, misinformation detection, and content moderation. This paper compares machine learning and deep learning models for classifying AI-generated text. We evaluate Logistic Regression with TF-IDF features, a Bi-LSTM model, and a fine-tuned DistilBERT model on data from the COLING Workshop on MGT Detection Task 1, which contains text samples from five AI models and human authors. In our experiments, Bi-LSTM outperformed the other models, achieving the best accuracy (90.09%) and F1-score (90.02%). We also report binary classification performance for distinguishing AI-generated text from human-written content, with an accuracy of 95.9%. The results suggest that deep learning methods are effective at detecting AI-generated text, although limitations remain, including adversarial attacks and the changing styles of AI-generated writing. Future work will focus on improving model robustness through adversarial training and hybrid architectures.
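
The abstract reports accuracy and F1-score for the multi-class task and a separate accuracy for the binary AI-versus-human task. As a minimal sketch of how such figures can be computed, the Python below scores a toy set of predictions both ways. Everything in it is an assumption for illustration: the label names (`human`, `model_a`, `model_b`), the toy data, the use of macro averaging for F1, and the choice to derive binary labels by collapsing multi-class predictions are not taken from the paper, which may have used a different averaging scheme or a separately trained binary classifier.

```python
from collections import Counter

# Hypothetical label name; the abstract only says "five AI models and human authors".
HUMAN = "human"


def to_binary(label):
    """Collapse every generator label into a single 'ai' class."""
    return HUMAN if label == HUMAN else "ai"


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented purely for illustration.
y_true = ["human", "model_a", "model_b", "model_a", "human", "model_b"]
y_pred = ["human", "model_b", "model_b", "model_a", "model_a", "model_b"]

multi_acc = accuracy(y_true, y_pred)
multi_f1 = macro_f1(y_true, y_pred)

# Binary view: confusing one generator with another no longer counts as an error.
bin_true = [to_binary(label) for label in y_true]
bin_pred = [to_binary(label) for label in y_pred]
binary_acc = accuracy(bin_true, bin_pred)

print(f"multi-class accuracy: {multi_acc:.3f}")
print(f"multi-class macro-F1: {multi_f1:.3f}")
print(f"binary accuracy:      {binary_acc:.3f}")
```

On this toy data the binary accuracy is higher than the multi-class accuracy, because a sample attributed to the wrong AI model is still correctly flagged as AI-generated. This is one plausible reason a binary score can exceed a multi-class score on the same data, offered here as an illustration rather than as the paper's own explanation.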

Keywords

AI-generated text, text classification, deep learning, Bi-LSTM, DistilBERT, machine learning

Citation

Batyr Sharimbayev, Shirali Kadyrov / Text Classification for AI Generated Content with Machine Learning and Deep Learning Models / IEEE 5th International Conference on Smart Information Systems and Technologies (SIST) / 2025

URI

https://repository.sdu.edu.kz/handle/123456789/2186

Collections

3. Articles and Papers
