
Browsing by Author "Barlybay K."

    Item (Open Access)
    Question Answering system on Regulatory Documents
    (2023) Barlybay K.
    Legal text processing in the Kazakh language is currently underserved: the domain uses specialized language, and few computational resources are dedicated to it. This thesis addresses the need for an efficient model that can process, understand, and generate meaningful insights from Kazakh legal texts. It proposes a solution: developing and evaluating bespoke language models pre-trained on a large corpus of Kazakh legal documents. The study begins with the assembly of a corpus of over 315 million words from Kazakh legal texts, alongside a benchmark dataset of 2,500 multiple-choice questions for civil service examinations in Kazakhstan. Three language models based on the BERT architecture are then pre-trained, one of them entirely from scratch. To emulate a real-world application in the legal domain, the models are assessed on the multiple-choice question-answering task. The BERT base model pre-trained from scratch with both Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives achieves an accuracy of 56.11%. This result underlines the potential of custom pre-training strategies on domain-specific corpora for improving the performance of language models in specialized areas. In conclusion, this research advances the use of AI for legal text processing in the Kazakh language and presents a promising solution to the problem, paving the way for more efficient and informed decision-making in legal and civil service settings.
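    The accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes that some model has already assigned one score to each answer option of a question; the function names and the toy scores below are hypothetical and are not taken from the thesis or its benchmark.

```python
from typing import Sequence


def predict_choice(option_scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring answer option.

    Ties are resolved in favour of the earliest option.
    """
    if not option_scores:
        raise ValueError("a question needs at least one answer option")
    return max(range(len(option_scores)), key=lambda i: option_scores[i])


def multiple_choice_accuracy(
    all_scores: Sequence[Sequence[float]],
    gold: Sequence[int],
) -> float:
    """Fraction of questions whose top-scoring option is the gold answer."""
    if len(all_scores) != len(gold):
        raise ValueError("need exactly one gold label per question")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(
        predict_choice(scores) == answer
        for scores, answer in zip(all_scores, gold)
    )
    return correct / len(gold)


if __name__ == "__main__":
    # Hypothetical scores for three four-option questions (illustration only).
    toy_scores = [
        [0.1, 0.6, 0.2, 0.1],
        [0.4, 0.3, 0.2, 0.1],
        [0.2, 0.2, 0.5, 0.1],
    ]
    toy_gold = [1, 2, 2]
    print(f"accuracy = {multiple_choice_accuracy(toy_scores, toy_gold):.2%}")
```

    In a setting like the one described, the per-option scores would come from a fine-tuned BERT model rather than from hand-written lists; how the thesis produces those scores is not stated on this page.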

Copyright © 2023, All Rights Reserved, SDU University