A NEW COMPUTATIONAL MODEL FOR TURKIC LANGUAGES MORPHOLOGY AND PROCESSING

Ualsher Tukeyev

doi:10.26577/JPCSIT.2023.v1.i1.07

Authors

Ualsher Tukeyev Al-Farabi Kazakh National University, Almaty, Kazakhstan http://orcid.org/0000-0001-9878-981X

DOI:

https://doi.org/10.26577/JPCSIT.2023.v1.i1.07

Keywords:

Computational model, Turkic languages, Morphology, Endings, Natural language processing

Abstract

Effective communication between people of different nations has become a pressing problem in the modern globalized world. Artificial intelligence tools, and natural language processing components in particular, can contribute substantially to its solution. In this direction, this article proposes the development and use of a new computational morphology model for Turkic languages, based on a complete set of endings (the CSE-model). Based on the CSE-model of morphology, a methodology has been developed for creating and using universal, data-driven programs for natural language processing tasks, including word stemming, text segmentation and morphological analysis. One advantage of the proposed methodology is that it is oriented towards linguists, who only have to prepare (i) a list of the complete set of endings for a new language according to the described method, and (ii) a list of stop words that do not have endings. The universal programs for stemming, segmentation and morphological analysis are then applied using the prepared lists. Experiments carried out for the Kazakh, Kyrgyz and Uzbek languages show high efficiency of the proposed morphology model, algorithms and tools.
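The data-driven idea described in the abstract can be illustrated with a minimal sketch of a stemmer that is driven only by two lists: a list of endings and a list of stop words. This is not the author's implementation. The function name `stem`, the `min_stem_len` parameter, the longest-match strategy and the tiny Kazakh ending list below are all assumptions made for illustration; a real complete set of endings would be prepared by a linguist according to the method in the article.

```python
# Illustrative sketch only: a longest-match stemmer driven by two data lists.
# The ending list here is a tiny hand-picked sample, NOT a complete set of endings.

ENDINGS = ["тарда", "тар", "та", "да"]   # hypothetical sample of Kazakh endings
STOP_WORDS = {"және"}                    # hypothetical stop-word list ("and")


def stem(word, endings, stop_words, min_stem_len=2):
    """Return (stem, ending) by stripping the longest matching ending.

    Stop words are returned unchanged with an empty ending.
    min_stem_len is an assumed guard against stripping a whole short word.
    """
    w = word.lower()
    if w in stop_words:
        return w, ""
    # Try longer endings first so that a composite ending such as "тарда"
    # is preferred over its shorter suffix "да".
    for ending in sorted(endings, key=len, reverse=True):
        if w.endswith(ending) and len(w) - len(ending) >= min_stem_len:
            return w[: -len(ending)], ending
    return w, ""


if __name__ == "__main__":
    for token in ["кітаптарда", "кітаптар", "кітапта", "және"]:
        print(token, stem(token, ENDINGS, STOP_WORDS))
```

Because the program logic does not depend on any particular language, supporting another language in this sketch only requires swapping the two lists, which mirrors the division of labour between linguists and universal programs that the abstract describes.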


