A Systematic Mapping Study of Code-mixed Embedding Technique
DOI:
https://doi.org/10.11113/ijic.v15n1.550
Keywords:
Embedding, Code-Mixed, Model, Language
Abstract
Code-mixing is the phenomenon of mixing two or more languages within a single conversation or text. This paper maps the existing research on code-mixed embedding techniques, with a focus on identifying research gaps, trends, performance, and methodological approaches. A comprehensive review of 44 peer-reviewed publications from 2016 to 2024 was conducted using online digital libraries. The selected studies were analyzed based on publication trends, language pairs, strengths, and limitations. The results show growing interest in code-mixed embedding techniques, as reflected in the publication trend. The studies cover several common techniques, such as character embedding and word embedding, and Hindi-English is the most studied language pair. However, limitations remain, notably unoptimized models and limited coverage of low-resource languages. This mapping provides a structured overview of the field and offers direction for future research in developing more robust and inclusive code-mixed embedding techniques.