A Systematic Mapping Study of Code-mixed Embedding Technique

Authors

  • Irfan Mubin Shukri, Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
  • Rohayanti Hassan, Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
  • Zalmiyah Zakaria, Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia

DOI:

https://doi.org/10.11113/ijic.v15n1.550

Keywords:

Embedding, Code-Mixed, Model, Language

Abstract

Code-mixing is the phenomenon of mixing two or more languages within a single conversation or text. This paper maps the existing research on code-mixed embedding techniques, with a focus on identifying research gaps, trends, performance, and methodological approaches. A comprehensive review of 44 peer-reviewed publications from 2016 to 2024 was conducted using online digital libraries. The selected studies were analyzed based on publication trends, language pairs, strengths, and limitations. The results show a growing number of publications on code-mixed embedding techniques, with character embedding and word embedding among the most common techniques and Hindi-English the most studied language pair. However, limitations remain, notably unoptimized models and limited coverage of low-resource languages. This mapping provides a structured overview of the field and offers direction for future research in developing more robust and inclusive code-mixed embedding techniques.

Published

2025-05-27

How to Cite

Shukri, I. M., Hassan, R., & Zakaria, Z. (2025). A Systematic Mapping Study of Code-mixed Embedding Technique. International Journal of Innovative Computing, 15(1), 119–129. https://doi.org/10.11113/ijic.v15n1.550

Section

Article