A Systematic Mapping Study of Code-mixed Embedding Technique

Irfan Mubin Shukri; Rohayanti Hassan; Zalmiyah Zakaria

doi:10.11113/ijic.v15n1.550

Authors

Irfan Mubin Shukri Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
Rohayanti Hassan Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
Zalmiyah Zakaria Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia

Keywords:

Embedding, Code-Mixed, Model, Language

Abstract

Code-mixing is the phenomenon of mixing two or more languages within a single conversation or text. This paper maps the existing research on code-mixed embedding techniques, with a focus on identifying research gaps, trends, performance, and methodological approaches. A comprehensive review of 44 peer-reviewed publications from 2016 to 2024 was conducted using online digital libraries. The selected studies were analyzed based on publication trends, language pairs, strengths, and limitations. The results show growing publication interest in code-mixed embedding techniques, covering several common techniques such as character embedding and word embedding, with Hindi-English being the most studied language pair. However, limitations remain, notably unoptimized models and limited coverage of low-resource languages. This mapping provides a structured overview of the field and offers direction for future research in developing more robust and inclusive code-mixed embedding techniques.

IJIC