IMP lab. publication database: Detail of Publication

Text Language	English
Authors	Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise
Title	Language-Aware Non-Autoregressive Khmer Textline Recognition Using Khmer Subword Model
Journal	Proc. International Conference on Pattern Recognition and Artificial Intelligence
Number of Pages	16 pages
Location	Jeju, Korea
Reviewed or not	Reviewed
Presentation type	Oral
Month & Year	July 2024
Abstract	Unlike the Latin script, Khmer does not use spaces between words, so text recognition is typically performed at the textline level. A textline can contain a vast number of characters, which results in high latency for a language-aware autoregressive (AR) decoder that generates one character at a time. A non-autoregressive (NAR) decoder, on the other hand, generates all characters in parallel, but it is not language-aware. In this paper, we introduce an efficient Khmer textline recognition method based on a NAR decoder that ensures low decoding latency while maintaining linguistic awareness. This is achieved by using a Khmer-specific subword model called Khmer character clusters (KCC), which captures the syntactic, morphological, and orthographic aspects of the Khmer script. Instead of conventional character-level recognition, the proposed method therefore recognizes all character clusters, or subwords, in parallel. The experimental results demonstrate that the proposed method outperforms the character-level baseline NAR model in recognition accuracy while maintaining the same low latency. Compared with the character-level baseline AR model, the proposed method achieves comparable or improved recognition accuracy with significantly lower latency. Compared with recent state-of-the-art (SOTA) NAR and AR Khmer text recognition methods, the proposed method achieves superior recognition performance.

Entry for BibTeX

@InProceedings{Buoy2024,
  author =	{Rina Buoy and Masakazu Iwamura and Sovila Srun and Koichi Kise},
  title =	{Language-Aware Non-Autoregressive Khmer Textline Recognition Using Khmer Subword Model},
  booktitle =	{Proc. International Conference on Pattern Recognition and Artificial Intelligence},
  year =	2024,
  month =	jul,
  numpages =	{16},
  location =	{Jeju, Korea}
}
