Deep Learning based Tamil Parts of Speech (POS) Tagger

Journal title

Bulletin of the Polish Academy of Sciences: Technical Sciences








Anbukkarasi, S. : Department of Computer Science and Engineering, Kongu Engineering College, India ; Varadhaganapathy, S. : Department of Information Technology, Kongu Engineering College, India



POS tagging ; deep learning model ; natural language processing ; Bi-LSTM

Divisions of PAS

Technical Sciences (Nauki Techniczne)










DOI: 10.24425/bpasts.2021.138820