TamilATIS: Dataset for Task-Oriented Dialog in Tamil

Cited by: 0
Authors
Ramaneswaran, S. [1]
Vijay, Sanchit [1]
Srinivasan, Kathiravan [1]
Affiliations
[1] Vellore Inst Technol, Vellore, Tamil Nadu, India
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Task-Oriented Dialogue (TOD) systems allow users to accomplish tasks by giving directions to the system in natural language utterances. With the widespread adoption of conversational agents and chat platforms, TOD has become a mainstream topic in NLP research. However, developing TOD systems requires massive amounts of data, and little work has been done on TOD for low-resource languages like Tamil. To address this gap, we introduce TamilATIS, a TOD dataset for Tamil that contains 4874 utterances. We present a detailed account of the entire data collection and annotation process. We train state-of-the-art NLU models and report their performance. The Joint BERT model with XLM-RoBERTa as the utterance encoder achieved the highest scores, with an intent accuracy of 96.26% and a slot F1 of 94.01%.
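The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes BIO-tagged slot labels and span-level (exact-match) F1, which is the usual convention for ATIS-style benchmarks; the record does not state the exact scoring procedure used for TamilATIS. The function names and the toy labels below are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# (slot label, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect slot spans from one BIO-tagged utterance (assumed tagging scheme)."""
    spans: Set[Span] = set()
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # A span continues only on an I- tag that carries the same label.
        if label is not None and tag == f"I-{label}":
            continue
        if label is not None:
            spans.add((label, start, i))
            label = None
        # A B- tag (or a stray I- tag, treated leniently) opens a new span.
        if tag.startswith("B-") or tag.startswith("I-"):
            label, start = tag[2:], i
    return spans


def intent_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of utterances whose predicted intent equals the gold intent."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def slot_f1(gold_tags: List[List[str]], pred_tags: List[List[str]]) -> float:
    """Micro-averaged span-level F1 over all utterances."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold_tags, pred_tags):
        g_spans, p_spans = extract_spans(g), extract_spans(p)
        tp += len(g_spans & p_spans)
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data with ATIS-style labels (illustrative only, not taken from TamilATIS).
gold_intents = ["atis_flight", "atis_airfare"]
pred_intents = ["atis_flight", "atis_flight"]
gold_slots = [["O", "B-fromloc.city_name", "O", "B-toloc.city_name", "I-toloc.city_name"]]
pred_slots = [["O", "B-fromloc.city_name", "O", "B-toloc.city_name", "O"]]

acc = intent_accuracy(gold_intents, pred_intents)
f1 = slot_f1(gold_slots, pred_slots)
print(f"intent accuracy = {acc:.2f}, slot F1 = {f1:.2f}")
```

Under span-level scoring, a slot counts as correct only when both its label and its boundaries match exactly, so the truncated destination span in the toy prediction is scored as an error.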
Pages: 25-32
Number of pages: 8
Related papers
50 in total
  • [1] SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations
    Kottur, Satwik
    Moon, Seungwhan
    Geramifard, Alborz
    Damavandi, Babak
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 4903 - 4912
  • [2] A code-mixed task-oriented dialog dataset for medical domain
    Dowlagar, Suman
    Mamidi, Radhika
    [J]. COMPUTER SPEECH AND LANGUAGE, 2023, 78
  • [3] Incremental Dialog Processing in a Task-Oriented Dialog
    Ghigi, Fabrizio
    Eskenazi, Maxine
    Ines Torres, M.
    Lee, Sungjin
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 308 - 312
  • [4] Paraphrase Augmented Task-Oriented Dialog Generation
    Gao, Silin
    Zhang, Yichi
    Ou, Zhijian
    Yu, Zhou
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 639 - 649
  • [5] Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog
    Hung, Chia-Chien
    Lauscher, Anne
    Vulic, Ivan
    Ponzetto, Simone Paolo
    Glavas, Goran
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 3687 - 3703
  • [6] SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams
    Wu, Te-Lin
    Kottur, Satwik
    Madotto, Andrea
    Azab, Mahmoud
    Rodriguez, Pedro
    Damavandi, Babak
    Peng, Nanyun
    Moon, Seungwhan
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 6273 - 6291
  • [7] Robustness Testing of Language Understanding in Task-Oriented Dialog
    Liu, Jiexi
    Takanobu, Ryuichi
    Wen, Jiaxin
    Wan, Dazhen
    Li, Hongguang
    Nie, Weiran
    Li, Cheng
    Peng, Wei
    Huang, Minlie
    [J]. 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2467 - 2480
  • [8] Novel Feature Discovery for Task-Oriented Dialog Systems
    Ho, Vinh Thinh
    Soliman, Mohamed
    Abujabal, Abdalghani
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 782 - 792
  • [9] Recent advances and challenges in task-oriented dialog systems
    Zhang, Zheng
    Takanobu, Ryuichi
    Zhu, Qi
    Huang, Minlie
    Zhu, Xiaoyan
    [J]. Science China Technological Sciences, 2020, 63 : 2011 - 2027
  • [10] Accelerating Natural Language Understanding in Task-Oriented Dialog
    Ahuja, Ojas
    Desai, Shrey
    [J]. NLP FOR CONVERSATIONAL AI, 2020, : 46 - 53