Effect of the Training Set on the Word Embeddings and Similarity Test Set for Turkish

Cited by: 0

Authors:

Yucesoy, Veysel [1]

Koc, Aykut [1]

Affiliations:

[1] ASELSAN Arastirma Merkezi, Akilli Veri Analitigi Arastirma Program Mudurlugu, Ankara, Turkey

Source:

2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU) | 2016

Keywords:

Word embeddings; natural language processing; classification; Turkish similarity test set

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Word embedding, a technique widely used in the literature, especially for English, associates each word with a mathematical vector representation under which certain structural or semantic relations hold. There are some Turkish applications of this technique; although it was designed with English in mind, it also performs satisfactorily for Turkish. In this study, the performance of Turkish word embeddings is analysed with respect to how well the training data suits the goal of the embedding. For this purpose, a new test set based on subject similarity in Turkish is introduced and used to measure the performance of the word embeddings. The set will be made publicly available for academic purposes. A subject classifier for labeled Turkish text corpora, which beats the state-of-the-art performance, is also proposed.

Pages: 1005-1008

Page count: 4
