Study on Unknown Term Translation Mining from Google Snippets

Cited by: 2
Authors
Li, Bin [1]
Yao, Jianmin [2]
Affiliations
[1] Anhui Open Univ, Sch Informat Engn, Hefei 230041, Anhui, Peoples R China
[2] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou 215006, Peoples R China
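Funding
National Natural Science Foundation of China
Keywords
unknown term; translation mining; web mining; Google snippets
DOI
10.3390/info10090267
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract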
Bilingual web pages are widely used to mine translations of unknown terms. This study focused on an effective solution for three tasks: obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. The research used co-occurrence information to obtain the subject terms and then expanded the source query with the translations of those subject terms to collect effective bilingual search engine snippets. Valid candidates were then extracted from the small, noisy bilingual corpora using an improved frequency change measurement that incorporates adjacency information. Finally, the research developed a method that considers surface patterns, frequency-distance, and phonetic features to select an appropriate translation. The experimental results revealed that the proposed method performed remarkably well for mining translations of unknown terms.
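
As a rough illustration of the ranking step described in the abstract, the sketch below scores translation candidates by how often and how close to the source term they appear in bilingual snippets. It is not the authors' formula: the `1 / (1 + gap)` weighting, the function names, and the sample snippets are all assumptions made for this example, and the surface-pattern and phonetic features mentioned in the abstract are left out.

```python
from collections import defaultdict


def frequency_distance_scores(term, candidates, snippets):
    """Hypothetical frequency-distance score (an assumption, not the paper's).

    For every snippet containing `term`, each candidate found in that snippet
    contributes 1 / (1 + gap), where gap is the number of characters between
    the first occurrence of the term and the nearest occurrence of the candidate.
    """
    scores = defaultdict(float)
    for snippet in snippets:
        term_pos = snippet.find(term)
        if term_pos < 0:
            continue  # snippet does not mention the source term
        term_end = term_pos + len(term)
        for cand in candidates:
            best_gap = None
            start = snippet.find(cand)
            while start >= 0:
                end = start + len(cand)
                if end <= term_pos:
                    gap = term_pos - end      # candidate precedes the term
                elif start >= term_end:
                    gap = start - term_end    # candidate follows the term
                else:
                    gap = 0                   # candidate overlaps the term
                if best_gap is None or gap < best_gap:
                    best_gap = gap
                start = snippet.find(cand, start + 1)
            if best_gap is not None:
                scores[cand] += 1.0 / (1.0 + best_gap)
    return dict(scores)


def rank_candidates(term, candidates, snippets):
    """Return the candidates that occurred, highest score first."""
    scores = frequency_distance_scores(term, candidates, snippets)
    return sorted(scores, key=lambda c: (-scores[c], c))


# Made-up snippets standing in for search engine results.
snippets = [
    "支持向量机 (support vector machine) 是一种分类方法",
    "support vector machine，即支持向量机",
    "machine learning course covers support vector machine",
]
candidates = ["支持向量机", "分类方法", "向量"]
ranked = rank_candidates("support vector machine", candidates, snippets)
```

A fragment such as "向量" can still score fairly well under this weighting because it sits inside the correct translation, which is the kind of lexical-boundary problem the abstract's frequency change measurement is meant to address.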
Pages: 12