Cross-lingual analysis of English and Chinese web search

Times cited: 0
Authors
Lin, Peiguang [1]
Zhang, Tong [2]
Xia, Menglong [3]
Zhou, Jin [4]
Nie, Peiyao [1]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250001, Shandong, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510000, Guangdong, Peoples R China
[3] Macau Univ Sci & Technol, Fac Hospitality & Tourism Management, Ave Wai Long, Taipa 999078, Macau, Peoples R China
[4] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan 250001, Shandong, Peoples R China
Keywords
cross-lingual analysis; web search analysis; search query; POS distribution; search session; session entropy; query reformulation; click graph analysis; query features; web search burstiness; ENGINE; ALGORITHM; BEHAVIOR
DOI
10.1504/IJWGS.2018.095663
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The non-English Web has grown steadily in recent years, so language-dependent and user-based search paradigms are becoming increasingly important for search engines. Unfortunately, most available work on web search analysis is still English-based. To understand the behavioural commonalities and distinctions of non-English users, we propose a framework for analysing the web search behaviour of users in a cross-lingual context. The framework comprises 10 factors, which are applied at the query level, session level and corpus level, respectively. Used together, these factors help characterise users' web search behaviour, even across different languages, from both statistical and semantic perspectives. The framework proves effective not only in revealing the commonalities and distinctions of web search, but also in informing the design of search paradigms in a cross-lingual scenario.
Pages: 376-399
Page count: 24