Document Distance Metric Learning in an Interactive Exploration Process

Cited by: 0

Authors:

Wrzalik, Marco [1]

Affiliation:

[1] RheinMain Univ Appl Sci, Wiesbaden, Germany

Source:

PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19) | 2019

Keywords:

DOI:

10.1145/3331184.3331420

Chinese Library Classification (CLC):

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Visualization of inter-document similarities is widely used for the exploration of document collections and for interactive retrieval [1, 2]. However, similarity relationships between documents are multifaceted, and the distances measured by a given metric often do not match the similarity perceived by humans. Furthermore, the user's notion of similarity can change drastically with the exploration objective or the task at hand. Therefore, this research proposes to investigate online adjustments to the similarity model using feedback generated during exploration or exploratory search. In the course of this, rich visualizations and interactions will support users in giving valuable feedback. Based on this feedback, metric learning methodologies will be applied to adjust a similarity model in order to improve the exploration experience. At the same time, the trained models are considered valuable outcomes whose benefits for similarity-based tasks such as query-by-example retrieval or classification will be tested. The measurement of inter-document similarities has been studied extensively. There are various distance metrics using different representations, such as weighted term vectors (e.g. TF-IDF, BM25) [9], distributions from topic models [7], or distributed representations from pre-trained language models [5]. Learning a metric can create improved similarity measures that fit specific domain characteristics or the requirements of the task at hand. Learning to rank has attracted much research on this matter in the IR community. Related works form, together with other findings regarding metric learning, the groundwork for this research. Overall, highly diverse approaches can be found: linear projections of term vectors [10]; pattern matching in sequences of word embeddings using convolutional neural networks [8]; and word sequence learning using siamese recurrent neural networks [6], to name a few. Approaches using online feedback are particularly relevant to this research.
In that line of work, implicit feedback collected from result lists, such as observed clicks [3] or dwell times [4], is the common feedback modality. However, there is little research on metric learning using feedback from interactions with rich visualizations of inter-document similarities such as those proposed in [1]. We hypothesize that users can generate more valuable feedback while interacting with an explorable visualization than with a simple list of best hits. The argument rests on the more comprehensive understanding of the underlying similarity relationships that such visualizations can give and on the greater range of possible feedback modalities. In a spatial visualization, for example, feedback could be given by correcting datapoint positions, drawing lines as borders for desired clusters, or rating the desirability of similarity relationships between result documents. Following these considerations, the research questions we intend to pursue are: (i) Which feedback modalities enable users to express the desired similarity measure, and how can interactive visualizations support users in generating feedback effectively? (ii) Which metric learning methodologies are applicable to improve a similarity model using the feedback from the proposed modalities? (iii) Can a visual exploratory search using the outcomes of (i) and (ii) demonstrate clear benefits over classic searches using result list presentations?
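
As an illustration of how pairwise user feedback could adjust a similarity model, the following is a minimal sketch and not the method proposed in the abstract. It assumes documents are already represented as small term-weight vectors and that feedback arrives as "similar" or "dissimilar" judgments on document pairs. It learns one non-negative weight per vector dimension (a diagonal metric) with simple hinge-style gradient steps. The names `weighted_distance` and `learn_weights`, the toy vectors, and the values of `margin`, `lr`, and `epochs` are all hypothetical choices made for this example.

```python
import math

def weighted_distance(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_k w_k * (x_k - y_k)^2)."""
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))

def learn_weights(docs, feedback, margin=2.0, lr=0.1, epochs=200):
    """Fit non-negative per-dimension weights from pairwise feedback.

    feedback holds (i, j, similar) tuples. A similar pair is pulled
    together; a dissimilar pair is pushed apart until its squared
    distance reaches `margin` (an assumed hyperparameter).
    """
    w = [1.0] * len(docs[0])  # start from the plain Euclidean metric
    for _ in range(epochs):
        for i, j, similar in feedback:
            sq = [(a - b) ** 2 for a, b in zip(docs[i], docs[j])]
            d2 = sum(wk * s for wk, s in zip(w, sq))
            if similar:
                grad = sq                  # loss = d2
            elif d2 < margin:
                grad = [-s for s in sq]    # loss = margin - d2
            else:
                continue                   # constraint already satisfied
            # Gradient step, clipped so the weights stay non-negative.
            w = [max(0.0, wk - lr * g) for wk, g in zip(w, grad)]
    return w

# Toy term-weight vectors (hypothetical data, two dimensions).
docs = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
# Assumed feedback: documents 0 and 1 belong together, 0 and 2 do not.
feedback = [(0, 1, True), (0, 2, False)]

w = learn_weights(docs, feedback)
d_similar = weighted_distance(docs[0], docs[1], w)
d_dissimilar = weighted_distance(docs[0], docs[2], w)
print(w, d_similar, d_dissimilar)
```

A diagonal metric is used here only because it keeps the update rule readable. The approaches named in the abstract, such as linear projections of term vectors or neural models, would replace the per-dimension weights with a richer parameterization while using the same kind of pairwise constraints.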

Citation

Pages: 1452 / 1452

Page count: 1


[1] Regularized distance metric learning for document classification and its application
Department of Industrial and Management Systems Engineering, School of Creative Science and Engineering, Waseda University, Japan
[J]. J. Jpn Ind. Manage. Assoc., 2E (190-203):
[2] Multiple Kernel Learning via Distance Metric Learning for Interactive Image Retrieval
Yan, Fei
Mikolajczyk, Krystian
Kittler, Josef
[J]. MULTIPLE CLASSIFIER SYSTEMS, 2011, 6713 : 147 - 156
[3] The distance in the process of teaching-learning - discussing the metric
Gozzi, Marcelo Pupim
Simplicio Junior, Marcos Antonio
Beingolea Garay, Jorge Rodolfo
[J]. DIALOGIA, 2010, 9 (01) : 73 - 84
[4] MindMiner: A Mixed-Initiative Interface for Interactive Distance Metric Learning
Fan, Xiangmin
Liu, Youming
Cao, Nan
Hong, Jason
Wang, Jingtao
[J]. HUMAN-COMPUTER INTERACTION - INTERACT 2015, PT II, 2015, 9297 : 611 - 628
[5] Curvilinear Distance Metric Learning
Chen, Shuo
Luo, Lei
Yang, Jian
Gong, Chen
Li, Jun
Huang, Heng
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[6] Sparse distance metric learning
Choy, Tze
Meinshausen, Nicolai
[J]. COMPUTATIONAL STATISTICS, 2014, 29 (3-4) : 515 - 528
[7] Sparse distance metric learning
Tze Choy
Nicolai Meinshausen
[J]. Computational Statistics, 2014, 29 : 515 - 528
[8] Distance metric learning with the Universum
Bac Nguyen
Morell, Carlos
De Baets, Bernard
[J]. PATTERN RECOGNITION LETTERS, 2017, 100 : 37 - 43
[9] Distance metric learning by minimal distance maximization
Yu, Yaoliang
Jiang, Jiayan
Zhang, Liming
[J]. PATTERN RECOGNITION, 2011, 44 (03) : 639 - 649
[10] Text Document Clustering with Metric Learning
Wang, Jinlong
Wu, Shunyao
Huy Quan Vu
Li, Gang
[J]. SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010 : 783 - 784
