A machine learning model for information retrieval with structured documents

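Authors
Piwowarski, B [1]
Gallinari, P [1]
Affiliations
[1] Univ Paris 06, LIP6, F-75015 Paris, France
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract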
Most recent document standards rely on structured representations. Current information retrieval systems, however, were developed for flat document representations and cannot easily be extended to cope with more complex document types. Only a few models have been proposed for handling structured documents, and the design of such systems is still an open problem. We present a new model for structured document retrieval that computes and combines the scores of document parts. It is based on Bayesian networks and allows the model parameters to be learned in the presence of incomplete data. We present an application of this model to ad-hoc retrieval and evaluate its performance on a small structured collection. The model can also be extended to other tasks, such as interactive navigation in structured documents or corpora.
Pages: 425-438
Page count: 14