Investigating the Statistical Properties of User-Generated Documents

Cited by: 0
Authors
Inches, Giacomo [1 ]
Carman, Mark James [2 ]
Crestani, Fabio [1 ]
Affiliations
[1] Univ Lugano, Fac Informat, Lugano, Switzerland
[2] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The importance of the Internet as a communication medium is reflected in the large number of documents generated every day by users of the different online services. In this work we analyze the properties of these online user-generated documents for some established Internet services (Kongregate, Twitter, Myspace and Slashdot) and compare them with a consolidated collection of standard information retrieval documents (from the Wall Street Journal, Associated Press and Financial Times, as part of the TREC ad-hoc collection). We investigate features such as document similarity, term burstiness, emoticons and Part-Of-Speech analysis, highlighting the applicability and limits of traditional content analysis and indexing techniques used in information retrieval when applied to the new online user-generated documents.
Pages: 198 / +
Page count: 3