Website Classification Using Latent Dirichlet Allocation and its Application for Internet Advertising

Authors
Katsumata, Sotaro [1]
Motohashi, Eiji [2]
Nishimoto, Akihiro [3]
Toyosawa, Eiji [4]
Affiliations
[1] Osaka Univ, Graduate Sch Econ, 1-7 Machikaneyama, Toyonaka, Osaka 5650043, Japan
[2] Yokohama Natl Univ, Grad Sch Int Social Sci, Hodogaya Ku, 79-4 Tokiwadai, Yokohama, Kanagawa 2408501, Japan
[3] Kwansei Gakuin Univ, Sch Business Adm, 1-155 Uegahara Ichiban Cho, Nishinomiya, Hyogo 6628501, Japan
[4] F N Commun Inc, Shibuya Ku, Aoyama Diamond Bldg,Recept 2nd Floor, Tokyo 1500002, Japan
DOI
10.1109/ICDMW.2016.141
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This study proposes a model for classifying websites by their content and discusses its applications to internet advertising (ad) strategies. Internet ad agencies hold ad spaces embedded in many websites and can choose where to place each advertisement. To optimize their ad placement strategy, agencies therefore need to know the properties and topics of each website. However, because website content is written in natural language, agencies must convert this qualitative text into quantitative data before they can classify websites with statistical models. To address this issue, this study applies statistical analysis to website information written in natural language. We use a dictionary of neologisms to decompose website sentences into words and create a dataset of {0, 1} indicator matrices for classifying the websites. From this dataset, we estimate the topics of each website using latent Dirichlet allocation. Finally, we discuss how the results can be applied to optimize ad strategies.
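The pipeline outlined in the abstract (tokenized site text, a {0, 1} indicator matrix, then topic estimation with latent Dirichlet allocation) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state their estimation procedure, so a plain collapsed Gibbs sampler is assumed here, each 1 in the matrix is treated as a single word token, and the function names, hyperparameters (`alpha`, `beta`), and toy data are all hypothetical. The dictionary-based word segmentation step is assumed to have been done already, so each site arrives as a list of words.

```python
import random


def build_indicator_matrix(docs, vocabulary):
    """Return a sites-by-words matrix of 0/1 flags (1 = word occurs on the site)."""
    rows = []
    for tokens in docs:
        present = set(tokens)
        rows.append([1 if word in present else 0 for word in vocabulary])
    return rows


def lda_gibbs(matrix, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA on a 0/1 matrix (assumed estimator).

    Returns theta, the estimated topic proportions of each site.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(matrix), len(matrix[0])
    # Every nonzero cell becomes one (site, word) token.
    tokens = [(d, w) for d in range(n_docs) for w in range(n_words) if matrix[d][w]]
    z = [rng.randrange(n_topics) for _ in tokens]  # random initial topic per token

    doc_topic = [[0] * n_topics for _ in range(n_docs)]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for (d, w), k in zip(tokens, z):
        doc_topic[d][k] += 1
        topic_word[k][w] += 1
        topic_total[k] += 1

    for _ in range(n_iter):
        for i, (d, w) in enumerate(tokens):
            # Remove the token's current assignment from the counts.
            k = z[i]
            doc_topic[d][k] -= 1
            topic_word[k][w] -= 1
            topic_total[k] -= 1
            # Unnormalized conditional probability of each topic for this token.
            weights = [
                (doc_topic[d][t] + alpha)
                * (topic_word[t][w] + beta)
                / (topic_total[t] + n_words * beta)
                for t in range(n_topics)
            ]
            k = rng.choices(range(n_topics), weights=weights)[0]
            z[i] = k
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    # Posterior mean estimate of the site-topic proportions.
    return [
        [(count + alpha) / (sum(row) + n_topics * alpha) for count in row]
        for row in doc_topic
    ]


# Toy, made-up data: four "websites" that are already split into words.
sites = [
    ["car", "engine", "tyre"],
    ["engine", "car", "car"],
    ["recipe", "pasta", "sauce"],
    ["pasta", "recipe", "oven"],
]
vocab = sorted({word for tokens in sites for word in tokens})
X = build_indicator_matrix(sites, vocab)
theta = lda_gibbs(X, n_topics=2)
for tokens, proportions in zip(sites, theta):
    print(tokens, [round(p, 2) for p in proportions])
```

Each row of `theta` sums to 1 and could serve as a topic profile of one site, for example to match an advertisement to sites whose dominant topic fits it. Topic indices are arbitrary labels and may swap between runs with different seeds.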
Pages: 538-544
Page count: 7