Probabilistic real time automata;
online induction;
maximum frequent pattern based clustering;
GENE-EXPRESSION;
DOI:
10.1109/ICDM.2012.121
Chinese Library Classification number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Probabilistic real time automata (PRTAs) are a representation of dynamic processes arising in the sciences and industry. Currently, the induction of automata is divided into two steps: the creation of the prefix tree acceptor (PTA) and the merge procedure based on clustering of the states. Both steps can be very time-consuming when a PRTA is to be induced from massive or even unbounded data sets. The latter step can be processed efficiently, since scalable online clustering algorithms exist. However, creating the PTA can still be very time-consuming. To overcome this problem, we propose a genuine online PRTA induction approach that incorporates new instances by first collapsing them and then using a maximum frequent pattern based clustering. The approach is tested against a predefined synthetic automaton and real-world data sets, on which it is scalable and stable. Moreover, we present a broad evaluation on a real-world disease group data set that shows the applicability of such a model to the analysis of medical processes.
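To make the first of the two steps named in the abstract concrete, the following is a minimal sketch of building a prefix tree over timed event sequences one instance at a time. It is an illustration only and not the authors' algorithm: the class name `PrefixTree`, the `(event, delay)` input format, and the use of relative visit frequencies as transition probabilities are all assumptions made for this example, and the collapsing and clustering steps are not shown.

```python
from collections import defaultdict


class PrefixTree:
    """Prefix tree over timed event sequences (hypothetical, illustrative only)."""

    def __init__(self):
        self.children = defaultdict(dict)  # state id -> {event: child state id}
        self.visits = defaultdict(int)     # state id -> sequences passing through
        self.delays = defaultdict(list)    # (state id, event) -> observed delays
        self.next_id = 1                   # state 0 is the root

    def add(self, sequence):
        """Insert one sequence of (event, delay) pairs and update the counts."""
        state = 0
        self.visits[state] += 1
        for event, delay in sequence:
            if event not in self.children[state]:
                # First time this prefix is seen: create a new state for it.
                self.children[state][event] = self.next_id
                self.next_id += 1
            self.delays[(state, event)].append(delay)
            state = self.children[state][event]
            self.visits[state] += 1

    def probability(self, state, event):
        """Relative frequency of leaving `state` via `event` (assumed estimate)."""
        child = self.children[state].get(event)
        if child is None or self.visits[state] == 0:
            return 0.0
        return self.visits[child] / self.visits[state]
```

Each call to `add` touches only the states along one path, so the tree can be updated as instances arrive. In this sketch the number of states still grows with the number of distinct prefixes, which is the cost the abstract attributes to PTA creation on massive data sets.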
Affiliation:
Univ Buenos Aires, Dept Comp, RA-1053 Buenos Aires, DF, Argentina
Pavese, Esteban
Braberman, Victor
Affiliations:
Univ Buenos Aires, Dept Comp, RA-1053 Buenos Aires, DF, Argentina
Consejo Nacl Invest Cient & Tecn, RA-1033 Buenos Aires, DF, Argentina
Braberman, Victor
Uchitel, Sebastian
Affiliations:
Univ Buenos Aires, Dept Comp, RA-1053 Buenos Aires, DF, Argentina
Consejo Nacl Invest Cient & Tecn, RA-1033 Buenos Aires, DF, Argentina
Imperial Coll London, London, England