Analysing chromatographic data using data mining to monitor petroleum content in water

Cited by: 0
Authors
Holmes, Geoffrey [1]
Fletcher, Dale [1]
Reutemann, Peter [1]
Frank, Eibe [1]
Affiliations
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
Keywords
Gas Chromatography Mass Spectrometry; GC-MS; BTEX; Data Mining; Model Trees; Regression; Data Preprocessing; Correlation Optimized Warping; Petroleum Monitoring
DOI
10.1007/978-3-540-88351-7_21
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
Chromatography is an important analytical technique that is widely used in environmental applications. A typical application is the monitoring of water samples to determine whether they contain petroleum. These tests are mandated in many countries to enable environmental agencies to determine whether tanks used to store petrol are leaking into local water systems. Chromatographic techniques, typically using gas or liquid chromatography coupled with mass spectrometry, allow an analyst to detect a vast array of compounds, potentially on the order of thousands. Accurate analysis relies heavily on the skills of a limited pool of experienced analysts utilising semi-automatic techniques to analyse these datasets, which makes the outcomes subjective. The focus of current laboratory data analysis systems has been on refinements of existing approaches. The work described here represents a paradigm shift achieved through applying data mining techniques to tackle the problem. These techniques are compelling because the efficacy of preprocessing methods, which are essential in this application area, can be objectively evaluated. This paper presents preliminary results using a data mining framework to predict the concentrations of petroleum compounds in water samples. Experiments demonstrate that the framework can be used to produce models of sufficient accuracy, measured in terms of root mean squared error and correlation coefficients, to offer the potential for significantly reducing the time spent by analysts on this task.
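The two accuracy measures named in the abstract, root mean squared error and the correlation coefficient, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `rmse` and `pearson_r` and the sample concentration values are assumptions made up for illustration, and the Pearson form of the correlation coefficient is assumed because the record does not say which variant was used.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    covariance = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    spread_a = sum((a - mean_a) ** 2 for a in actual)
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    if spread_a == 0 or spread_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_a * spread_p)


# Hypothetical concentrations (arbitrary units), invented for this example only.
measured = [0.5, 1.0, 2.0, 4.0, 8.0]
model_output = [0.6, 0.9, 2.2, 3.7, 8.3]

print("RMSE:", rmse(measured, model_output))
print("Correlation:", pearson_r(measured, model_output))
```

A lower RMSE means predictions lie closer to the measured concentrations in absolute terms, while a correlation near 1 means the predictions track the measured values linearly even if they are offset or scaled. Reporting both is a common way to separate those two kinds of error.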
Pages: 278-290
Page count: 13