Application of Genetic Programming (GP) in Prediction of Gas Chromatographic Retention Time of some Pesticides
University of Mazandaran
Mohammad Hossein Fatemi   

University of Mazandaran, Faculty of Chemistry, 47416 Babolsar, Iran
Online publication date: 2017-09-27
Publication date: 2017-09-27
Eurasian J Anal Chem 2017;12(7):1001–1014
In this study, quantitative structure–retention relationship (QSRR) methodology was employed to model the gas chromatographic retention times of 74 pesticides. Stepwise multiple linear regression (SW-MLR) was used to select the most important descriptors. Multiple linear regression (MLR) and genetic programming (GP) were then used to develop linear and symbolic regression models, respectively. Inspection of the statistical parameters of the developed MLR and GP models indicates that the symbolic regression equation obtained via GP is the best-fitting model. For this model, the squared correlation coefficients (R²) were 0.943 and 0.911, and the root-mean-square errors (RMSE) were 2.56 and 2.77 for the training and test sets, respectively. The GP model was assessed by leave-one-out cross-validation (Q²cv = 0.79, SPRESS = 2.57) as well as by external validation. In addition, the results of a sensitivity analysis of the GP model suggest that structural features and polarity are important factors governing the gas chromatographic retention times of the studied pesticides.
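The statistics quoted in the abstract (R², RMSE, and the leave-one-out Q²) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins (74 samples, three hypothetical descriptors), the 60/14 split is arbitrary, the model is a plain MLR fitted by least squares rather than a GP expression, and R² is computed as the coefficient of determination, which is an assumption about the definition used in the paper. SPRESS is omitted because its normalisation varies between sources.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses for descriptor matrix X from fitted coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y, y_hat):
    """Root-mean-square error between observed and predicted values."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def q2_loo(X, y):
    """Leave-one-out Q2: refit with each sample held out, then 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # every sample except the i-th
        coef = fit_mlr(X[keep], y[keep])
        pred = predict_mlr(coef, X[i:i + 1])[0]
        press += (y[i] - pred) ** 2
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - press / ss_tot)

# Synthetic stand-in data (hypothetical, NOT the pesticide data set).
rng = np.random.default_rng(0)
true_w = np.array([4.0, -2.5, 1.5])           # hypothetical descriptor weights
X = rng.normal(size=(74, 3))
y = 20.0 + X @ true_w + rng.normal(scale=1.0, size=74)

# Arbitrary split chosen for illustration only.
train, test = np.arange(0, 60), np.arange(60, 74)
coef = fit_mlr(X[train], y[train])

r2_train = r_squared(y[train], predict_mlr(coef, X[train]))
r2_test = r_squared(y[test], predict_mlr(coef, X[test]))
rmse_train = rmse(y[train], predict_mlr(coef, X[train]))
rmse_test = rmse(y[test], predict_mlr(coef, X[test]))
q2 = q2_loo(X[train], y[train])

print(f"R2 train={r2_train:.3f} test={r2_test:.3f}")
print(f"RMSE train={rmse_train:.2f} test={rmse_test:.2f}")
print(f"Q2 (LOO)={q2:.3f}")
```

For a least-squares model, Q² from leave-one-out can never exceed the training R², since each held-out residual is at least as large in magnitude as the corresponding fitted residual. A large gap between the two is a common sign of overfitting.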