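QSPR Study of n-Octanol-Water Partition Coefficients of Organic Compounds Based on Genetic Algorithm-Support Vector Machine and Genetic Algorithm-Radial Basis Function Neural Network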
Received: 2007-02-06  Revised: 2007-04-13
Keywords  quantitative structure-property relationship (QSPR)  n-octanol-water partition coefficient (kow)  genetic algorithm (GA)  support vector machine (SVM)
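Authors and affiliations
Qi Jun (齐珺) State Key Laboratory of Water Environment Simulation, School of Environment, Beijing Normal University, Beijing 100875, China
Niu Junfeng (牛军峰) State Key Laboratory of Water Environment Simulation, School of Environment, Beijing Normal University, Beijing 100875, China
Wang Lili (王丽莉) State Key Laboratory of Water Environment Simulation, School of Environment, Beijing Normal University, Beijing 100875, China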
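Abstract (translated from the Chinese)
      An improved method for quantitative structure-property relationship (QSPR) modeling of organic compounds, genetic algorithm-support vector machine (GA-SVM), is proposed. It combines variable selection by a genetic algorithm (GA) with nonlinear regression by a support vector machine (SVM), and is applied here to QSPR modeling of the n-octanol-water partition coefficients (kow) of 38 organic compounds commonly used in the food industry. The resulting QSPR model selected five descriptors: molecular weight, Hansen polarity, boiling point, oxygen content and hydrogen content. The sum of squared errors (SSE), root mean squared error (RMSE) and coefficient of determination (r2) between predicted and measured values were 0.048, 0.036 and 0.999, respectively, indicating strong predictive ability. The cross-validation results (SSE = 0.295, RMSE = 0.089, r2 = 0.995) further indicate that the model is robust, so the GA-SVM algorithm is suitable for QSPR modeling of the n-octanol-water partition coefficients of organic compounds. In addition, the GA-SVM model was compared with models built using a genetic algorithm-radial basis function neural network (GA-RBFNN) and a linear algorithm. The GA-SVM model outperformed the other two in both robustness and predictive ability, so GA-SVM is better suited than GA-RBFNN and the linear algorithm to QSPR modeling of the n-octanol-water partition coefficients of organic compounds.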
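The three fit statistics quoted in the abstract are closely related. The following is a minimal sketch, assuming the conventional definitions (SSE as the sum of squared residuals, RMSE as the square root of SSE divided by the number of compounds n, and r2 as 1 minus SSE over the total sum of squares); the abstract does not state which formulas the authors used, and the function names are illustrative. Under this assumed definition, the reported SSE values with n = 38 give an RMSE of about 0.036 for the fit, matching the abstract, and about 0.088 for cross-validation, slightly below the reported 0.089, so the paper may use a somewhat different denominator.

```python
import math

def sse(measured, predicted):
    """Sum of squared errors between measured and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(measured, predicted))

def rmse(measured, predicted):
    """Root mean squared error, assuming division by n (not stated in the source)."""
    return math.sqrt(sse(measured, predicted) / len(measured))

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_y = sum(measured) / len(measured)
    sst = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - sse(measured, predicted) / sst

def rmse_from_sse(sse_value, n):
    """RMSE implied by a reported SSE under the assumed n-denominator definition."""
    return math.sqrt(sse_value / n)

# Values taken from the abstract: SSE for the fit and for cross-validation, n = 38.
fit_rmse = rmse_from_sse(0.048, 38)  # roughly 0.0355
cv_rmse = rmse_from_sse(0.295, 38)   # roughly 0.0881
```

Because only summary statistics are published on this page, the check works from SSE and n alone rather than from the per-compound lgkow values.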
English abstract
      A modified method for developing quantitative structure-property relationship (QSPR) models of organic compounds is proposed, based on a genetic algorithm (GA) and a support vector machine (SVM), and referred to as GA-SVM. The GA performs variable selection, and the SVM constructs the QSPR model. GA-SVM was applied to develop QSPR models for the n-octanol-water partition coefficients (kow) of 38 typical organic compounds used in the food industry. Five descriptors (molecular weight, Hansen polarity, boiling point, percent oxygen and percent hydrogen) were selected in the QSPR model. The coefficient of multiple determination (r2), the sum of squares due to error (SSE) and the root mean squared error (RMSE) between the measured and predicted values of the GA-SVM model are 0.999, 0.048 and 0.036, respectively, indicating good predictive capability for the lgkow values of these organic compounds. Under leave-one-out cross-validation, the GA-SVM model also showed good robustness (SSE = 0.295, RMSE = 0.089, r2 = 0.995). Moreover, the GA-SVM models were compared with models constructed by a genetic algorithm-radial basis function neural network (GA-RBFNN) and by a linear method. The GA-SVM models showed the best predictive capability and robustness of the three, which suggests that GA-SVM is the most suitable of the methods compared for developing QSPR models of the lgkow values of these organic compounds.
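The workflow described in the abstract, a GA choosing a subset of descriptors and each subset being scored by leave-one-out cross-validation, can be sketched as follows. This is a hedged, dependency-free illustration and not the authors' implementation: the regressor is a nearest-neighbour stand-in rather than an SVM, the GA operators (truncation selection, one-point crossover, bit-flip mutation) and all parameter values are assumptions, and the toy data are synthetic, not the paper's 38 compounds.

```python
import random

def loo_sse(X, y, mask, predict):
    """Leave-one-out SSE using only the columns where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty descriptor set cannot predict anything
    total = 0.0
    for i in range(len(X)):
        train_X = [[row[j] for j in cols] for k, row in enumerate(X) if k != i]
        train_y = [v for k, v in enumerate(y) if k != i]
        test_x = [X[i][j] for j in cols]
        total += (y[i] - predict(train_X, train_y, test_x)) ** 2
    return total

def nearest_neighbour(train_X, train_y, x):
    """Stand-in regressor (NOT an SVM): target of the closest training row."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

def ga_select(X, y, predict, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search descriptor bitmasks with a simple GA; lower LOO SSE is fitter."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(mask):
        return loo_sse(X, y, mask, predict)

    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]          # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best)

# Synthetic toy data: the target depends only on column 0; columns 1-3 are noise.
noise = random.Random(1)
X = [[float(i)] + [noise.uniform(-50.0, 50.0) for _ in range(3)] for i in range(12)]
y = [2.0 * i for i in range(12)]

best_mask, best_sse = ga_select(X, y, nearest_neighbour)
```

To move closer to the method named in the abstract, the `predict` argument could be replaced by a function that trains a support vector regressor on `train_X` and `train_y` and predicts for `x`; the GA and cross-validation code would stay unchanged.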

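Sponsor: Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences  Address: 18 Shuangqing Road, Haidian District, Beijing
Tel: 010-62941102  Postal code: 100085  E-mail: hjkx@rcees.ac.cn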