Neurotoxicity Prediction Model of Food Contaminants Based on Machine Learning
Abstract:
Machine learning models were developed to predict the neurotoxicity of chemical contaminants in food. A database of 57 neurotoxic and 50 non-neurotoxic compounds was compiled from the published literature. Using R and SPSS, classification models were built from molecular descriptors with random forest (RF), artificial neural network (ANN), support vector machine (SVM), and other algorithms. The random forest model performed best in terms of overall accuracy and feasibility, with overall accuracies of 95.51% on the training set and 83.33% on the test set. The corresponding areas under the curve (AUC) were 0.99 and 0.85, and the accuracy of 10-fold cross-validation was 70.24%. These results indicate that prediction models based on machine learning and cheminformatics can accurately distinguish neurotoxic from non-neurotoxic compounds. Among the models tested, the random forest model performed best, and the highest eigenvalue of the Burden matrix was the molecular descriptor that contributed most to classifying chemicals with neurotoxic potential.
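The evaluation measures reported above (overall accuracy, area under the curve, and 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' R/SPSS workflow: the function names, the toy labels, and the scores below are hypothetical and chosen only for illustration, and the area under the curve is assumed here to be the ROC AUC, computed by pairwise comparison. Only the sample count of 107 (57 + 50 compounds) comes from the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size.
    In practice the samples would be shuffled (and ideally stratified) first."""
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical toy data: 1 = neurotoxic, 0 = non-neurotoxic.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))   # 4 of 6 correct
print(roc_auc(y_true, scores))    # 8 of 9 positive/negative pairs ranked correctly
print([len(f) for f in k_fold_indices(107, k=10)])  # fold sizes for 107 compounds
```

In a 10-fold scheme, each fold serves once as the held-out set while the model is trained on the remaining nine, and the cross-validation accuracy is the average over the ten held-out folds. This is why it can be noticeably lower than the training-set accuracy.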