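[Keywords]
[Abstract]
To explore a new method for tracing the geographical origin of apples, near-infrared transmission spectra in the 590~1250 nm range were collected from 600 Red Fuji apple samples from three producing areas: Aksu (Xinjiang), Luochuan (Shaanxi), and Yantai (Shandong). After spectral correction, the spectra were subjected to eight preprocessing methods, including normalization and multivariate scattering correction (MSC). Full-wavelength classification models built on the preprocessed spectra showed that the second derivative was the best preprocessing method. The second-derivative spectra were then modeled with the K-nearest neighbor (KNN) method using Euclidean distance, correlation distance, cosine similarity, and city-block distance as the distance metric, and KNN (correlation) proved to be the best classification method. Next, 12 dimensionality reduction methods, including the Gaussian process latent variable model (GPLVM) and linear local tangent space alignment (LLTSA), were applied to the second-derivative spectra and combined with KNN (correlation) to identify apple origin. The results showed that, with the first nine principal components extracted, the second derivative-diffusion maps-KNN (correlation) model performed best, with classification rates of 97.30% and 92.30% for the calibration set and prediction set, respectively. Therefore, deep learning dimensionality reduction methods combined with near-infrared transmission spectroscopy can successfully and effectively trace the origin of apples.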
[Keywords]
[Abstract]
To find a new method for tracing the geographical origin of apples, near-infrared transmission spectra in the 590~1250 nm range were collected from 600 Red Fuji apple samples from three producing areas: Aksu (Xinjiang), Luochuan (Shaanxi), and Yantai (Shandong). After spectral correction, the spectra were treated with eight preprocessing methods: normalization, standard normal variate transformation, multivariate scattering correction, Savitzky-Golay smoothing, 2nd derivative, mean centering, moving average, and 1st derivative. Firstly, full-spectrum classification models built on the preprocessed data showed that the 2nd derivative was the best preprocessing method. Secondly, the data preprocessed by the 2nd derivative were combined with four K-nearest neighbor (KNN) models that differ in distance metric (Euclidean, correlation, cosine, cityblock) for pattern recognition, and KNN (correlation) was found to be the best classification and recognition method. Thirdly, twelve dimension reduction methods (factor analysis, Gaussian process latent variable model, linear local tangent space alignment, neighborhood components analysis, neighborhood preserving embedding, diffusion maps, t-distributed stochastic neighbor embedding, landmark Isomap, Laplacian eigenmaps, locally linear embedding, principal component analysis, and linear discriminant analysis) were used to reduce the dimension of the spectra after 2nd derivative pretreatment, and the reduced data were then combined with KNN (correlation) to trace the origin of the apples. The results showed that the optimal identification model was 2nd derivative-diffusion maps-KNN (correlation) with the first nine components extracted. The identification rates for the calibration set and prediction set were 97.3% and 92.3%, respectively. Therefore, deep learning dimension reduction methods combined with near-infrared transmission spectroscopy could successfully and effectively trace the origin of apples.
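As a rough illustration of the two core steps named in the abstract (2nd derivative pretreatment followed by KNN with a correlation distance), a minimal pure-Python sketch could look like the following. It is not the authors' implementation. The spectra are synthetic, the band positions assigned to each origin are invented for the demonstration, a simple central-difference derivative stands in for whatever derivative filter was actually used, and the dimension reduction step (diffusion maps) is omitted entirely. All function names are hypothetical.

```python
import math
import random


def second_derivative(spectrum):
    """Central-difference 2nd derivative; cancels constant and linear baselines."""
    return [spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def correlation_distance(u, v):
    """Correlation distance: 1 minus the Pearson correlation of u and v."""
    mu_u = sum(u) / len(u)
    mu_v = sum(v) / len(v)
    du = [a - mu_u for a in u]
    dv = [b - mu_v for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return 1.0 - num / den if den > 0 else 1.0


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training spectra closest to x."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: correlation_distance(pair[0], x))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def synthetic_spectrum(center, rng, wavelengths):
    """Made-up spectrum: one Gaussian band, a random linear baseline, and noise."""
    scale = rng.uniform(0.8, 1.2)
    offset = rng.uniform(-0.5, 0.5)
    slope = rng.uniform(-1e-3, 1e-3)
    return [scale * math.exp(-((w - center) ** 2) / (2 * 40.0 ** 2))
            + offset + slope * (w - wavelengths[0]) + rng.gauss(0.0, 0.001)
            for w in wavelengths]


def run_demo(seed=0):
    """Train on 20 and test on 10 synthetic spectra per origin; return accuracy."""
    rng = random.Random(seed)
    wavelengths = list(range(590, 1251, 10))  # 590~1250 nm, as in the abstract
    # Assumption: band centers are invented so that the classes are separable.
    centers = {"Aksu": 800.0, "Luochuan": 900.0, "Yantai": 1000.0}
    train_X, train_y, test_X, test_y = [], [], [], []
    for label, center in centers.items():
        for i in range(30):
            x = second_derivative(synthetic_spectrum(center, rng, wavelengths))
            if i < 20:
                train_X.append(x)
                train_y.append(label)
            else:
                test_X.append(x)
                test_y.append(label)
    correct = sum(knn_predict(train_X, train_y, x) == y
                  for x, y in zip(test_X, test_y))
    return correct / len(test_y)


if __name__ == "__main__":
    print("prediction-set accuracy on synthetic data:", run_demo())
```

The correlation distance ignores the overall scale and offset of each spectrum, and the 2nd derivative removes linear baseline drift, which may be part of why this combination suits spectra whose absolute intensity varies between samples. Accuracy on this toy data says nothing about the rates reported for the real apple spectra.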
[CLC number]
[Funding]
National Natural Science Foundation of China (61367001)