Help on chemometrics in NIR
Calculation of SEC for PLS calibrations, n or n-k-1 as DF?
Hi
<p>Browsing through text books, norms, standards and this forum, consensus seems to be that for PLS models the SEC is calculated as</p>
<p>SEC = sqrt( sum_i (y_i - y_i^hat)^2 / df )</p>
<p>where df = n-k-1 (n=number of calibration samples, k=PLS factors, 1 if mean centered)</p>
<p>However, when comparing with popular software products, the consensus there seems to be to use df=n.</p>
<p>Unscrambler documents this in their Technical Reference, but as far as I have found this appears to be the standard.</p>
<p>The ISO 12099:2010 defines SEC as: "for a calibration model, an expression of the average difference between predicted and reference values for<br />
samples used to derive the model<br />
NOTE As for definitions C.3.4 to C.3.7, in this statistic, this expression of the average difference refers to the square<br />
root of the sum of squared residual values divided by the number of values corrected for degrees of freedom, where 68 %<br />
of the errors are below this value."</p>
<p>My interpretation of this definition is to use df=n, otherwise the 68% condition is not met.</p>
<p>Academic papers seem to agree that one PLS factor consumes more than 1 df, but differ in how to estimate how much more.</p>
<p>So it appears that the theoretically/statistically correct way is to use df=n-k-1, but that practical implementations use df=n.</p>
<p>Does anyone know why that is?</p>
<p>I have considered whether it has to do with terminology, RMSEC vs SEC, but came to the conclusion that this is not the case. In any case, if anyone could enlighten me it would be highly appreciated.</p>
<p>I might have missed an important point, which I would be happy to be made aware of.</p>
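<p>To make the difference between the two conventions concrete, here is a minimal Python sketch. The function name, its parameters and the data are made up for illustration; it only implements the formula quoted above with the two choices of df and does not reproduce any particular software package.</p>

```python
import math

def residual_error(y_ref, y_pred, n_factors=0, mean_centered=True, correct_df=True):
    """Square root of the summed squared calibration residuals divided by df."""
    n = len(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    if correct_df:
        # df = n - k - 1 (the 1 is only subtracted for mean-centered models)
        df = n - n_factors - (1 if mean_centered else 0)
    else:
        # df = n, the convention the software products seem to use
        df = n
    if df <= 0:
        raise ValueError("not enough samples for this number of factors")
    return math.sqrt(ss_res / df)

# Made-up reference values and calibration predictions (illustration only).
y_ref = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 9.5, 12.4]
y_pred = [10.0, 11.9, 9.9, 11.8, 11.2, 10.7, 9.9, 12.2]

rmsec = residual_error(y_ref, y_pred, correct_df=False)   # df = n
sec = residual_error(y_ref, y_pred, n_factors=3)          # df = n - k - 1
print(f"df = n:         {rmsec:.4f}")
print(f"df = n - k - 1: {sec:.4f}")
print(f"ratio:          {sec / rmsec:.4f}")
```

<p>The two values differ by the factor sqrt(n/(n-k-1)), so the gap is large for small calibration sets with many factors and becomes negligible when n is much larger than k.</p>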
<p>Stay safe and enjoy the holidays ahead.</p>
<p>/jakob</p>
Wed, 16 Dec 2020 14:08:51 +0000
Dear All,
<p>I am investigating the starch and non-starch polysaccharide (NSP) composition of grains, and therefore recently started working with IR and chemometrics. After a lot of trial-and-error (preprocessing of spectra, chemometrics software, PCA, etc.), I was delighted to see that some of the presumed clusters were formed during PCA.</p>
<p>To understand "why" some clusters were formed in PCA score plots, I turned my attention to PC loadings. Unfortunately, here I got slightly lost in interpretation. PC loadings can of course contain both positive and negative values. Also, I mainly used 2Der spectra for PCA, so every band from the original data is represented by 3 bands in the PC loadings.</p>
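<p>The "3 bands" pattern can be illustrated with a small Python sketch that evaluates the analytical second derivative of a single Gaussian band. The band position, width and wavelength grid are hypothetical, and real loadings are of course mixtures of many overlapping bands.</p>

```python
import math

def gaussian_second_derivative(x, center, width):
    """Analytical second derivative of exp(-(x - center)^2 / (2 * width^2))."""
    z = (x - center) / width
    return math.exp(-0.5 * z * z) * (z * z - 1.0) / width ** 2

center, width = 1450.0, 10.0                    # hypothetical band (nm)
grid = [1400.0 + 0.5 * i for i in range(201)]   # 1400 to 1500 nm in 0.5 nm steps
d2 = [gaussian_second_derivative(x, center, width) for x in grid]

# Central lobe: the most negative point sits at the original band position.
central = grid[d2.index(min(d2))]

# Side lobes: one positive maximum on each side of the band.
left = max((v, x) for v, x in zip(d2, grid) if x < center)[1]
right = max((v, x) for v, x in zip(d2, grid) if x > center)[1]

print("central (negative) lobe at", central)
print("positive side lobes at", left, "and", right)
print("side lobe height / central depth:", max(d2) / abs(min(d2)))
```

<p>For an isolated Gaussian band the central lobe is negative and located at the band maximum, while the two positive side lobes lie about sqrt(3) band widths away and are less than half as intense. In this idealised case, the strongest feature of each triplet is the one that marks the original band position.</p>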
<p>So far I have just picked a couple of large bands on both the positive and negative side of the loading, and tried to interpret/assign them. Since I am dealing with different polysaccharides, each of them having complex IR spectra, I would like to interpret the PC loadings as rationally as possible. In the future I would like to purchase/prepare reference polysaccharide compounds, but for now the focus is on obtaining as much useful information as possible from the PC loadings.</p>
<p>Is there a certain procedure to interpret such PC loadings from 2Der data? E.g., picking the largest bands (how many, typically?) or bands with certain characteristics, and assuming them to be "central" band of 3 in the 2Der spectra? Are there any data processing techniques for 2Der PC loadings, highlighting the most crucial regions?</p>
<p>Also during PCA of 1Der spectra I obtained useful groupings. Similar to my previous question, are there certain established procedures to interpret PC loadings from 1Der data?</p>
<p>Any help will be appreciated!</p>
<p>Johannes</p>
<p>P.S.: If this discussion fits better in the "Spectroscopy" Section, please feel free to move it there!</p>
</div></div></div>Wed, 05 Aug 2020 10:45:04 +0000, jvrijdag, http://impublications.co.uk/forum/procedures-interpretation-2der1der-loadings
Quotient regression (NR), questions and data sets
Dear friends in NIR spectroscopy,

<p>Data sets:<br />
David Hopkins gave in NIR news 2016 a nice introduction to Norris regression. He uses two wheat data sets "WheglA" and "WheglB", which seem to originate from Karl's Lab.</p>
<p>Emails to the contact address of David from the article bounce. Does anyone have these two datasets and can provide them for me?</p>
<p>Questions:<br />
Does anyone know whether Karl used any kind of scatter correction prior to derivatives? From a careful reading, my impression is that the only modification applied to the absorption data is the derivative.</p>
<p> </p>
<p>Yours</p>
<p>Peter</p>
</div></div></div>Sun, 27 Oct 2019 12:54:08 +0000, ptillmann, http://impublications.co.uk/forum/quotient-regression-nr-questions-and-data-sets
Definition of bias in ISO 12099
<div class="field field-name-taxonomy-forums field-type-taxonomy-term-reference field-label-above"><div class="field-label">Forums: </div><div class="field-items"><div class="field-item even"><a href="/forums/chemometrics">Chemometrics</a></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Hello,<br />
(if there is still someone awake in this forum.)</p>
<p>In ISO 12099:2016 the definition of bias has changed (compared to ISO 12099:2007):</p>
<p>It was bias = sum (NIRS - ref) / n (2007),</p>
<p>it is bias = sum (ref - NIRS) / n (2016).</p>
<p>What seems to be a small change is in fact turning the world upside down (I don't assume our Australian hosts for NIR 2019 caused it).</p>
<p>In my understanding, a NIRS method with a bias of +1 unit results in NIRS values 1 unit above the targeted reference level. But the new formula changes this. My understanding is supported by literature from third parties and other industries.</p>
<p>Has anybody noticed this change?<br />
Does anybody know why it was changed?</p>
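<p>A minimal Python sketch of the sign flip, using the two formulas exactly as quoted in this post (not checked against the text of the standard). The helper name and the data are made up for illustration.</p>

```python
def mean_difference(a, b):
    """Mean of (a_i - b_i) over paired values."""
    return sum(x - y for x, y in zip(a, b)) / len(a)

ref = [12.0, 13.5, 11.25, 14.0]    # made-up reference values
nirs = [r + 1.0 for r in ref]      # NIRS reads 1 unit above the reference

bias_2007 = mean_difference(nirs, ref)   # sum(NIRS - ref) / n
bias_2016 = mean_difference(ref, nirs)   # sum(ref - NIRS) / n

print("bias, 2007 formula:", bias_2007)
print("bias, 2016 formula:", bias_2016)
```

<p>With the same data the two formulas give values of equal magnitude and opposite sign, so a reported bias is only interpretable if the convention is stated alongside it.</p>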
<p>Peter</p>
</div></div></div>Tue, 03 Sep 2019 16:27:38 +0000, ptillmann, http://impublications.co.uk/forum/definition-bias-iso-12099
PLS2 vs PLS1
Hi all,

<p>I am developing a PLS model to predict a multivariate Y set.</p>
<p>The PLS1 algorithm outperforms PLS2 by quite a lot. I was told this is common, but I am struggling to understand why.</p>
<p>Could someone explain this to me, in maths? P.S.: I used the default SIMPLS algorithm.</p>
<p>Thanks</p>
</div></div></div>Fri, 24 Aug 2018 22:25:10 +0000, duqqud, http://impublications.co.uk/forum/pls2-vs-pls1
PLS validation option using Thermo Method Generator software
<div class="field field-name-taxonomy-forums field-type-taxonomy-term-reference field-label-above"><div class="field-label">Forums: </div><div class="field-items"><div class="field-item even"><a href="/forums/chemometrics">Chemometrics</a></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Dear all,</p>
<p>For a short time now I have had to use the <strong>Method Generator </strong>software to build PLS models on a handheld NIR. The software is easy to use for someone with a basic chemometrics background. However, once you start looking at the calculations in detail, the information is not easy to find.</p>
<p>My question concerns the last step of generating the PLS model file. If you are familiar with the software, you know that there is an option called PLS Validation in the last Model parameter pop-up window (see image attached).</p>
<p>If I understand correctly, this option serves the same purpose as Hotelling's T2 and X-residual limits. In this software the statistics are called "scores (stdev)" and "Resid (stdev)". From the names I take them to be the standard deviation of the scores (which ones?) and the standard deviation of the residuals. However, the default values are far from what I expected: the software proposes 5 for scores (stdev) and 15 for Resid (stdev), which is huge compared with the results for my data set. Since these limits decide whether a predicted spectrum is considered valid or invalid, they are quite important parameters.</p>
<p>I tried to find out how these values are calculated, but I did not find any literature explaining it.</p>
<p>I hope someone in the community can help me understand how these values are calculated, because the Thermo user manual and other information say nothing about it.</p>
<p>Thank you in advance</p>
<p>Best Regards,</p>
<p>Juan G.</p>
</div></div></div><div class="field field-name-field-uploaded-images field-type-image field-label-above"><div class="field-label">Uploaded Images: </div><div class="field-items"><div class="field-item even"><img src="http://impublications.co.uk/sites/default/files/forum/Capture.PNG" width="423" height="92" alt="" title="Screenshot" /></div></div></div>Tue, 08 May 2018 20:10:25 +0000, JuanG, http://impublications.co.uk/forum/pls-validation-option-using-thermo-method-generator-software
[Question] Pre-processing internal standard
Hello everyone,
The OPUS software has an option to select which pre-processing to use. One of the choices is "Internal Standard".<br />
My question is: does anyone know what kind of pre-processing this is? I mean, the mathematics behind it, because I cannot find anything online related to this pre-processing.<br />
Thanks!<br />
</p>
</div></div></div>Thu, 25 Jan 2018 18:24:13 +0000, miguelG, http://impublications.co.uk/forum/question-pre-processing-internal-standard
Outlier detection using Mahalanobis [Question]
<div class="field field-name-taxonomy-forums field-type-taxonomy-term-reference field-label-above"><div class="field-label">Forums: </div><div class="field-items"><div class="field-item even"><a href="/forums/chemometrics">Chemometrics</a></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Hello everyone,<br />
<br />
Sorry if my question is too basic, but I have been struggling with a problem.<br />
I want to detect outliers, and I have been using the Quant software from OPUS (Bruker) to sort out the outliers for me. To build the calibration and prediction models I use /Toolbox for MATLAB.<br />
My question is: what is the mathematical formula for outlier detection in NIR spectra using the Mahalanobis distance with PLS?<br />
Could you please explain in some detail? I have researched books and papers and tried many approaches, but none reproduces the values obtained from OPUS. Maybe I am missing something...<br />
<br />
Any help is appreciated!</p>
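<p>For orientation, a minimal Python sketch of the textbook squared Mahalanobis distance computed on PLS scores, written out for two factors so that no matrix library is needed. The function name and the score values are made up, and this is not claimed to be the formula OPUS uses. Packages may differ in centering, in the covariance divisor or in an additional scaling of the distance, which alone could explain values that do not match.</p>

```python
def mahalanobis_sq_2d(scores_cal, point):
    """Squared Mahalanobis distance of a 2-factor score vector to the calibration scores."""
    n = len(scores_cal)
    m1 = sum(t[0] for t in scores_cal) / n
    m2 = sum(t[1] for t in scores_cal) / n
    # Sample covariance (divisor n - 1) of the two score columns.
    s11 = sum((t[0] - m1) ** 2 for t in scores_cal) / (n - 1)
    s22 = sum((t[1] - m2) ** 2 for t in scores_cal) / (n - 1)
    s12 = sum((t[0] - m1) * (t[1] - m2) for t in scores_cal) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = point[0] - m1, point[1] - m2
    # d' S^-1 d written out with the closed-form inverse of a 2 x 2 matrix.
    return (s22 * d1 ** 2 - 2.0 * s12 * d1 * d2 + s11 * d2 ** 2) / det

# Made-up calibration scores on two PLS factors (illustration only).
scores_cal = [(-1.2, 0.3), (0.4, -0.8), (1.1, 0.9),
              (-0.3, -0.5), (0.9, 0.2), (-0.9, -0.1)]

distances = [mahalanobis_sq_2d(scores_cal, t) for t in scores_cal]
mean_distance = sum(distances) / len(distances)
suspect = mahalanobis_sq_2d(scores_cal, (4.0, -3.0))   # hypothetical new sample

print("mean squared distance of calibration samples:", mean_distance)
print("squared distance of the suspect sample:", suspect)
```

<p>With this definition the mean squared distance of the calibration samples equals k(n-1)/n for k factors and n samples, which gives a quick check of which convention a program follows before any outlier limit is compared.</p>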
</div></div></div>Mon, 05 Jun 2017 16:15:54 +0000, miguelG, http://impublications.co.uk/forum/outlier-detection-using-mahalanobis-question
data interpretation of TERMO MICROPHAZIR
Hi all.
I'm a new user in NIR development, and I'm using the Method Generator software to analyze data from the Microphazir equipment. It gives me the following results (picture), and I have some questions about their interpretation:<br />
Is RMSE the same as RMSEP?<br />
And do the Slope and Offset values correspond to the coefficients of a line equation?<br />
For example, from the picture:<br />
Y= 23.30146 + 0.7118447X + 0.1922865 (error) <br />
is this correct?</p>
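<p>As a point of comparison, here is a minimal Python sketch of how slope, offset and RMSE are commonly obtained from a predicted-versus-reference plot. Whether Method Generator defines them this way is an assumption, and the function name and data are made up. In this reading the slope and offset describe the fitted line, while RMSE summarises the prediction errors and is not a third term of the line equation.</p>

```python
import math

def line_and_rmse(reference, predicted):
    """Least-squares line of predicted on reference, plus RMSE of (predicted - reference)."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, predicted))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(reference, predicted)) / n)
    return slope, offset, rmse

# Made-up reference and predicted values (illustration only).
reference = [70.0, 75.0, 80.0, 85.0, 90.0]
predicted = [73.2, 76.1, 80.4, 83.3, 87.0]

slope, offset, rmse = line_and_rmse(reference, predicted)
print(f"predicted = {offset:.3f} + {slope:.3f} * reference")
print(f"RMSE = {rmse:.3f}")
```

<p>RMSE computed on the calibration samples is usually reported as RMSEC, and on an independent test set as RMSEP, so which of the two a program shows depends on the samples behind the plot.</p>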
<p>Thanks in advance</p>
<p> Jorge<br />
</div></div></div><div class="field field-name-field-uploaded-images field-type-image field-label-above"><div class="field-label">Uploaded Images: </div><div class="field-items"><div class="field-item even"><img src="http://impublications.co.uk/sites/default/files/forum/20170403_121415%5B1%5D_opt.jpg" width="400" height="225" alt="" /></div></div></div>Mon, 03 Apr 2017 15:34:19 +0000, ForestBiotech, http://impublications.co.uk/forum/data-interpretation-termo-microphazir
Advice on ANN method
Dear all,
So far I have only developed calibration models in WINISI using the PLS algorithm. Now I would like to learn more about the ANN method and other software. Can you give me some advice on:<br />
- Which software should I consider (advantages, disadvantages, license fee, annual fee)?<br />
- How many samples are enough to build a non-linear model?<br />
- Are there resources available for learning this software?<br />
Thanks in advance for your help,<br />
Nlt</p>
</div></div></div>Thu, 16 Feb 2017 03:30:01 +0000, Nhi, http://impublications.co.uk/forum/advise-ann-method