Alphabet Soup or Calibration Acronyms

As process spectroscopy has grown, so has the number of acronyms for measurement methods and their associated mathematics, not to mention the acronyms for conferences and scientific organizations. For a newcomer, digesting the alphabet soup in presentations and papers on process spectroscopy applications can be daunting. This blog lists the calibration-related acronyms you are most likely to encounter in conversations about process spectroscopy, with a short definition where relevant. The list below is in alphabetical order.

In practice, PLS and MLR are used most of the time in NIR applications, and they are the two calibration methods that Guided Wave uses in all of its analyzer applications. Other techniques exist, however, and we hope this brief list makes conversations about them easier to follow.

Calibration and Regression Methods Acronyms

ANN – Artificial Neural Networks An Artificial Neural Network (ANN) is an information processing method that is inspired by the way biological nervous systems process information. An ANN is composed of a large number of highly interconnected processing elements (neurons) working together to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning involves adjustments to the connections that exist between the neurons. In more practical terms neural networks are non-linear statistical data

CLS – Classical Least Squares CLS (the K Matrix method) is a regression method that assumes Beer’s Law applies, i.e., that absorbance at each wavelength is proportional to component concentration. In its simplest form, a CLS model requires that all interfering chemical components be known and included in the calibration data set.
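As a rough illustration of the CLS idea, the sketch below builds noise-free absorbances from Beer’s Law (A = C K), estimates the K matrix from a calibration set, and then solves for the concentrations of an unknown. The spectra, concentrations, and variable names are made up for the example and NumPy is assumed to be available; this is not Guided Wave's implementation.

```python
import numpy as np

# Hypothetical pure-component spectra: 2 components x 4 wavelengths.
K_true = np.array([[1.0, 0.5, 0.1, 0.0],
                   [0.0, 0.2, 0.8, 1.0]])

# Calibration concentrations: 5 samples x 2 components (all components known).
C_cal = np.array([[0.1, 0.9],
                  [0.3, 0.7],
                  [0.5, 0.5],
                  [0.7, 0.3],
                  [0.9, 0.2]])
A_cal = C_cal @ K_true  # Beer's Law: absorbance = concentration x K

# Calibration: estimate K by least squares from known C and measured A.
K_hat, *_ = np.linalg.lstsq(C_cal, A_cal, rcond=None)

# Prediction: solve a_unknown = c @ K_hat for the unknown concentrations c.
a_unknown = np.array([0.4, 0.6]) @ K_true
c_pred, *_ = np.linalg.lstsq(K_hat.T, a_unknown, rcond=None)
print(c_pred)
```

Because every component that absorbs is included in C_cal, the concentrations used to build a_unknown are recovered; leaving a component out of the calibration would bias the result.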

ILS – Inverse Least Squares ILS (the P Matrix method) is a regression method that applies the inverse of Beer’s Law. It assumes that component concentration is a function of absorbance. An ILS model has a significant advantage over CLS in that not all components need to be known and included in the calibration set.
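The advantage over CLS can be sketched with a small made-up example: an interferent varies in the calibration samples, but only the analyte concentration is given to the model. The spectra, concentrations, and names below are hypothetical, and NumPy is assumed.

```python
import numpy as np

# Hypothetical spectra at 2 wavelengths: the analyte and an unidentified interferent.
k_analyte = np.array([1.0, 0.4])
k_interf = np.array([0.2, 0.9])

# Calibration set: the interferent varies but its concentration is never recorded.
c_analyte = np.array([0.1, 0.4, 0.6, 0.8, 1.0])
c_interf = np.array([0.5, 0.1, 0.9, 0.3, 0.7])
A_cal = np.outer(c_analyte, k_analyte) + np.outer(c_interf, k_interf)

# ILS: model concentration as a linear function of absorbance, c = A p.
p, *_ = np.linalg.lstsq(A_cal, c_analyte, rcond=None)

# Predict the analyte in a new sample with a different interferent level.
a_new = 0.55 * k_analyte + 0.25 * k_interf
c_pred = float(a_new @ p)
print(c_pred)
```

The regression vector p learns to cancel the interferent's contribution because that contribution varied in the calibration data, even though it was never named.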

kNN – k Nearest Neighbor kNN is a classification scheme where a Euclidean distance metric is used to determine the classification. The distance metric calculated for an unknown sample is an indication of the degree of similarity to other samples.
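A minimal sketch of the idea, using only the Python standard library: the unknown is assigned the majority class among its k closest training samples. The function name knn_classify, the two-variable samples, and the class labels are invented for the illustration.

```python
import math
from collections import Counter

def knn_classify(train, labels, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training points."""
    # Euclidean distance from the unknown to every training sample, nearest first.
    distances = sorted(
        (math.dist(point, sample), label) for point, label in zip(train, labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-variable samples (e.g. absorbance at two wavelengths).
train = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.18),
         (0.80, 0.90), (0.85, 0.95), (0.78, 0.88)]
labels = ["grade A", "grade A", "grade A", "grade B", "grade B", "grade B"]

predicted = knn_classify(train, labels, (0.20, 0.30))
print(predicted)
```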

LWR – Locally Weighted Regression In locally weighted regression, sample points are weighted by their proximity to the current sample point in question. A regression model is then computed using the weighted points. In some cases, for example when the response is nonlinear over the calibration range, LWR models can be more accurate than a single global model.
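The sketch below shows one common way to do this for a single predictor: Gaussian weights centered on the query point, followed by a weighted straight-line fit. The function name lwr_predict, the bandwidth value, and the curved test data are assumptions chosen for the illustration; NumPy is assumed.

```python
import numpy as np

def lwr_predict(x_train, y_train, x_query, bandwidth=0.3):
    """Fit a weighted straight line around x_query and evaluate it there."""
    # Gaussian proximity weights: points near the query count the most.
    w = np.exp(-0.5 * ((x_train - x_query) / bandwidth) ** 2)
    X = np.column_stack([np.ones_like(x_train), x_train])
    # Weighted least squares via square-root weighting of rows.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
    return beta[0] + beta[1] * x_query

# Hypothetical nonlinear response: y = x^2 on a regular grid.
x = np.linspace(0.0, 3.0, 31)
y = x ** 2

y_local = lwr_predict(x, y, 1.5)

# A single global straight line for comparison.
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_global = b[0] + b[1] * 1.5

print(y_local, y_global, 1.5 ** 2)
```

On this curved data the local fit lands much closer to the true value than the global line, which is the situation where LWR tends to pay off.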

MCR – Multivariate Curve Resolution Multivariate Curve Resolution is a group of techniques that can be used to resolve mixtures by determining the number of constituents present and what their individual response profiles (spectra, pH profiles, time profiles, elution profiles) look like. It also provides an estimate of the concentrations. This can all be done with no prior information about the nature and composition of the mixtures.

MLR – Multiple Linear Regression MLR is a regression method for relating the variations in a response variable (concentrations or properties) to the variations of several predictors (spectral data). The goal is to be able to measure the spectral data on future samples and predict the concentrations or properties. One requirement for MLR is that the predictor variables (spectral data) must be linearly independent.
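As an illustration, the sketch below fits an MLR model with an intercept to absorbances at two selected wavelengths and checks the linear-independence requirement with a rank test before fitting. The data and coefficients are invented and noise-free; NumPy is assumed.

```python
import numpy as np

# Hypothetical absorbances at two selected wavelengths for 5 calibration samples.
X = np.array([[0.10, 0.80],
              [0.25, 0.60],
              [0.40, 0.55],
              [0.55, 0.30],
              [0.70, 0.20]])
# Noise-free property values generated from known coefficients for the demo.
y = 2.0 + 3.0 * X[:, 0] - 1.5 * X[:, 1]

# Design matrix with an intercept column.
design = np.column_stack([np.ones(len(X)), X])

# MLR needs linearly independent predictors: the design matrix must have full column rank.
if np.linalg.matrix_rank(design) < design.shape[1]:
    raise ValueError("predictors are linearly dependent; MLR is not well defined")

coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Predict the property for a future sample from its spectral data.
x_new = np.array([0.30, 0.50])
y_pred = coef[0] + x_new @ coef[1:]
print(coef, y_pred)
```

With full spectra, neighboring wavelengths are usually highly correlated, which is one reason MLR is normally applied to a few selected wavelengths.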

Instrument and Technology Acronyms

NIR-O – Near InfraRed Online process analyzer NIR-O is Guided Wave’s next-generation spectrometer and an evolutionary step up from the M412. It is suitable for online analysis of most processes and process streams. The built-in capacity to add sampling points (up to 12 total channels) within the same process or across processes, in any combination, gives users the flexibility to invest in exactly the capacity they require now and minimizes the investment needed for any future expansion. NIR-O operates in the xNIR range of 1000-2100 nm, using process-proven TE-cooled InGaAs detector technology.

FT-NIR – An alternative to dispersive spectrometers, Fourier transform spectroscopy is an effective tool for lab analysis.

DG-NIR – Dispersive grating technology was developed over 100 years ago and is the de facto standard for real-time monitoring of in-situ process conditions.

Statistical and Mathematical Acronyms

OSC – Orthogonal Signal Correction Orthogonal signal correction is a technique originally developed and used for spectral data to remove variation that is orthogonal (non-correlated) to a particular parameter of interest. This is one way to remove interferences from spectral data prior to calibration.

PC – Principal Component /
PCA – Principal Component Analysis Principal component analysis (PCA) is a bilinear modeling method that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
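One common way to compute principal components is a singular value decomposition of the mean-centered data, sketched below. The simulated data (three correlated variables driven by one underlying factor) and the variable names are assumptions for the illustration; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples x 3 correlated variables driven by one factor.
t = rng.normal(size=50)
X = np.column_stack([t, 2.0 * t, -t]) + 0.05 * rng.normal(size=(50, 3))

# PCA via SVD of the mean-centered data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * s      # sample coordinates on each principal component
loadings = Vt       # each row is one principal component direction
explained = s ** 2 / np.sum(s ** 2)  # fraction of variance per component

print(explained)
```

Here nearly all of the variance falls on the first component, and the score columns are uncorrelated with one another.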

PCR – Principal Component Regression PCR takes PCA one step further by performing a regression between the principal components and one or more response variables (concentrations or properties). A PCR model can then be used to predict concentrations or properties for unknown samples.
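The two steps can be sketched as follows: compute PCA scores for the calibration spectra, then regress the response on the first few scores. The simulated two-component spectra, the choice of two components, and the helper name pcr_predict are assumptions for the illustration; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration set: 40 spectra x 20 wavelengths from 2 components.
C = rng.uniform(0.0, 1.0, size=(40, 2))   # concentrations
S = rng.uniform(0.0, 1.0, size=(2, 20))   # pure-component spectra
X = C @ S + 0.001 * rng.normal(size=(40, 20))
y = C[:, 0]                               # property to predict

# Step 1: PCA on the mean-centered spectra.
x_mean, y_mean = X.mean(axis=0), y.mean()
U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
k = 2                                     # number of components kept (assumed)
P = Vt[:k].T                              # loadings, 20 x k
T = (X - x_mean) @ P                      # scores, 40 x k

# Step 2: regress the centered response on the scores.
q, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)

def pcr_predict(x_new):
    """Project a new spectrum onto the loadings and apply the regression."""
    return y_mean + (x_new - x_mean) @ P @ q

y_pred = pcr_predict(np.array([0.3, 0.6]) @ S)
print(y_pred)
```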

PLS – Partial Least Squares Partial Least Squares Regression is a bilinear modeling method where information in the original X variables (spectral data) is projected onto a small number of underlying “latent” variables called PLS components. The Y variables (concentrations or properties) are used in estimating the “latent” variables to ensure that the first components are those that are most relevant for predicting the Y-variables. Interpretation of the relationship between the X and Y variables is then simplified as this relationship is concentrated on the smallest possible number of components.
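For a single Y variable, the latent variables can be extracted one at a time with the NIPALS-style PLS1 procedure sketched below. Note how y enters the weight vector w, which is what steers the components toward directions relevant for prediction. The function name pls1_fit and the simulated data are assumptions for the illustration, not the algorithm of any particular software package; NumPy is assumed.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Minimal PLS1 (single y) fit; returns means and a regression vector."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                  # weights: X directions that covary with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                     # scores for this latent variable
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        q_a = yr @ t / tt              # y loading
        Xr = Xr - np.outer(t, p)       # deflate X and y before the next component
        yr = yr - q_a * t
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)  # regression vector in the original X space
    return x_mean, y_mean, b

rng = np.random.default_rng(2)
# Hypothetical calibration set: 40 spectra x 20 wavelengths from 2 components.
C = rng.uniform(0.0, 1.0, size=(40, 2))
S = rng.uniform(0.0, 1.0, size=(2, 20))
X = C @ S + 0.001 * rng.normal(size=(40, 20))
y = C[:, 0]

x_mean, y_mean, b = pls1_fit(X, y, n_components=2)
y_pred = y_mean + (np.array([0.3, 0.6]) @ S - x_mean) @ b
print(y_pred)
```

In practice the number of components is chosen by validation rather than fixed in advance as it is here.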

RMSEC – Root Mean Square Error of Calibration
RMSEP – Root Mean Square Error of Prediction
RMSEPcv – Root Mean Square Error of Prediction based on Cross Validation
SEC – Standard Error of Calibration
SEP – Standard Error of Prediction

These terms are all used to evaluate the performance of calibrations. The SEP terms indicate how accurately a calibration model will predict future samples. They are calculated from predicted results for true unknown samples, that is, samples that were not used to build the model. The RMSEP is an average expected prediction error. The SEC terms differ slightly in that they give the prediction error for the calibration samples used in developing the model. The relationship between RMSEP and SEP (and likewise between RMSEC and SEC) is RMSEP² = SEP² + bias².
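The relationship can be checked numerically with a small made-up validation set. One caveat, stated here as an assumption about conventions: the identity is exact when SEP is computed with n in the denominator. Many references divide by n - 1 instead, in which case it holds only approximately. Both versions are computed below using only the standard library.

```python
import math

# Hypothetical validation set: reference (lab) values and model predictions.
reference = [10.0, 12.0, 15.0, 11.0, 14.0]
predicted = [10.4, 12.1, 15.5, 11.2, 14.3]

n = len(reference)
errors = [p - r for p, r in zip(predicted, reference)]

bias = sum(errors) / n                                    # mean error
rmsep = math.sqrt(sum(e ** 2 for e in errors) / n)        # root mean square error
ss_corrected = sum((e - bias) ** 2 for e in errors)       # bias-corrected sum of squares
sep_n = math.sqrt(ss_corrected / n)                       # SEP with divisor n
sep = math.sqrt(ss_corrected / (n - 1))                   # SEP with divisor n - 1

print(rmsep ** 2, sep_n ** 2 + bias ** 2, sep ** 2 + bias ** 2)
```

A large bias relative to SEP suggests a systematic offset in the predictions, while a large SEP points to scatter.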

SVM – Support Vector Machines Support Vector Machines are a set of related supervised learning methods used for classification and regression. They belong to a family of generalized linear classifiers. These methods are finding their way into calibration programs and have shown great promise in their power to minimize prediction error for complex calibrations.

ANN – Artificial Neural Networks
CLS – Classical Least Squares
ILS – Inverse Least Squares
kNN – k Nearest Neighbor
LR – Linear Regression
LS – Least Squares
LWR – Locally Weighted Regression
MCR – Multivariate Curve Resolution
MLR – Multiple Linear Regression
OSC – Orthogonal Signal Correction
PC – Principal Component
PCA – Principal Component Analysis
PCR – Principal Component Regression
PLS – Partial Least Squares
RMSEC – Root Mean Square Error of Calibration
RMSEP – Root Mean Square Error of Prediction
RMSEPcv – Root Mean Square Error of Prediction based on Cross Validation
SEC – Standard Error of Calibration
SEP – Standard Error of Prediction (Performance)
SVD – Singular Value Decomposition
SVM – Support Vector Machines