March 2002 editorial:
Everything you need to know about Orthogonal Signal Correction (OSC) filters
- and how they can improve interpretation of your data
Johan Trygg, Ph.D.
Last month's editorial discussed why Partial Least Squares
Projections to Latent Structures (PLS) sometimes needs more
PLS components than there are Y-variables. It was shown that
this is due to strong systematic but irrelevant (Y-orthogonal)
variation in X. It was also shown that if PLS needs more than
one component per Y-variable, the interpretation (not the
prediction) of the PLS model suffers in direct relation to
the number of additional PLS components needed.
This month, I want to describe a new set of pre-processing
methods that can be used to remove the systematic Y-orthogonal
variation from X. These methods are the Orthogonal Signal
Correction (OSC) filters. I will describe what they are, and
why and when they are useful. As a cliffhanger to get your
attention for next month's editorial, I will also explain why
these OSC filters are not optimal for the two-block (X-Y)
situation, contrary to what one might expect.
What/why/how concerning OSC methods?
Data collected from complicated samples
or in complicated processes contains variation from
many sources and of several types. Pre-processing
methods can be applied in such situations to enhance
the relevant information to make resulting models
simpler and easier to interpret. Orthogonal signal
correction (OSC) filters were developed to remove
strong structured (i.e. systematic) variation
in X that is not correlated to Y.
That is, they remove structured Y-orthogonal variation
from X, in such a way that the filter can
be applied to future data. The OSC filters need
information about Y. If no Y exists,
it may be possible to create it by adding dummy
variables (1/0). The OSC methods can also be used
to remove X-orthogonal variation in Y. Wold
et al.[38] published the original work on OSC. Later,
a number of papers have described alternative OSC
methods [39-42]. Svensson et al.[128] and Goicoechea
et al. [129] have compared OSC related methods.
In the last few years, as more OSC filters
have been reported in the literature, there has been
some confusion because one may initially think that
the different OSC filters produce the same results.
They do not. Also, to add to the confusion, some reported
pre-processing methods are claimed to be OSC filters,
although they are not. Here, I will only discuss reported
methods that are OSC filters, according to the definition
below. A detailed comparison of the different methods
is also given.
The general OSC model
The general, single-component OSC model of X can be expressed by:
$$X = t_{osc}\,p_{osc}' + E, \quad \text{where} \quad t_{osc} = X w_{osc}, \quad \|w_{osc}\| = 1$$
Here $t_{osc}$, $p_{osc}$ and $w_{osc}$ represent the OSC component.
More than one OSC component can be identified and removed from X.
For additional OSC components, the filter is applied to the E
matrix. This is the general model for all OSC methods. The methods
differ in the selection of $w_{osc}$, resulting in different scores
($t_{osc}$) and loadings ($p_{osc}$). The OSC component is similar
to the standard PLS component, as it has two sets of loading
vectors. The difference is that the score vectors are orthogonal
to Y.
The OSC filter must fulfill three requirements. It must:
1.) contain (large) systematic variation in X;
2.) be predictive using X (in order to be applied to future data);
3.) be orthogonal to Y.

Figure. Schematic structure of the general OSC model. Structured
Y-orthogonal variation in X is captured in $t_{osc}\,p_{osc}'$
and is then removed from X.
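The general model can be illustrated with a short NumPy sketch that removes one OSC component from X once a weight vector is available, and then applies the same filter to new data. It assumes column-centered data. The least-squares loading $p_{osc} = X't_{osc}/(t_{osc}'t_{osc})$ and the function names are assumptions made for this sketch; how $w_{osc}$ is chosen is exactly what separates the individual OSC methods.

```python
import numpy as np

def fit_osc_component(X, w_osc):
    """Split X into t_osc p_osc' + E for a given weight vector (hypothetical helper)."""
    w = w_osc / np.linalg.norm(w_osc)   # enforce ||w_osc|| = 1
    t = X @ w                           # score vector t_osc = X w_osc
    p = X.T @ t / (t @ t)               # least-squares loading (assumption of this sketch)
    E = X - np.outer(t, p)              # X with the OSC component removed
    return w, t, p, E

def apply_osc_component(X_new, w, p):
    """Apply an already fitted OSC component to future data."""
    t_new = X_new @ w                   # predicted score for the new rows
    return X_new - np.outer(t_new, p)
```

For a second OSC component, the same two functions would be called on E instead of X.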
Description of current OSC methods
Two different approaches are used to construct OSC filters.
I have decided to name them the indirect and the direct
approach, based on the way they estimate the OSC component.
The indirect OSC approach (the original approach)
The original OSC approach is to first find a Y-orthogonal
score vector $t_{osc}$, and then use it to find $w_{osc}$,
usually by means of PCA or PLS/PCR regression, see
Figure below. Two different strategies can be employed:
a.) Any suitable vector t is orthogonalized to Y,
$t_{osc} = t - Y (Y'Y)^{-1} Y' t$, and then a multi-component
PLS or PCR regression model is used to predict $t_{osc}$
from X. The regression coefficient vector of the model is the
$w_{osc}$ vector. This is the strategy used by Wold et al. and
Sjöblom et al.
b.) Columns in X are first orthogonalized to Y using a PLS or
PCR regression model, and then PCA is performed on the
Y-orthogonal X matrix. The first PCA component gives the
Y-orthogonal score vector $t_{osc}$. This is the strategy that
we used for POSC [X1] and that Westerhuis et al. used for DOSC.
In order to avoid overfitting, the indirect approach
identifies OSC components that are not strictly Y-orthogonal.
The strict requirement for Y-orthogonality of the $t_{osc}$
vector may be relaxed if the structured Y-orthogonal variation
in X is modeled. The $t_{osc}$ vector can be considered as
being systematically Y-orthogonal when the noise in X and Y is
taken into consideration [38, 42]. In the description of the
different OSC filters below, a single y vector is considered
for simplicity, even though all methods can handle a Y matrix.
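The orthogonalization used in strategy (a), $t_{osc} = t - Y (Y'Y)^{-1} Y' t$, can be sketched in NumPy as follows. The function name is hypothetical, and the least-squares solver is used in place of an explicit inverse of Y'Y purely for numerical convenience.

```python
import numpy as np

def orthogonalize_to_Y(t, Y):
    """Return t - Y (Y'Y)^-1 Y' t for a single y vector or a Y matrix."""
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]                              # treat a single y as an n x 1 matrix
    coef, *_ = np.linalg.lstsq(Y, t, rcond=None)    # least-squares solution of Y c = t
    return t - Y @ coef                             # residual is orthogonal to every column of Y
```

The result is strictly Y-orthogonal. As the text notes, that strictness is lost again once the vector is predicted from X by a regression model with a limited number of components.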
The OSC method, Wold et al.
The OSC method [38] introduced by Wold et al. identifies
a suitable Y-orthogonal vector $t_{osc}$ through an 'internal'
iterative procedure. The initial $t_{osc}$ is the first PC
score vector t of X, orthogonalized to y. In each iteration,
a PLS model with a set number of PLS components is calculated
to predict $t_{osc}$ from X. After each iteration, a convergence
check determines whether the predicted $t_{osc}$ is the same as
the last orthogonalized $t_{osc}$. The main problem associated
with this procedure is overfitting of the estimated components.
Cross-validation, or any other validation method, is not usually
implemented. The correct number of PLS components in the
'internal' PLS model (step 4 in Appendix A of [38]) for
estimating $t_{osc}$ is therefore difficult to determine. This
increases the risk of overfitting, or even degradation, of the
resultant calibration models.
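The iteration described above can be sketched roughly as follows for a single y vector and centered data. This is an illustrative reading of the text, not a reproduction of Appendix A of [38]: the compact NIPALS PLS1 routine, the function names, the tolerance and the iteration cap are all assumptions of the sketch.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression vector b of a PLS1 model (NIPALS), so that X @ b approximates y."""
    Xk, yk = X.copy(), y.copy()
    W, P, c = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)              # PLS weight vector
        t = Xk @ w                          # PLS score
        p = Xk.T @ t / (t @ t)              # PLS loading
        c.append(yk @ t / (t @ t))          # inner regression coefficient
        Xk = Xk - np.outer(t, p)            # deflate X
        yk = yk - c[-1] * t                 # deflate y
        W.append(w)
        P.append(p)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(c))

def osc_wold(X, y, n_pls=2, tol=1e-6, max_iter=50):
    """One OSC component via the indirect, iterative procedure (sketch)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    t = U[:, 0] * s[0]                              # first PC score vector of X
    for _ in range(max_iter):
        t_orth = t - y * (y @ t) / (y @ y)          # orthogonalize to y
        w = pls1_coefficients(X, t_orth, n_pls)     # 'internal' PLS model
        t = X @ w                                   # predicted t_osc
        if np.linalg.norm(t - t_orth) <= tol * np.linalg.norm(t_orth):
            break                                   # predicted equals orthogonalized
    p = X.T @ t / (t @ t)
    return w, t, p, X - np.outer(t, p)
```

The parameter `n_pls` is the number of 'internal' PLS components discussed in the text. With few components the predicted score is only approximately orthogonal to y, and with many components the risk of overfitting grows.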
OSC method, Sjöblom et al.
The OSC method of Sjöblom et al. [40] differs slightly from
the approach described by Wold et al. Instead of orthogonalizing
the first PC score vector t to y, Sjöblom et al. attempt to
find a principal component score vector t in X that is
orthogonal to y directly. This is done using the iterative
NIPALS PCA procedure, with an added orthogonalization step
$t_{osc} = t - (t'y/(y'y))\,y$. This iteration can only
converge ($t_{osc} = t$) if the variation of the y component
and the Y-orthogonal variation in X are orthogonal in both rows
and columns in X. Therefore, the next step calculates a standard
PLS model with $t_{osc}$ as the y-vector. This method has
problems similar to those described for the Wold et al. method.
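A minimal sketch of a NIPALS PCA loop with the added orthogonalization step is given below, for a single y vector. The starting vector, tolerance, iteration cap and function name are assumptions of the sketch, and the subsequent PLS model of $t_{osc}$ is left out.

```python
import numpy as np

def nipals_orthogonal_score(X, y, tol=1e-10, max_iter=500):
    """NIPALS PCA iteration with t orthogonalized to y in every pass (sketch)."""
    t = X[:, np.argmax(X.var(axis=0))].copy()   # start from the highest-variance column
    t_osc = t - y * (t @ y) / (y @ y)
    for _ in range(max_iter):
        p = X.T @ t_osc / (t_osc @ t_osc)       # loading from the current score
        p /= np.linalg.norm(p)
        t = X @ p                               # ordinary NIPALS score update
        t_new = t - y * (t @ y) / (y @ y)       # added orthogonalization step
        done = np.linalg.norm(t_new - t_osc) <= tol * np.linalg.norm(t_new)
        t_osc = t_new
        if done:
            break
    return t_osc, t
```

The function returns both vectors so that the gap between $t_{osc}$ and t can be inspected. In general they differ, which is why a regression model is still needed to predict $t_{osc}$ from X.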
Direct orthogonal signal correction, DOSC
The DOSC method [42] finds a least squares estimate of y from
X, so that $b = X^{+}y$, where $X^{+}$ is the Moore-Penrose
inverse of X. This procedure divides y into two parts, one that
can be predicted by X and one that is orthogonal to X. The
second step is to project X onto $\hat{y} = Xb$ to determine
the loading vector $p = X'\hat{y}/(\hat{y}'\hat{y})$, and then
remove the $\hat{y}p'$ component from X: $E = X - \hat{y}p'$.
PCA is used to find the largest score vector in E, which defines
$t_{osc}$. Having found $t_{osc}$, an 'internal' PCR model is
used to predict $t_{osc}$ from X, to find $w_{osc}$ (the
regression coefficients). However, the absolute Y-orthogonality
requirement needs to be relaxed in order not to overfit the PCR
model. Westerhuis et al. discuss the requirement of absolute
Y-orthogonality.
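The DOSC steps can be sketched as follows for a single y vector. The function names and the SVD-based PCR helper are assumptions of the sketch, and it should be read as an illustration of the description above rather than as the published algorithm of [42].

```python
import numpy as np

def dosc_score(X, y):
    """Find the DOSC score vector t_osc for a single y (sketch)."""
    b = np.linalg.pinv(X) @ y               # least-squares estimate, b = X+ y
    y_hat = X @ b                           # part of y that X can predict
    p = X.T @ y_hat / (y_hat @ y_hat)       # loading of X on y_hat
    E = X - np.outer(y_hat, p)              # X with the y_hat direction removed
    U, s, _ = np.linalg.svd(E, full_matrices=False)
    return U[:, 0] * s[0], E                # largest PCA score vector of E

def pcr_weights(X, t, n_components):
    """'Internal' PCR model: regression vector w_osc predicting t from X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = n_components
    return Vt[:k].T @ ((U[:, :k].T @ t) / s[:k])
```

In this sketch the score from `dosc_score` is exactly orthogonal to y. A PCR model with all components reproduces it exactly, while a model with fewer components only approximates it, which is the relaxation of absolute Y-orthogonality mentioned in the text.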
Projected orthogonal signal correction, POSC
In [X1], we independently developed an OSC method which is
similar to DOSC but is more practical. This method is called
'projected orthogonal signal correction' (POSC). The first step
of the method is to orthogonalize the columns in X to y using
a regular PLS model between X and y. Here the method differs
from DOSC, which uses the Moore-Penrose inverse. The PLS
regression coefficients are used to predict y from X:
$y = \hat{y} + f = Xb + f$. The second step is to project X
onto $\hat{y}$, $p = X'\hat{y}/(\hat{y}'\hat{y})$, and remove
the $\hat{y}p'$ component from X: $E = X - \hat{y}p'$. PCA is
used to find the largest score vector in E, which represents
the largest Y-orthogonal component in E with the corresponding
$t_{osc}$. Unlike OSC (Wold et al. and Sjöblom et al.) and
DOSC, it is not necessary to calculate another regression
model to predict $t_{osc}$. It can be predicted using the
corresponding PCA loading, here denoted $w_{osc}$. This is not
possible for DOSC, because of the overfit associated with the
Moore-Penrose inverse calculations. POSC can be seen as a
special case of O-PLS [X1].
The direct OSC approach
Figure. The direct OSC approach consists of two main steps.
(1) Find the X-Y subspace, $W_{XY} = X'Y$.
(2) Any row vector in X orthogonalized to $W_{XY}$ is a
potential $w_{osc}$ vector.
This approach determines the OSC components in a more
straightforward way, and they are guaranteed to be orthogonal
to Y. No 'internal' regression model is needed, so there are
no associated overfit problems. The direct OSC approach also
offers the possibility of creating specific OSC filters.
The first step is to find the common subspace $W_{XY}$ of X
and Y by $W_{XY} = X'Y$. For a single y vector, the normalized
$w_{XY}$ vector is identical to the first PLS loading weight w.
The X-Y subspace is important, because any vector orthogonal
to $W_{XY}$, here denoted $w_{osc}$, will yield a Y-orthogonal
score vector $t_{osc} = Xw_{osc}$ (see the Figure for the
direct OSC approach). This approach is used in the OSC method
by Fearn et al.
Proof: Any vector p which is orthogonalized to
$w = X'y/\|y'X\|$ will yield a Y-orthogonal score vector
$t_{osc} = Xw_{osc}$:
$$y't_{osc} = y'Xw_{osc}$$
Substituting $w_{osc} = p - w\,(w'p)/(w'w)$ and simplifying with $\|w\| = 1$ gives
$$y't_{osc} = y'X(p - ww'p)$$
Substituting $y'X = \|y'X\|\,w'$ gives
$$y't_{osc} = \|y'X\|\,(w'p - w'w\,w'p)$$
Simplifying with $w'w = 1$ gives
$$y't_{osc} = \|y'X\|\,(w'p - w'p) = 0$$
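The proof can be checked numerically with a few lines of NumPy. The function name and the random test data are assumptions of this sketch; the vector p is deliberately arbitrary, since the result holds for any p.

```python
import numpy as np

def direct_w_osc(X, y, p):
    """Orthogonalize an arbitrary vector p to w = X'y / ||X'y|| (sketch)."""
    w = X.T @ y
    w /= np.linalg.norm(w)          # normalized X-Y direction, ||w|| = 1
    return p - w * (w @ p)          # w_osc = p - w w'p

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 6))
y = rng.normal(size=25)
p = rng.normal(size=6)              # any vector in the variable space of X

w_osc = direct_w_osc(X, y, p)
t_osc = X @ w_osc
print(abs(y @ t_osc))               # zero up to floating-point rounding
```

No regression model is involved at any point, which is why the direct approach carries no 'internal' overfit risk.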
OSC method, Fearn and Höskuldsson
Fearn [39], Höskuldsson [41] and, previously, Rao [130]
created an OSC filter that maximizes the length of
$t_{osc} = Xw_{osc}$, with $\|w_{osc}\| = 1$. The $w_{osc}$
vector is found by first orthogonalizing X to w,
$E = X - tw'$, where $t = Xw$. Then PCA is used to find the
largest principal component in E. The resulting loading vector
is $w_{osc}$ and gives the largest Y-orthogonal score vector
$t_{osc} = Xw_{osc}$, $\|w_{osc}\| = 1$. Note that maximizing
the length of $t_{osc}$ is not equivalent to removing the
largest Y-orthogonal component ($t_{osc}\,p_{osc}'$).
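A minimal NumPy sketch of this filter for a single y vector follows. The function name is hypothetical, PCA is carried out through a singular value decomposition, and the loading $p_{osc}$ is computed by least squares as an assumption of the sketch.

```python
import numpy as np

def osc_direct_pca(X, y):
    """One OSC component by the direct approach (sketch, single y)."""
    w = X.T @ y
    w /= np.linalg.norm(w)                  # first PLS weight, ||w|| = 1
    t = X @ w                               # corresponding score
    E = X - np.outer(t, w)                  # X orthogonalized to w
    _, _, Vt = np.linalg.svd(E, full_matrices=False)
    w_osc = Vt[0]                           # first PCA loading of E, unit length
    t_osc = X @ w_osc                       # Y-orthogonal score vector
    p_osc = X.T @ t_osc / (t_osc @ t_osc)   # least-squares loading (assumption)
    return w_osc, t_osc, p_osc
```

Because the loading of E is orthogonal to w, the score $Xw_{osc}$ is Y-orthogonal without any regression step. The vectors $w_{osc}$ and $p_{osc}$ generally point in different directions, which is the reason behind the closing remark of the paragraph above.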
Conclusion
Structured noise (Y-orthogonal variation) in X causes problems
for projection-based methods such as PLS, Principal Component
Regression and other methods with similar properties. What
happens is that the Y-orthogonal variation is incorporated into
the first PLS score vector t. This adversely affects the
correlation between t and Y, and thus the interpretation.
As described earlier, OSC filters are useful for removing
strong structured Y-orthogonal variation in X. However, none
of the OSC methods mentioned are optimal for regression,
because it is not necessary to remove all Y-orthogonal
variation in X. For PLS, only the Y-orthogonal variation in X
that is included in the PLS score vector t needs to be removed.
Otherwise, little has been gained, except that an additional
Y-orthogonal component has been calculated. This will actually
increase the total number of components compared with the
unfiltered PLS model. It may also degrade the predictions
because of overfitting. Note that one OSC component can
represent the regression coefficient vector for a
multi-component regression model and should, therefore, be
considered as several components.
Here comes the promised cliffhanger...
So, how does one go about developing an OSC filter that only
removes the structured Y-orthogonal variation (if any) that
negatively affects the PLS model? Sorry, you'll have to wait
until next month's editorial. The answer will actually be an
integrated OSC + PLS method, the O-PLS method [X1, X2].
REFERENCES
38. Wold S, Antti H, Lindgren F, Öhman J. Orthogonal signal correction of near-infrared spectra. Chemometrics Intell. Lab. Syst., 1998; 44: 175-185.
40. Sjöblom J, Svensson O, Josefson M, Kullberg H, Wold S. An evaluation of orthogonal signal correction applied to calibration transfer of near infrared spectra. Chemometrics Intell. Lab. Syst., 1998; 44: 229-244.
41. Höskuldsson A. Variable and subset selection in PLS regression. Chemometrics Intell. Lab. Syst., 2001; 55: 23-38.
42. Westerhuis J A, de Jong S, Smilde A K. Direct orthogonal signal correction. Chemometrics Intell. Lab. Syst., 2001; 56: 13-25.
128. Svensson O, Kourti T, MacGregor J F. An investigation of orthogonal signal correction algorithms and their characteristics. J. Chemometr., 2002; 16: 176-188.
129. Goicoechea H C, Olivieri A C. A comparison of orthogonal signal correction and net analyte preprocessing methods. Theoretical and experimental study. Chemometrics Intell. Lab. Syst., 2001; 56: 73-81.
130. Rao C R. The use and interpretation of principal component analysis in applied research. Sankhya A, 1964; 26: 329-358.
X1. Trygg J, Wold S. Orthogonalized projections to latent structures, O-PLS. J. Chemometr., 2002; 16: 119-128.
X2. Trygg J. Parsimonious multivariate models. PhD thesis, Umeå University, 2001.
About this site
Issue: January 2006
Editor: Johan Trygg © 1996-2006