
Department of Analytical Chemistry and Organic Chemistry

Analytical Chemistry Area

LINEAR CALIBRATION AND COMPARISON OF ANALYTICAL METHODS BY MEANS OF REGRESSION TECHNIQUES THAT INCORPORATE ERRORS IN ALL VARIABLES

Thesis presented by

ÀNGEL MARTÍNEZ BARAMBIO

for the degree of

Doctor in Chemistry

Tarragona, 2001

After almost four and a half years of work, during which so many people have helped me, not only scientifically but also personally, I feel a great need to give thanks. I am aware of the responsibility involved in mentioning all of you, which explains my fear of leaving out even a single one of you. I think it better not to give names, because I am sure that not only those of you who have had something to do with this thesis, but also those who have at some time thought of me, will be able to read your own name in these lines.

Know that I will always carry you in my thoughts, because it is you who gave me the strength to carry on and finish this thesis. For that reason I want to tell you, as simply as sincerely,

... thank you for being there.

To Mª Eugènia, because despite the difficulties,

you have always given me your support and love.


CONTENTS

Chapter 1. Introduction.
1.1 Objective of the doctoral thesis
1.2 Structure of the doctoral thesis
1.3 Notation
1.4 Linear regression
1.4.1 Univariate linear regression techniques
1.4.1.1 Regression techniques that consider errors in one axis
1.4.1.2 Regression techniques that consider errors in both axes
1.4.2 Multivariate linear regression
1.4.2.1 Multiple linear regression
1.4.2.2 Multivariate least squares
1.5 Hypothesis tests on the regression coefficients
1.5.1 Conditions of application
1.5.2 Importance of lack of fit
1.5.3 Probabilities of type I and type II errors
1.6 Application of linear regression considering errors in all axes
1.6.1 Calibration of analytical methods
1.6.2 Comparison of analytical methods
1.6.3 Prediction
1.7 References

Chapter 2. Lack of fit of the experimental points to the regression line that considers errors in both axes.
2.1 Objective of the chapter
2.2 Possible approaches for detecting lack of fit
2.3 Lack of fit in linear regression considering errors in both axes (Chemometrics and Intelligent Laboratory Systems, 54 (2000) 61-73)
2.4 Conclusions
2.5 References

Chapter 3. Probability of type I and type II errors in the individual tests on the intercept and the slope in linear regression considering errors in both axes.
3.1 Objective of the chapter
3.2 Estimation of the probability of a type II error when applying individual tests on the regression coefficients
3.3 Relationship between the probabilities of type I and type II errors and the number of calibration samples
3.4 Detecting proportional and constant bias in method comparison studies by using linear regression with errors in both axes (Chemometrics and Intelligent Laboratory Systems, 49 (1999) 179-193)
3.5 Conclusions
3.6 References

Chapter 4. Detection of bias in analytical methods for the simultaneous determination of multiple analytes. Probability of committing a β error.
4.1 Objective of the chapter
4.2 Comparison of analytical methods
4.2.1 Simultaneous determination of several analytes
4.3 Validation of bias in multianalyte determination methods. Application to RP-HPLC derivatizing methodologies (Analytica Chimica Acta 406 (2000) 257-278)
4.4 Probability of a β error in the joint test
4.5 Evaluating bias in method comparison studies using linear regression with errors in both axes (Journal of Chemometrics, accepted)
4.6 Conclusions
4.7 References

Chapter 5. Comparison of multiple methods by maximum likelihood principal component analysis considering the errors in all axes.
5.1 Objective of the chapter
5.2 Maximum likelihood principal component analysis (MLPCA)
5.3 Validation of the approach for comparing multiple methods
5.4 Multiple analytical method comparison by using MLPCA and linear regression with errors in both axes (Analytica Chimica Acta, submitted)
5.5 Conclusions
5.6 References

Chapter 6. Prediction ability using multivariate linear regression considering errors in all axes in PCR and maximum likelihood PCR (MLPCR).
6.1 Objective of the chapter
6.2 Maximum likelihood multivariate calibration techniques
6.2.1 MLPCR
6.2.2 MLLRR
6.3 Prediction errors
6.4 Application of multivariate least squares regression method to PCR and maximum likelihood PCR techniques (Journal of Chemometrics, submitted)
6.5 Conclusions
6.6 References

Chapter 7. Conclusions.
7.1 General conclusions
7.2 Future lines of research

CHAPTER 1

Introduction

1.1 Objective of the doctoral thesis

This doctoral thesis examines in depth several aspects of linear regression considering errors in all axes, as used in chemistry both for comparing analytical methods and for calibration. These regression methods take into account, on the one hand, the uncertainties arising from the errors made when a set of samples is analysed by each of the analytical methods in a method comparison study and, on the other, the uncertainties of all the experimental values in calibration. The aspects considered are the following:

1. Critical review of the linear regression techniques used to estimate the regression coefficients.

2. Development and validation of a statistical test for detecting lack of fit of the experimental results to the regression line.

3. Development and validation of mathematical expressions for estimating the probabilities of committing type I and type II errors when applying individual tests on the regression coefficients.

4. Study of the detection, by means of linear regression, of significant bias in the results of analytical methods capable of determining several analytes at once.

5. Development and validation of a technique for comparing the results of multiple analytical methods that takes the uncertainties of the analytical results into account.

6. Study of the improvement in the prediction ability of multivariate calibration methods by means of a multivariate regression technique that considers the uncertainties in all the experimental values.

7. Development of computer algorithms to facilitate the practical application of the tests developed.

1.2 Structure of the doctoral thesis

This thesis is organised into seven chapters, each divided into sections and subsections. The first chapter reviews the approaches most commonly used to estimate the regression coefficients in both univariate and multivariate calibration. It also presents the bivariate least squares (BLS) regression method, which is the basis of the work developed in the following chapters. Finally, it sets out the conditions required for the correct application of statistical tests on the BLS regression coefficients, together with several applications of this regression technique.

Given the consequences that can follow from lack of fit of the experimental points to the regression line, the second chapter presents and discusses two possible statistical tests for detecting it under the regression conditions specific to the BLS method. The conclusions drawn from this chapter serve as the basis for discussing how the type I and type II errors in the individual tests on the regression coefficients relate to the number of points used to build the BLS regression line, which is the subject of the third chapter.

The relevance of lack of fit and its relationship with the estimation of the type II error is shown again in the fourth chapter, this time in the application of the joint confidence test on the BLS regression coefficients for the comparison of two analytical methods. The conclusions drawn from the practical examples of the joint confidence test in that chapter justify the need for a calculation procedure that estimates the probability of committing a type II error when applying the joint confidence test on the BLS regression coefficients; this procedure is also presented in the fourth chapter.

After devoting chapters 2, 3 and 4 to the development and practical application of a series of statistical tests applicable to the regression coefficients estimated by the BLS method, the fifth chapter moves into the multivariate field and presents a method for comparing the results obtained by more than two analytical methods, considering the uncertainties of all the individual results. This method is based, on the one hand, on a multivariate calibration method, maximum likelihood principal component analysis (MLPCA), and, on the other, on the application of the joint confidence test on the coefficients of the BLS regression line.

The sixth chapter studies a multivariate regression method that considers the uncertainties due to the errors in the measurements of the different samples, and its application in multivariate calibration techniques. This makes it possible to distinguish between the observed and the true prediction error, and to discuss the improvement observed in the true prediction ability. Finally, the general conclusions drawn from this thesis and possible lines of future research are presented in the seventh chapter.



1.3 Notation

The notation detailed below is the one followed in the parts of this thesis originally written in Catalan. The articles presented in each chapter have their own notation, defined in their Notation section, which in some cases differs slightly from the one defined here.

Matrices are written in bold upper case (e.g. R), vectors in bold lower case (e.g. y) and scalars in italics (e.g. xi).

Symbols beginning with a letter of the Latin alphabet

$b_0$  Estimated intercept of the regression line.
$b_{0,H_0}$  Theoretical value of the intercept postulated under the null hypothesis.
$b_{0,H_1}$  Theoretical value of the intercept postulated under the alternative hypothesis.
$b_1$  Estimated slope of the regression line.
$b_{1,H_0}$  Theoretical value of the slope postulated under the null hypothesis.
$b_{1,H_1}$  Theoretical value of the slope postulated under the alternative hypothesis.
$b_p$  Estimated slope p of the regression hyperplane.
$\mathbf{b}$  Vector of estimated regression coefficients.
$e_i$  Residual error at point i.
$F_{\alpha,\nu_1,\nu_2}$  Value of Fisher's F distribution for a significance level α (one-tailed) with ν1 and ν2 degrees of freedom.
$F_{\alpha/2,\nu_1,\nu_2}$  Value of Fisher's F distribution for a significance level α (two-tailed) with ν1 and ν2 degrees of freedom.
$L$  Likelihood function.
$n$  Number of experimental points.
$p_i$  Number of replicates made in the measurement of sample i.
$\mathbf{R}$  Matrix of spectroscopic measurements.
$s^2$  Estimate of the mean squared residual error. Experimental error.
$S$  Sum of squared residuals (weighted or unweighted).
$s_{b_0}$  Estimate of the standard deviation of the intercept of the regression line.
$s_{b_{0,H_0}}$  Standard deviation of the theoretical value of the intercept postulated under the null hypothesis.
$s_{b_{0,H_1}}$  Standard deviation of the theoretical value of the intercept postulated under the alternative hypothesis.
$s_{b_1}$  Estimate of the standard deviation of the slope of the regression line.
$s_{b_{1,H_0}}$  Standard deviation of the theoretical value of the slope postulated under the null hypothesis.
$s_{b_{1,H_1}}$  Standard deviation of the theoretical value of the slope postulated under the alternative hypothesis.
$s_{e_i}^2$  Estimate of the variance of the residual error at point i. Weighting factor.
$s_x^2$  Estimate of the variance of the experimental measurements of the predictor variable.
$s_{x_i}^2$  Estimate of the variance of the predictor variable at point i.
$s_{x_{ik}}^2$  Estimate of the variance of predictor variable k at point i.
$s_{xx}$  Sum of the squared distances between each measurement of the predictor variable and its mean value.
$s_{xy}$  Sum of the products of the distances between each measurement of the two variables (predictor and response) and their respective mean values.



$s_{y_i}^2$  Estimate of the variance of the response variable at point i.
$s_{yy}$  Sum of the squared distances between each measurement of the response variable and its mean value.
$\mathbf{T}$  Scores matrix.
$t_{\alpha,\nu}$  Value of Student's t distribution for a significance level α (one-tailed) and ν degrees of freedom.
$t_{\alpha/2,\nu}$  Value of Student's t distribution for a significance level α (two-tailed) and ν degrees of freedom.
$\mathbf{V}$  Loadings matrix (eigenvectors).
$\mathbf{X}$  Matrix with the measured values of the predictor variable(s).
$x$  Predictor variable.
$\bar{x}$  Mean value of the experimental measurements of the predictor variable.
$x_i$  Measured value of the predictor variable at point i.
$\hat{x}_i$  Predicted value of the predictor variable at point i.
$x_{ik}$  Measured value of predictor variable k at point i.
$\hat{x}_{ik}$  Predicted value of predictor variable k at point i.
$\bar{x}_p$  x-coordinate of the weighted centroid.
$y$  Response variable.
$\mathbf{y}$  Vector of experimental measurements of the response variable.
$\bar{y}$  Mean value of the experimental measurements of the response variable.
$y_i$  Measured value of the response variable at point i.
$y_{ij}$  Measured value of replicate j at point i.
$\hat{y}_i$  Predicted value of the response variable at point i.
$\bar{y}_p$  y-coordinate of the weighted centroid.


Symbols beginning with a letter of the Greek alphabet

$\alpha$  Significance level; probability of an error of the first kind, or type I error.
$\beta$  Probability of an error of the second kind, or type II error.
$\beta_0$  True value of the intercept of the regression line.
$\beta_1$  True value of the slope of the regression line.
$\chi^2_{1-\alpha,n-2}$  Value of the χ² distribution for a significance level α with n-2 degrees of freedom.
$\Delta$  Bias: maximum acceptable difference between an estimated value and a reference value.
$\delta_i$  Random error made in the measurement of the predictor variable at point i.
$\varepsilon_i$  True residual error at point i.
$\gamma_i$  Random error made in the measurement of the response variable at point i.
$\kappa_\xi$  Reliability factor.
$\lambda$  Ratio between the errors of the response and predictor variables in CVR.
$\boldsymbol{\Sigma}$  Diagonal matrix of variances in the space defined by the rows.
$\sigma^2$  True variance of the experimental measurements.
$\sigma_{\varepsilon_i}^2$  Variance of the residual error at point i.
$\sigma_x^2$  True variance of all the experimental measurements of the predictor variable.
$\sigma_\xi^2$  Variance of the true value of the predictor variable.
$\sigma_{x_i}^2$  True variance of the experimental measurements of the predictor variable at point i.
$\sigma_y^2$  True variance of all the experimental measurements of the response variable.
$\sigma_{y_i}^2$  True variance of the experimental measurements of the response variable at point i.
$\eta$  True response variable.
$\eta_i$  True value of the response variable at point i.
$\boldsymbol{\Psi}$  Diagonal matrix of variances in the space defined by the columns.
$\xi$  True predictor variable.
$\xi_i$  True value of the predictor variable at point i.

1.4 Linear regression

In general terms, linear regression comprises a set of statistical techniques used to identify the relationships between two variables (univariate linear regression) or more than two (multivariate linear regression) [1]. To give an idea of how old it is, the term regression was first introduced by the British anthropologist and meteorologist Sir Francis Galton (1822-1911) in 1885 [2]. Among the different types of univariate and multivariate linear regression methods, we will focus on the so-called unbiased methods, that is, those that do not introduce a bias in the estimated regression coefficients [1].

In analytical chemistry, univariate linear regression is used mainly in two situations. The best known is probably the calibration of analytical methods, in which an instrumental response is related to the known concentrations of the calibration standards, often on the basis of a theoretical law that justifies this relationship (the Lambert-Beer law, the Ilkovič equation, the Nernst equation, etc.). This allows the concentration of unknown samples to be predicted subsequently from an instrumental measurement. The second use of linear regression, although less widespread, is equally important, since it allows the results of a new analytical method to be compared with those of an established reference method [3].

Multivariate linear regression, in turn, relates an instrumental response to more than one predictor variable. If the values of the predictor variables are linearly dependent, that is, collinear [4,5], the regression coefficients cannot be estimated. To solve this problem, other multivariate calibration techniques were developed which, despite being biased [1], can establish the calibration model when the values of the predictor variables are highly collinear. These techniques include principal component regression (PCR) [6] and partial least squares (PLS) [4]. In these cases, multivariate linear regression plays a very important role, since it provides a mathematical model that makes it possible to predict certain properties of unknown samples.

1.4.1 Univariate linear regression techniques

This section describes the methods we consider most relevant among the many existing approaches for estimating the coefficients of the regression line. All of these methods assume that the true relationship [7-14] between the predictor variable ($\xi$) and the response variable ($\eta$) follows the equation of a straight line:

\[
\eta_i = \beta_0 + \beta_1 \xi_i \tag{1.1}
\]

where $\beta_0$ and $\beta_1$ are the regression coefficients of the true, but unknown, straight line. Both in linear calibration and in method comparison, the measurement of the predictor and response variables is affected to a greater or lesser degree by experimental errors. As a result, the measured experimental values of the predictor ($x$) and response ($y$) variables differ from the true values. The relationship between the true and the measured values can be expressed as:

\[
x_i = \xi_i + \delta_i \tag{1.2}
\]

\[
y_i = \eta_i + \gamma_i \tag{1.3}
\]

The random errors made in measuring $x_i$ and $y_i$ are represented by $\delta_i$ and $\gamma_i$, where $\delta_i \sim N(0, \sigma_{x_i}^2)$ and $\gamma_i \sim N(0, \sigma_{y_i}^2)$ [8]. Substituting $\xi_i = x_i - \delta_i$ and $\eta_i = y_i - \gamma_i$ (expressions 1.2 and 1.3) into equation 1.1 and isolating $y_i$ gives the following expression [7,15,16]:

\[
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \tag{1.4}
\]

This is the equation of the true regression line in terms of the measured values of the predictor and response variables. The term $\varepsilon_i$ is the true residual error of point i, with $\varepsilon_i \sim N(0, \sigma_{\varepsilon_i}^2)$ [17], and can be expressed as a function of $\gamma_i$, $\beta_1$ and $\delta_i$ [8]:

\[
\varepsilon_i = \gamma_i - \beta_1 \delta_i \tag{1.5}
\]

Figure 1.1 shows the true lines in terms of the theoretical (eq. 1.1) and measured (eq. 1.4) values of the predictor and response variables. The true residuals (eq. 1.5) are also shown for each value $y_i$.
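The error model of equations 1.1 to 1.5 can be illustrated with a short simulation. The following is only a sketch: the function name `simulate_measurements`, the true coefficients and the standard deviations are hypothetical values chosen for illustration, not data from this work.

```python
import numpy as np

def simulate_measurements(xi, beta0, beta1, sigma_x, sigma_y, rng):
    """Simulate measured (x, y) pairs from true predictor values xi."""
    xi = np.asarray(xi, dtype=float)
    eta = beta0 + beta1 * xi                      # eq. 1.1: true response
    delta = rng.normal(0.0, sigma_x, xi.shape)    # random error in x
    gamma = rng.normal(0.0, sigma_y, xi.shape)    # random error in y
    x = xi + delta                                # eq. 1.2: measured predictor
    y = eta + gamma                               # eq. 1.3: measured response
    eps = gamma - beta1 * delta                   # eq. 1.5: true residual
    return x, y, eps

# Hypothetical example: six points, as in figure 1.1
rng = np.random.default_rng(0)
xi = np.linspace(1.0, 6.0, 6)
x, y, eps = simulate_measurements(xi, beta0=0.5, beta1=2.0,
                                  sigma_x=0.1, sigma_y=0.2, rng=rng)

# Eq. 1.4 holds exactly for the simulated values
assert np.allclose(y, 0.5 + 2.0 * x + eps)
```

Passing arrays instead of scalars for `sigma_x` and `sigma_y` would give a different variance at each point, which corresponds to the heteroscedastic case.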


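[Figure 1.1: plot of the response ($\eta$) against the predictor ($\xi$) for six points, showing the true line $\eta_i = \beta_0 + \beta_1 \xi_i$, the fitted line $y_i = b_0 + b_1 x_i + e_i$, the measured points $(x_1, y_1)$ to $(x_6, y_6)$, the true values $\xi_1$ to $\xi_6$ and $\eta_1$ to $\eta_6$, the measurement errors $\delta_i$ and $\gamma_i$, and the residuals $e_i$ and $\varepsilon_i$.]

Figure 1.1. Representation of the different variables in univariate linear regression.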

The bivariate normal distribution [18] jointly followed by the errors $\delta_i$ and $\gamma_i$ made in measuring each of the variables $x_i$ and $y_i$ (eqs. 1.2 and 1.3) is represented in figure 1.1 and shows, for each calibration point, the probability density of observing a given experimental value. As can be seen, the most general case has been represented, in which the variances of $\delta_i$ and $\gamma_i$ differ from point to point (heteroscedasticity). This condition of heteroscedasticity is assumed only by some of the methods described in section 1.4.1.2. Nevertheless, the final aim of all linear regression methods is to find estimates of the true regression coefficients $\beta_0$ and $\beta_1$ such that the regression line given in expression 1.6 fits the n experimental points $(x_i, y_i)$ as well as possible according to a given criterion.

\[
y_i = b_0 + b_1 x_i + e_i \tag{1.6}
\]

The term $e_i$ is the observed residual error for point i $(x_i, y_i)$, with $e_i \sim N(0, s^2)$ [8], where $s^2$ is its variance, called the experimental error.



The regression methods described below fall into two broad groups, which make different assumptions about the existence and structure of the variances generated by the errors in the measurements of the samples. On the one hand, there are the regression methods that consider the variances in only one axis, described in their different variants in section 1.4.1.1. On the other, section 1.4.1.2 presents regression techniques that consider the variances in both axes, whether based on maximum likelihood estimation or on least squares estimation.

1.4.1.1 Regression techniques that consider errors in one axis

This section describes three of the most widely used methods in linear regression: ordinary least squares (OLS), weighted least squares (WLS) and generalized least squares (GLS). As the following sections show, some of the expressions for estimating the regression coefficients are given in matrix notation because of its simplicity. In this notation, the linear model of equation 1.6 can also be written as [1]:

\[
\underset{(n \times 1)}{\mathbf{y}} = \underset{(n \times 2)}{\mathbf{X}} \; \underset{(2 \times 1)}{\mathbf{b}} + \underset{(n \times 1)}{\mathbf{e}} \tag{1.7}
\]

where the vector $\mathbf{y}$, of dimensions n×1, contains the values of the response variable and the matrix $\mathbf{X}$, of dimensions n×2, consists of a first column of ones and a second column with the values of the predictor variable. The vector $\mathbf{b}$, of dimensions 2×1, contains the two regression coefficients and $\mathbf{e}$ is an n×1 vector with the residual errors of the response variable.

♦ Ordinary least squares (OLS)

Of all the linear regression methods developed, the best known and most widely used is ordinary least squares (OLS). Although its discovery is usually attributed to Carl Friedrich Gauss (1777-1855), who used it before 1803, the first published reference is by Adrien-Marie Legendre (1752-1833) in 1805. These facts gave rise at the time to a major controversy over who had first discovered this regression method [19,20].

The OLS method finds the coefficients of the regression line that best fits the experimental points $(x_i, y_i)$, according to a criterion that minimizes a function of the residual distances between the experimental values of the response variable $y_i$ and the predicted values $\hat{y}_i$, obtained from the regression line according to the expression:

\[
\hat{y}_i = b_0 + b_1 x_i \tag{1.8}
\]

Thus, the residual distance of point i that appears in equation 1.6 can also be expressed as:

\[
e_i = y_i - \hat{y}_i \tag{1.9}
\]

To ensure that the regression line obtained by OLS is the one that best fits the experimental points, the quantity minimized by this regression method is the sum of squared residuals:

\[
S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \tag{1.10}
\]

The estimates of the intercept and the slope are therefore found by taking the partial derivatives of equation 1.10 with respect to each coefficient and setting them to zero:

\[
\frac{\partial S}{\partial b_0} = 0 \tag{1.11}
\]

\[
\frac{\partial S}{\partial b_1} = 0 \tag{1.12}
\]

Equations 1.11 and 1.12 lead to the expressions for the least squares estimates of the regression coefficients:

\[
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \tag{1.13}
\]

\[
b_0 = \bar{y} - b_1 \bar{x} \tag{1.14}
\]

where $\bar{x}$ and $\bar{y}$ are the mean values of the predictor and response variables, respectively. The regression line obtained with these coefficients passes through the point $(\bar{x}, \bar{y})$, called the centroid.
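Equations 1.13 and 1.14 translate almost directly into code. The sketch below assumes NumPy; the function name `ols_line` and the four data points are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def ols_line(x, y):
    """Return (b0, b1) for the OLS line y = b0 + b1*x (eqs. 1.13 and 1.14)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # eq. 1.13: slope from the centred cross-products
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # eq. 1.14: intercept, so that the line passes through the centroid
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Hypothetical calibration data
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
b0, b1 = ols_line(x, y)

# The fitted line passes through the centroid (x_mean, y_mean)
assert np.isclose(b0 + b1 * np.mean(x), np.mean(y))
```

For these data the centred cross-products sum to 9.7 and the centred squares of x sum to 5, so the slope is 1.94 and the intercept is 6.0 - 1.94 · 2.5 = 1.15.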

A very important quantity in univariate linear regression, needed to estimate the variances of the regression coefficients, is the variance of the residual error ($s^2$), or experimental error. It gives an idea of the dispersion of the experimental points around the regression line and is given by:

\[
s^2 = \frac{\sum_{i=1}^{n} e_i^2}{n-2} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2} \tag{1.15}
\]

Using the matrix notation introduced in equation 1.7, the estimates of the intercept and the slope take the form:

\[
\mathbf{b} = (\mathbf{X}^{\mathrm{T}} \mathbf{X})^{-1} \mathbf{X}^{\mathrm{T}} \mathbf{y} \tag{1.16}
\]

where $\mathbf{X}^{\mathrm{T}}$ is the transpose of $\mathbf{X}$. The experimental error (eq. 1.15) can likewise be expressed as:

\[
s^2 = \frac{(\mathbf{y} - \mathbf{X}\mathbf{b})^{\mathrm{T}} (\mathbf{y} - \mathbf{X}\mathbf{b})}{n-2} \tag{1.17}
\]

If the linear model is correct, the experimental error is an estimate of the true variance of the experimental measurements, that is, of the true experimental error $\sigma^2$.
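The matrix expressions 1.16 and 1.17 can be sketched as follows. The function name `ols_matrix` and the data are hypothetical. As a design choice, the normal equations are solved with `np.linalg.solve` rather than by forming the inverse explicitly, which is numerically preferable but algebraically equivalent to eq. 1.16.

```python
import numpy as np

def ols_matrix(x, y):
    """OLS in matrix form: returns (b, s2) following eqs. 1.16 and 1.17."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # n x 2: column of ones, then x
    b = np.linalg.solve(X.T @ X, X.T @ y)       # eq. 1.16 via the normal equations
    r = y - X @ b                               # residual vector e
    s2 = (r @ r) / (len(y) - 2)                 # eq. 1.17: experimental error
    return b, s2

# Hypothetical calibration data
b, s2 = ols_matrix([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
print(b, s2)
```

With these four points the residuals are 0.01, -0.13, 0.23 and -0.11, so the sum of squares is 0.082 and the experimental error is 0.082 / 2 = 0.041.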

For this method to be applied correctly, so that the estimated regression coefficients are not biased, the experimental data must meet certain requirements that the method implicitly assumes [1,3,21,22]:

1. The true values of the predictor variable are not random but fixed (functional model). The error made in the experimental measurement of the response variable, expressed as a variance ($s_{y_i}^2$), must be much larger than the corresponding value for the predictor variable ($s_{x_i}^2$) multiplied by the square of the slope. For this reason OLS considers the errors made in measuring the predictor variable to be practically zero.

\[
s_{y_i}^2 \gg b_1^2 s_{x_i}^2 \;\Rightarrow\; s_{x_i}^2 \approx 0 \tag{1.18}
\]

2. The variances of the values of the response variable must be constant over the whole linear range (homoscedasticity), and the values must be mutually independent. This is equivalent to saying that the true residual errors of the different points (term $\varepsilon_i$ in equation 1.4) must not be correlated and that $\varepsilon_i \sim N(0, \sigma^2)$ for all i.
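Condition 1.18 can be checked point by point before choosing OLS. The sketch below is an assumption-laden illustration: the function name `x_error_negligible` is hypothetical, and so is the `factor` threshold that stands in for "much larger"; the text does not specify a numerical criterion.

```python
import numpy as np

def x_error_negligible(var_x, var_y, b1, factor=10.0):
    """Flag the points where s_y^2 >= factor * b1^2 * s_x^2 (cf. eq. 1.18).

    `factor` is a hypothetical threshold for "much larger than".
    """
    var_x = np.asarray(var_x, dtype=float)
    var_y = np.asarray(var_y, dtype=float)
    return var_y >= factor * b1 ** 2 * var_x

# Hypothetical variances for two calibration points and a slope of 2
flags = x_error_negligible(var_x=[0.0001, 0.01], var_y=[0.04, 0.04], b1=2.0)
print(flags.tolist())
```

If any point fails the check, the errors in the predictor variable are not negligible there and a method that considers errors in both axes may be more appropriate.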

If the conditions described above are met, the estimates of the regression coefficients obtained by the OLS method can be regarded as maximum likelihood estimates [1,21]. This means that the regression line (eq. 1.6) obtained with these coefficients is the one with the highest probability of giving predictions of the response variable ($\hat{y}_i$) closest to the true values ($\eta_i$). This can be shown by considering the joint density function of the true residual errors $\varepsilon_i$, also known as the likelihood function:

\[
L = \prod_{i=1}^{n} \frac{1}{\sigma (2\pi)^{1/2}} \, e^{-\varepsilon_i^2 / 2\sigma^2} = \frac{1}{\sigma^n (2\pi)^{n/2}} \, e^{-\frac{1}{2\sigma^2} \sum_{i=1}^{n} \varepsilon_i^2} \tag{1.19}
\]

Finding the estimates ($b_0$ and $b_1$) of the true coefficients that maximize the probability of obtaining an estimate of the residual error ($e_i$) equal to the true value ($\varepsilon_i$) is equivalent to maximizing the likelihood function L. To do so, the observable counterpart of the sum $\sum_{i=1}^{n} \varepsilon_i^2$ in the exponential term, that is, $\sum_{i=1}^{n} e_i^2$, must be minimized. This means minimizing the sum of squared residuals S (eq. 1.10), which is the same criterion followed by the OLS regression method.
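As a worked step, offered only as a sketch of the standard argument, taking the logarithm of the likelihood function of eq. 1.19 makes the equivalence explicit:

\[
\ln L = -n \ln \sigma - \frac{n}{2} \ln (2\pi) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \varepsilon_i^2
\]

For a fixed $\sigma$, only the last term depends on the regression coefficients. Replacing $\varepsilon_i$ by its observable counterpart $e_i = y_i - b_0 - b_1 x_i$ gives

\[
\max_{b_0, b_1} \ln L \;\Longleftrightarrow\; \min_{b_0, b_1} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 = \min_{b_0, b_1} S
\]

so the maximum likelihood estimates coincide with the least squares estimates whenever the residual errors are independent and normally distributed with constant variance.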

♦ Weighted least squares (WLS)

The homoscedasticity conditions assumed by the OLS method are frequently violated in univariate linear calibration. Some instrumental responses are often less reliable than others, so the variances of the errors associated with these experimental values are not equal (heteroscedasticity) [1]. Under these conditions the regression coefficients estimated by OLS may be biased, and the weighted least squares (WLS) regression method must be applied. This method still treats the predictor variable as error-free ($\delta_i \approx 0$), but the regression coefficients are now estimated by minimizing the sum of weighted squared distances, according to the equation:

\[
S = \sum_{i=1}^{n} \frac{e_i^2}{s_{e_i}^2} = \sum_{i=1}^{n} \frac{(y_i - b_0 - b_1 x_i)^2}{s_{e_i}^2} \tag{1.20}
\]

where $s_{e_i}^2$ is the weighting factor, which corresponds to the variance of the residual error $e_i$ (eq. 1.9) and, for the WLS regression method, can be expressed as:

\[
s_{e_i}^2 = \operatorname{var}(y_i - b_0 - b_1 x_i) = s_{y_i}^2 \tag{1.21}
\]

This regression method therefore gives more weight to those points where the error in the measurement of the response variable (expressed as a variance) is smaller, that is, to the most precise experimental values. As in the OLS method, the experimental error for the WLS method is expressed as:

\[
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} \frac{(y_i - b_0 - b_1 x_i)^2}{s_{e_i}^2} \tag{1.22}
\]

As in the OLS method, the estimates of the intercept and the slope are found by taking the partial derivatives of equation 1.20 and setting them to zero (eqs. 1.11 and 1.12), which gives the following expressions:

\[
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}_p)(y_i - \bar{y}_p) / s_{e_i}^2}{\sum_{i=1}^{n} (x_i - \bar{x}_p)^2 / s_{e_i}^2} \tag{1.23}
\]

\[
b_0 = \bar{y}_p - b_1 \bar{x}_p \tag{1.24}
\]

The quantities $\bar{x}_p$ and $\bar{y}_p$ are the weighted means of the predictor and response variables, respectively.

\[
\bar{x}_p = \frac{\sum_{i=1}^{n} x_i / s_{e_i}^2}{\sum_{i=1}^{n} 1 / s_{e_i}^2} \tag{1.25}
\]

\[
\bar{y}_p = \frac{\sum_{i=1}^{n} y_i / s_{e_i}^2}{\sum_{i=1}^{n} 1 / s_{e_i}^2} \tag{1.26}
\]


These two values define the position of the weighted centroid, the point through which the regression line found by the WLS method passes. Using matrix notation, the estimates of the regression coefficients and of the experimental error can be obtained from the following expressions:

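$$\mathbf{b} = (\mathbf{X}^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} \mathbf{X})^{-1} \mathbf{X}^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} \mathbf{y} \tag{1.27}$$

$$s^2 = \frac{(\mathbf{y} - \mathbf{X}\mathbf{b})^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{y} - \mathbf{X}\mathbf{b})}{n-2} \tag{1.28}$$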

where $\boldsymbol{\Sigma}$ is an $n \times n$ diagonal matrix whose $i$-th diagonal element is the variance of the corresponding value of the response variable ($s_{y_i}^2$). It should be noted that if the variances of the values of the response variable are constant, the estimates of the regression coefficients obtained from expressions 1.23, 1.24 and 1.27 are equal to those given by the ordinary least squares method (OLS, eqs. 1.13, 1.14 and 1.16).
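The WLS expressions can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the function name `wls_fit` and the calibration data are hypothetical, and the weights are taken as $1/s_{y_i}^2$ (eq. 1.21) with all variances strictly positive.

```python
def wls_fit(x, y, var_y):
    """Weighted least squares line (eqs. 1.22-1.26) with weights 1/var_y[i]."""
    n = len(x)
    w = [1.0 / v for v in var_y]                     # weights 1 / s_{y_i}^2 (eq. 1.21)
    sw = sum(w)
    x_p = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x (eq. 1.25)
    y_p = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y (eq. 1.26)
    num = sum(wi * (xi - x_p) * (yi - y_p) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - x_p) ** 2 for wi, xi in zip(w, x))
    b1 = num / den                                   # slope (eq. 1.23)
    b0 = y_p - b1 * x_p                              # intercept (eq. 1.24)
    # Experimental error: weighted residual sum of squares over n - 2 (eq. 1.22)
    s2 = sum(wi * (yi - b0 - b1 * xi) ** 2
             for wi, xi, yi in zip(w, x, y)) / (n - 2)
    return b0, b1, s2


# Hypothetical calibration data lying exactly on y = 1 + 2x,
# with response variances that grow with concentration
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
var_y = [0.01, 0.02, 0.05, 0.10, 0.20]
b0, b1, s2 = wls_fit(x, y, var_y)
```

When all entries of `var_y` are equal the weights cancel and the function returns the OLS line, in agreement with the remark above.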

♦ Generalized least squares (GLS)

Like WLS, this regression method is applied when the variances of the values of the response variable ($s_{y_i}^2$) are heteroscedastic. Unlike WLS, the GLS method [5,23] takes into account possible correlation between the experimental values of the response variable (covariance). The matrix expressions for estimating the regression coefficients and the experimental error are the same as those presented for WLS (eqs. 1.27 and 1.28). In this case the matrix $\boldsymbol{\Sigma}$ is no longer diagonal: its off-diagonal elements $s_{ik}$ ($i \neq k$, $1 \le i \le n$, $1 \le k \le n$) are the covariances between the values $y_i$ and $y_k$ of the response variable, $\mathrm{cov}(y_i, y_k)$.


When the covariances between the different values of the response variable are zero, the estimates of the regression coefficients and of the experimental error obtained by GLS are identical to those obtained with the WLS method (equations 1.27 and 1.28).

1.4.1.2 Regression techniques that consider errors in both axes

In analytical chemistry there are cases in which the assumption made by the regression methods described in section 1.4.1.1, namely that the experimental measurement of the predictor variable is free of error, cannot be justified. In some cases this is due to the steady improvement in the precision of the results obtained with chemical analysis instruments, such as atomic absorption or emission [24], which means that the variance generated by the random errors made in preparing the calibration standards can, in many cases, no longer be neglected with respect to the variance due to the errors in the instrumental measurements. Another example is the application of X-ray fluorescence to geological samples [25]. In this case, the complexity of the real samples means that the calibration standards are replaced by certified reference materials. The errors made in measuring the concentrations of these materials make the variances associated with these values comparable to those generated in the instrumental measurement. The same problem arises in analytical techniques related to radiocarbon dating [26,27], where the values of the calibration standards have variances due to the instability of these materials over time. Finally, another field of analytical chemistry in which the variances due to the errors made in measuring the predictor and response variables are similar is the comparison of the results of two analytical methods [3].

If the least squares method is applied in cases such as those described above, neglecting the errors in the predictor variable biases the estimates of the regression coefficients [22]. The bias is determined by a quantity called the reliability factor [7,10], which can be expressed as:

$$\kappa_\xi = \frac{\sigma_\xi^2}{s_x^2} \tag{1.29}$$

where $\sigma_\xi^2$ and $s_x^2$ are the variances of the true and observed values of the predictor variable, respectively. For this reason, a series of regression techniques have been developed that estimate the coefficients taking into account the errors made in measuring both the predictor and the response variables.
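As a brief worked illustration of why this factor matters (a standard large-sample result for the classical measurement error model, not taken from the thesis; it assumes that the measurement errors in $x$ are independent of the true values and of the errors in $y$, and writes $\sigma_x^2$ for their variance), the OLS slope is attenuated by the reliability factor:

$$s_x^2 \approx \sigma_\xi^2 + \sigma_x^2, \qquad b_{1,\mathrm{OLS}} \approx \kappa_\xi \, \beta_1 = \frac{\sigma_\xi^2}{\sigma_\xi^2 + \sigma_x^2} \, \beta_1$$

For instance, with $\sigma_\xi^2 = 4$ and $\sigma_x^2 = 1$ the factor is $\kappa_\xi = 0.8$, so the OLS slope would underestimate $\beta_1$ by roughly 20 %.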

Adcock [28,29] is considered to be the first person to deal seriously with the problem of linear regression when measurement errors affect both the predictor and the response variables. The regression method he developed is known today as orthogonal regression (OR), because he assumed that the variances of the errors in the two variables were equal (a variance ratio of one). Later, Kummel [30] generalized Adcock's result to the case in which the ratio of the error variances is known, thus developing what is now called the constant variance ratio approach (CVR). These two regression methods have been rediscovered many times in a wide variety of fields, including chemometrics [31-34]. For this reason orthogonal regression is known by several names, such as orthogonal distance regression (ODR) [3] or total least squares (TLS) [35].

Although a large number of regression methods that consider the variances of the errors in the measurement of the predictor and response variables were developed during the twentieth century, all of them try to solve the problem by minimizing the distances, either perpendicular or weighted, from the experimental points to the regression line. Below we present some of the methods of this kind most widely used in analytical chemistry, divided into two large groups: those that find the regression coefficients by a maximum likelihood criterion and those that do so by least squares.

♦ Maximum likelihood estimation

Maximum likelihood estimation of linear regression models that consider the variances of the errors made in measuring the predictor and response variables seeks the estimates of the regression coefficients ($b_0$ and $b_1$) that have the maximum probability of being equal to (that is, the maximum likelihood of) the true values ($\beta_0$ and $\beta_1$). In this way the predicted values of the response variable ($\hat{y}_i$) will be those with the maximum probability of being equal to the theoretical but unknown values ($\eta_i$).

Like the regression techniques described in section 1.4.1.1, the linear model with measurement errors assumes that the variables $\xi$ and $\eta$ are related by equation 1.1. However, these models assume that these two variables are not observable and that only the variables in equations 1.2 and 1.3, which are affected by random errors, can be measured. Three types of measurement error models can thus be distinguished [8]:

- The functional model, which considers the true values of the predictor variable $\xi_i$ to be constants.

- The structural model, which assumes that the $\xi_i$ are independent and identically distributed random variables.

- The ultrastructural model [8,36], in which the $\xi_i$ are random variables as in the structural model but are not identically distributed: they may have different means with a common variance.

Of these three types of linear model, we will focus on the functional model, since it is the one that best fits the experimental conditions under which linear regression is applied in chemical analysis. This is because, in both calibration and the comparison of analytical methods, the true values of the predictor variable $\xi_i$ are constants that correspond to the unknown analyte concentrations in each of the samples to be analyzed.

Within maximum likelihood estimation of the functional model there are different cases [8]:

a) When the variance ratio $\lambda = \sigma_y^2 / \sigma_x^2$ is known.

b) When the reliability factor $\kappa_\xi$ (eq. 1.29) is known.

c) When the variance of the error made in measuring the predictor variable ($\sigma_x^2$) is known.

d) When the variance of the error made in measuring the response variable ($\sigma_y^2$) is known.

e) When both variances of the errors made in measuring the predictor and response variables ($\sigma_x^2$ and $\sigma_y^2$) are known.

f) When the true value of the intercept ($\beta_0$) is known.

Of all these possibilities, we will focus on cases a) and e), since they correspond to two of the regression methods most widely used in analytical chemistry, known as orthogonal regression (OR) and constant variance ratio regression (CVR). Whereas the OR method only considers $\lambda = 1$, the CVR method is more general and considers a constant variance ratio $\lambda = \sigma_y^2 / \sigma_x^2$. To find the estimates of the regression coefficients by a maximum likelihood criterion for a functional model, we first need to define the likelihood function $L$, or joint density function of the true residual errors $\varepsilon_i$, which has to be maximized [8]:

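$$L \propto \lambda^{-n/2} \, \sigma_x^{-2n} \exp\left\{ -\frac{1}{2\sigma_x^2} \left[ \sum_{i=1}^{n} (x_i - \xi_i)^2 + \frac{1}{\lambda} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 \xi_i)^2 \right] \right\} \tag{1.30}$$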

The symbol '∝' (proportional to) appears in this expression because the normalization constant has been omitted. To estimate the regression coefficients, the function $L$ has to be maximized by finding its partial derivatives with respect to $\beta_0$, $\beta_1$, $\sigma_x^2$ and $\xi_1, \ldots, \xi_n$ and setting them equal to zero [8]. Considering $\lambda = 1$ (orthogonal regression) and taking into account that $\delta_i = x_i - \xi_i$ (eq. 1.2) and $\gamma_i = y_i - \beta_0 - \beta_1 \xi_i$ (eq. 1.4), this is equivalent to minimizing the sum in the exponent of equation 1.30, which can also be written as $\sum_{i=1}^{n} (\delta_i^2 + \gamma_i^2)$. By Pythagoras' theorem, each term is the squared distance between the experimental point $(x_i, y_i)$ and the point $(\xi_i, \eta_i)$ on the line, so at the minimum this sum is the sum of the squared orthogonal distances from the experimental points to the orthogonal regression line. Figure 1.2 represents this situation for the experimental point $(x_i, y_i)$ [8].


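Figure 1.2. Distance to be minimized in orthogonal regression. [Plot of $\eta$ against $\xi$ showing the true line $\eta = \beta_0 + \beta_1 \xi$, the point $(\xi_i, \eta_i)$ on the line, the experimental point $(x_i, y_i)$ and the errors $\delta_i$ and $\gamma_i$.]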

Once the expressions that maximize $L$ have been found from the partial derivatives mentioned above, they have to be solved for the regression coefficients $b_0$ and $b_1$. The resulting expressions are [33]:

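$$b_1 = \frac{s_{yy} - \lambda s_{xx} + \sqrt{(s_{yy} - \lambda s_{xx})^2 + 4 \lambda s_{xy}^2}}{2 s_{xy}} \tag{1.31}$$

$$b_0 = \bar{y} - b_1 \bar{x} \tag{1.32}$$

where

$$s_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{1.33}$$

$$s_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{1.34}$$

$$s_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \tag{1.35}$$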
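The closed-form CVR solution is straightforward to evaluate. The following sketch is only an illustration under stated assumptions: the function name `cvr_fit` and the data are hypothetical, `lam` stands for the variance ratio $\lambda = \sigma_y^2 / \sigma_x^2$, and `lam=1.0` gives orthogonal regression.

```python
import math


def cvr_fit(x, y, lam=1.0):
    """Constant variance ratio line (eqs. 1.31-1.35); lam = 1 gives OR."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                         # eq. 1.33
    syy = sum((yi - y_bar) ** 2 for yi in y)                         # eq. 1.34
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))   # eq. 1.35
    if sxy == 0.0:
        raise ValueError("s_xy is zero: the slope of eq. 1.31 is undefined")
    d = syy - lam * sxx
    b1 = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)  # eq. 1.31
    b0 = y_bar - b1 * x_bar                                          # eq. 1.32
    return b0, b1


# Hypothetical method-comparison data
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.2, 2.9]
b0_or, b1_or = cvr_fit(x, y)             # orthogonal regression (lam = 1)
b0_cvr, b1_cvr = cvr_fit(x, y, lam=4.0)  # errors in y assumed four times as variable
```

With `lam=1.0`, exchanging the roles of `x` and `y` returns the reciprocal slope, so both fits describe the same line; this is a convenient sanity check for an implementation.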

The maximum likelihood solutions for the regression coefficients of the functional model in the two cases considered (a and e) coincide with those obtained for the structural model. However, the estimate of the experimental error $s^2$ given by the maximum likelihood principle for the functional model is not correct, although this problem was solved by Lindley [37]. In the remaining cases (b, c, d and f), maximum likelihood estimation under a functional model is not possible. It can therefore be concluded that, in general, the existence and consistency of the maximum likelihood estimates of the regression parameters of the functional model are not guaranteed. This is because, under a functional model, the number of parameters ($\xi_i$) increases with the number of calibration samples. Since these regression techniques require the predictor variable to be modeled by a likelihood function [38], the existence of consistent maximum likelihood estimates cannot be guaranteed in general for large calibration sets [8].

Moreover, in many cases the experimental data are highly heteroscedastic and the variances of the measurement errors can only be estimated through replicate analysis of the calibration samples. In these cases the value of $\lambda$ is unknown and, therefore, maximum likelihood estimation is not possible for functional models. A univariate linear regression method is therefore needed that can estimate the regression coefficients even if it does not rely on the maximum likelihood principle. As Lindley showed [37], some methods based on the least squares principle give estimates identical to those of the maximum likelihood methods when a constant variance ratio $\lambda$ is assumed. For this reason we decided to use the BLS regression method, an iterative least squares regression method that can be applied to any set of experimental data without making assumptions about the distribution of the true values of the predictor variable $\xi$.

♦ Least squares estimation

A wide variety of univariate linear regression methods based on the least squares principle can estimate the regression coefficients while considering the heteroscedastic variances of the errors made in measuring the individual values of the predictor and response variables [39-59]. Of all these methods, the bivariate least squares (BLS) method developed by Lisý and co-workers [60] was found to be the most suitable because its algorithm is simple to program, it estimates the regression parameters quickly and the variance-covariance matrix is easy to obtain [61].

Like the methods based on the maximum likelihood principle, this method considers that there is a true linear relationship between the variables $\xi$ and $\eta$ (eq. 1.1) and that both variables are measured with error, as expressed in equations 1.2 and 1.3. Unlike other regression techniques, and taking these assumptions into account, the BLS method uses the variance associated with the observed residual error $e_i$ (eq. 1.6) to estimate the regression coefficients. This quantity is called the weighting factor ($s_{e_i}^2$) and takes into account the variances of the experimental errors ($s_{x_i}^2$ and $s_{y_i}^2$) obtained from repeated measurements of the variables $x_i$ and $y_i$ in each of the samples. The correlation (covariance) between the values of $x_i$ and $y_i$ is also taken into account, although it is normally assumed to be zero:

$$s_{e_i}^2 = \mathrm{var}(y_i - b_0 - b_1 x_i) = s_{y_i}^2 + b_1^2 s_{x_i}^2 - 2 b_1 \mathrm{cov}(x_i, y_i) \tag{1.36}$$

The BLS regression method estimates the regression coefficients by minimizing the sum of squared weighted residuals $S$ according to the expression:

$$S = \sum_{i=1}^{n} \left[ \frac{(x_i - \hat{x}_i)^2}{s_{x_i}^2} + \frac{(y_i - \hat{y}_i)^2}{s_{y_i}^2} \right] = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = s^2 (n-2) \tag{1.37}$$

where $s^2$ is the estimate of the experimental error. According to this equation, the BLS method assigns more weight to the data pairs with smaller values of $s_{x_i}^2$ and $s_{y_i}^2$, that is, to the most precise experimental values, for which the experimental measurement error should be smaller. Minimizing the sum of squared weighted residuals $S$ gives two non-linear equations, which in matrix notation can be expressed as:

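$$\mathbf{D} \mathbf{b} = \mathbf{g} \tag{1.38}$$

$$\begin{bmatrix} \sum_{i=1}^{n} \dfrac{1}{s_{e_i}^2} & \sum_{i=1}^{n} \dfrac{x_i}{s_{e_i}^2} \\ \sum_{i=1}^{n} \dfrac{x_i}{s_{e_i}^2} & \sum_{i=1}^{n} \dfrac{x_i^2}{s_{e_i}^2} \end{bmatrix} \times \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} \left( \dfrac{y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_0} \right) \\ \sum_{i=1}^{n} \left( \dfrac{x_i y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_1} \right) \end{bmatrix} \tag{1.39}$$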

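The estimates of the regression coefficients in the vector $\mathbf{b}$ (eq. 1.38) are calculated with an iterative procedure, following the expression [61]:

$$\mathbf{b} = \mathbf{D}^{-1} \mathbf{g} \tag{1.40}$$

Because $\mathbf{D}$ and $\mathbf{g}$ depend on $\mathbf{b}$ through the weighting factors, they are recalculated with the current estimates at each iteration until the coefficients converge.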

With this method, the variance-covariance matrix of the regression coefficients is estimated by multiplying the matrix $\mathbf{D}^{-1}$ resulting from the iterative process by the estimate of the experimental error $s^2$ (eq. 1.37). When the variances of the experimental errors in the measurement of the predictor variable are zero, equation 1.36 reduces to equation 1.21 and, therefore, the estimates of the regression coefficients given by BLS are the same as those obtained by WLS. Furthermore, when the variances of the errors in the response variable are constant for all points and those in the predictor variable are all zero, the weighting factor $s_{e_i}^2$ (eq. 1.36) is constant and the estimates of the regression coefficients are equal to those obtained by OLS.
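The iterative BLS procedure can be sketched as a fixed-point iteration on eqs. 1.36 to 1.40. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bls_fit`, the OLS starting values, the convergence tolerance and the data are hypothetical, and all weighting factors are assumed to stay strictly positive.

```python
def bls_fit(x, y, var_x, var_y, cov_xy=None, tol=1e-12, max_iter=200):
    """Bivariate least squares line by iterating b = D^-1 g (eq. 1.40)."""
    n = len(x)
    cov_xy = cov_xy if cov_xy is not None else [0.0] * n

    def system(b0, b1):
        # Weighting factor (eq. 1.36) and its derivative with respect to b1
        w = [vy + b1 * b1 * vx - 2.0 * b1 * c
             for vx, vy, c in zip(var_x, var_y, cov_xy)]
        dw = [2.0 * b1 * vx - 2.0 * c for vx, c in zip(var_x, cov_xy)]
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        # Elements of the symmetric matrix D (eq. 1.39)
        d11 = sum(1.0 / wi for wi in w)
        d12 = sum(xi / wi for xi, wi in zip(x, w))
        d22 = sum(xi * xi / wi for xi, wi in zip(x, w))
        # Vector g (eq. 1.39); the derivative of the weight w.r.t. b0 is zero
        g1 = sum(yi / wi for yi, wi in zip(y, w))
        g2 = sum(xi * yi / wi + 0.5 * (ei / wi) ** 2 * dwi
                 for xi, yi, wi, ei, dwi in zip(x, y, w, e, dw))
        s_sum = sum(ei * ei / wi for ei, wi in zip(e, w))   # S of eq. 1.37
        return d11, d12, d22, g1, g2, s_sum

    # OLS estimates as starting values (an assumption of this sketch)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar

    for _ in range(max_iter):
        d11, d12, d22, g1, g2, _s = system(b0, b1)
        det = d11 * d22 - d12 * d12
        new_b0 = (d22 * g1 - d12 * g2) / det     # first row of D^-1 g
        new_b1 = (d11 * g2 - d12 * g1) / det     # second row of D^-1 g
        done = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if done:
            break

    # Experimental error and variance-covariance matrix s^2 * D^-1
    d11, d12, d22, _g1, _g2, s_sum = system(b0, b1)
    det = d11 * d22 - d12 * d12
    s2 = s_sum / (n - 2)
    cov_b = [[s2 * d22 / det, -s2 * d12 / det],
             [-s2 * d12 / det, s2 * d11 / det]]
    return b0, b1, s2, cov_b


# Hypothetical data with individual variances in both axes
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
var_x = [0.01, 0.01, 0.02, 0.02, 0.03]
var_y = [0.02, 0.02, 0.03, 0.05, 0.05]
b0, b1, s2, cov_b = bls_fit(x, y, var_x, var_y)
```

Setting every entry of `var_x` to zero makes the weights independent of the slope, so the sketch reduces to WLS, which mirrors the limiting case described above.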

In this way the BLS regression method minimizes the distances ($S_i$) between the experimental points and the line shown in the following figure.

Figure 1.3. Distances minimized by the BLS regression method. [Plot of $y$ against $x$ showing four experimental points, each with its standard deviations $s_{x_i}$ and $s_{y_i}$, its residuals $(x_i - \hat{x}_i)$ and $(y_i - \hat{y}_i)$, and the resulting weighted distance $S_i$ to the regression line.]

For each individual point, $S_i = \dfrac{(x_i - \hat{x}_i)^2}{s_{x_i}^2} + \dfrac{(y_i - \hat{y}_i)^2}{s_{y_i}^2}$. The bold vertical and horizontal lines centered on the experimental points in figure 1.3 correspond to twice the standard deviations of the errors made in measuring the experimental values, $s_{x_i}$ and $s_{y_i}$. It is important to note that the regression line estimated by the BLS method is the same when the axes are interchanged.

An important point to bear in mind with the BLS regression method concerns the estimation of the variances of the experimental errors made in measuring the samples of different concentrations ($s_{x_i}^2$ and $s_{y_i}^2$). To obtain the best possible estimates of the regression coefficients when no prior estimates of these variances are available, a sufficient number of replicates must be made for each of the samples. Even so, the estimates of the variances of the experimental errors may include sources of variation that have nothing to do with the random errors made in analyzing the samples [62]. This can be the case for replicates with different means, lack of homogeneity in the samples (as with geological samples) or interferences that affect each of the methods being compared differently. Under these circumstances, the estimates of the regression coefficients obtained with BLS, and with the other regression methods that consider the errors made in the experimental measurements, can be biased. The bias arises because these regression methods assume that the variability in the replicate analyses of the samples is due solely to random errors [62]. In this way, a source of variability is ignored that is present when the true values of the predictor and response variables ($\xi_i$ and $\eta_i$) do not follow a linear relationship (there is an error in the equation [7,8]) and, therefore, the pairs of values ($\xi_i$, $\eta_i$) do not fit a straight line perfectly. Nevertheless, linear models with an error in the equation are seldom used in analytical chemistry, because instrumental responses usually obey a theoretical law (the Lambert-Beer law, the Nernst law, etc.). Moreover, in the comparison of analytical methods, since the two methods measure the same samples, linear models with an error in the equation are not statistically justifiable [63]. Even so, users of regression methods that consider estimates of the variances of the experimental errors must keep the different sources of error that can affect the experimental measurements very much in mind.

1.4.2 Multivariate linear regression

In multivariate linear regression the response variable ($\eta$) is related to several predictor variables ($\xi_j$, $j = 1, \ldots, p$) according to the expression:

$$\eta_i = \beta_0 + \beta_1 \xi_{1i} + \beta_2 \xi_{2i} + \ldots + \beta_p \xi_{pi} \tag{1.41}$$


As in univariate linear regression, the true response and predictor variables cannot be obtained experimentally because of the random errors in the measurement. Thus, the observed values of the response variable are given by equation 1.3, while those of the predictor variables can be expressed as:

$$\begin{aligned} x_{1i} &= \xi_{1i} + \delta_{1i} \\ x_{2i} &= \xi_{2i} + \delta_{2i} \\ &\;\;\vdots \\ x_{pi} &= \xi_{pi} + \delta_{pi} \end{aligned} \tag{1.42}$$

The errors made in measuring the predictor variables are distributed in the same way as in univariate regression. Substituting expressions 1.3 and 1.42 into equation 1.41 gives the multivariate regression equation in terms of the measured values of the predictor and response variables [10]:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_p x_{pi} + \varepsilon_i \tag{1.43}$$

This is the equation of the true regression hyperplane expressed with the measured values of the predictor and response variables. The term $\varepsilon_i$ is the true residual error of point $i$, with $\varepsilon_i \sim \mathrm{N}(0, \sigma_{\varepsilon_i}^2)$ [17], and can be expressed as a function of the variables $\gamma_i$, $\beta_1, \ldots, \beta_p$ and $\delta_{1i}, \ldots, \delta_{pi}$:

$$\varepsilon_i = \gamma_i - \beta_1 \delta_{1i} - \beta_2 \delta_{2i} - \ldots - \beta_p \delta_{pi} \tag{1.44}$$

As in univariate linear regression, the aim of multivariate linear regression methods is to find estimates of the true regression coefficients that make the regression hyperplane in the $(p+1)$-dimensional space (eq. 1.45) fit the $n$ experimental points as well as possible according to a given criterion:

$$y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \ldots + b_p x_{pi} + e_i \tag{1.45}$$

where the term $e_i$ is the observed residual error for point $i$, $(x_{1i}, x_{2i}, \ldots, x_{pi}, y_i)$. From the coefficients of the regression hyperplane, the predicted values of the response variable ($\hat{y}_i$) can be calculated according to the expression:

$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \ldots + b_p x_{pi} \tag{1.46}$$

Given the scarce literature on multivariate linear regression, we considered it appropriate to describe the two methods used in chapter 7. These two multivariate regression methods are analogous to the OLS and BLS methods in univariate regression.

1.4.2.1 Multiple linear regression

Multiple linear regression (MLR) is the method most widely used in multivariate regression to find the coefficients of the regression hyperplane. Analogously to the OLS method in univariate linear regression, the regression coefficients estimated by MLR are those that minimize the sum of squared residuals, according to the expression:

$$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_{1i} - \ldots - b_p x_{pi})^2 \tag{1.47}$$

This multivariate regression method relies on the same assumptions as those described for the OLS method. Under these assumptions the estimated coefficients can be considered to be as close as possible to the unknown theoretical values, and MLR can therefore be regarded as a maximum likelihood method. The following figure shows the residual distances between the experimental points and the regression plane, which MLR minimizes, in a three-dimensional space:

Figure 1.4. Multivariate regression by the MLR method. [Three-dimensional plot with axes $x_1$, $x_2$ and $y$ showing the regression plane $y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + e_i$ and, for point 5, the observed value $y_5$, the predicted value $\hat{y}_5$ and the residual $e_5$.]

Because of the larger number of variables and, therefore, of regression coefficients to be estimated, matrix notation is widely used in multivariate linear regression, as it simplifies the mathematical expressions. In matrix form, equation 1.45 is identical to the expression already seen for the OLS regression method (eq. 1.7), except that in this case the dimensions of the matrix $\mathbf{X}$ and of the regression vector $\mathbf{b}$ are $n \times (p+1)$ and $(p+1) \times 1$, respectively. Likewise, the mathematical expression for estimating the coefficients of the regression hyperplane is the same as that presented for the OLS method (eq. 1.16). It should be noted that, for the least squares estimate of the vector $\mathbf{b}$ to be unique, the columns of the matrix $\mathbf{X}$ must be linearly independent [4].
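The role of linearly independent columns can be illustrated with a small sketch that solves the least squares normal equations $(\mathbf{X}^{\mathrm{T}}\mathbf{X})\mathbf{b} = \mathbf{X}^{\mathrm{T}}\mathbf{y}$ by Gauss-Jordan elimination. The function name `mlr_fit`, the data and the pivot threshold are hypothetical choices made for illustration; in practice a numerical library routine would normally be preferred.

```python
def mlr_fit(rows, y):
    """Least squares coefficients [b0, b1, ..., bp] from the normal equations.

    rows: one list [x_1i, ..., x_pi] of predictor values per sample.
    """
    X = [[1.0] + list(r) for r in rows]      # column of ones for the intercept
    m = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[j] * r[k] for r in X) for k in range(m)]
         + [sum(r[j] * yi for r, yi in zip(X, y))] for j in range(m)]
    for col in range(m):
        # Partial pivoting: pick the row with the largest entry in this column
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[piv][col]) < 1e-12:         # illustrative, scale-dependent threshold
            raise ValueError("X^T X is singular: collinear columns in X")
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[j][m] / A[j][j] for j in range(m)]


# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 without noise
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, -2, 0, 2]
coefficients = mlr_fit(rows, y)
```

If one predictor column is an exact multiple of another, for example `[[1, 2], [2, 4], [3, 6]]`, the elimination meets a zero pivot and the sketch raises `ValueError`, since no unique solution exists.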


One of the most important drawbacks of MLR arises when the predictor variables in the matrix $\mathbf{X}$ are not independent of one another. This phenomenon is known as collinearity; it is common in most spectroscopic data (near infrared, ultraviolet-visible, etc.) and means that the least squares estimate of the vector $\mathbf{b}$ is not unique, since the matrix $\mathbf{X}^{\mathrm{T}}\mathbf{X}$ is not invertible. This problem has been solved with multivariate calibration methods based on decomposing the matrix $\mathbf{X}$ into principal components or latent variables. Two examples of such techniques are principal component regression (PCR) [6] and partial least squares (PLS) [4]. In these two multivariate calibration methods, MLR plays a fundamental role, since it is used to establish the multivariate regression model during the calibration step, which then allows certain properties to be predicted in unknown samples.

1.4.2.2 Multivariate least squares

The multivariate least squares (MLS) method is a multivariate regression method analogous to the BLS method in univariate linear regression. The MLS method is also based on the work of Lisý and co-workers [60] and estimates the coefficients of the regression hyperplane taking into account the variances due to the errors made in measuring the predictor variables ($s_{x_{1i}}^2, s_{x_{2i}}^2, \ldots, s_{x_{pi}}^2$) and the response variable ($s_{y_i}^2$). As in the univariate BLS method, the estimated regression coefficients are those that minimize the sum of the squared weighted residual distances, shown for a three-dimensional space in figure 1.5, according to the expression:


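$$S = \sum_{i=1}^{n} \left[ \frac{(y_i - \hat{y}_i)^2}{s_{y_i}^2} + \frac{(x_{1i} - \hat{x}_{1i})^2}{s_{x_{1i}}^2} + \frac{(x_{2i} - \hat{x}_{2i})^2}{s_{x_{2i}}^2} \right] = \sum_{i=1}^{n} \frac{1}{s_{e_i}^2} (y_i - \hat{y}_i)^2 \tag{1.48}$$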

where $s_{e_i}^2$ is the weighting factor, which corresponds to the variance of the residual $e_i$ (eq. 1.45) of point $i$, $(x_{1i}, x_{2i}, \ldots, x_{pi}, y_i)$, and $\hat{y}_i$ is obtained from equation 1.46. Like the weighting factor for BLS, it takes into account the variances of both the predictor variables and the response variable, as well as the covariances between the experimental values, which are usually assumed to be zero:

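$$s_{e_i}^2 = s_{y_i}^2 + \sum_{k=1}^{p} b_k^2 s_{x_{ki}}^2 - 2 \sum_{k=1}^{p} b_k \, \mathrm{cov}(x_{ki}, y_i) + 2 \sum_{k=1}^{p} \sum_{l=k+1}^{p} b_k b_l \, \mathrm{cov}(x_{ki}, x_{li}) \tag{1.49}$$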

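Figure 1.5. Residual distance minimized by the MLS method. [Three-dimensional plot with axes $x_1$, $x_2$ and $y$ showing the experimental point $(x_{1i}, x_{2i}, y_i)$, its residuals $(x_{1i} - \hat{x}_{1i})$, $(x_{2i} - \hat{x}_{2i})$ and $(y_i - \hat{y}_i)$, and the weighted distance $S_i$ to the regression plane for $s_{x_1} = 0.5$, $s_{x_2} = 2$ and $s_y = 1$.]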
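The weighting factor of eq. 1.49 for a single point can be evaluated with a few lines of code. This is a hedged sketch: the function name `mls_weight` and its argument layout are hypothetical, `b` holds only the slopes $b_1, \ldots, b_p$, and omitted covariances are taken as zero, as the text says is usual.

```python
def mls_weight(b, var_x, var_y, cov_xy=None, cov_xx=None):
    """Weighting factor s_e^2 of one point (eq. 1.49).

    b: slopes [b_1, ..., b_p]; var_x: predictor variances of this point;
    var_y: response variance; cov_xy[k] = cov(x_k, y); cov_xx[k][l] = cov(x_k, x_l).
    """
    p = len(b)
    w = var_y + sum(bk * bk * vk for bk, vk in zip(b, var_x))
    if cov_xy is not None:
        w -= 2.0 * sum(bk * ck for bk, ck in zip(b, cov_xy))
    if cov_xx is not None:
        w += 2.0 * sum(b[k] * b[l] * cov_xx[k][l]
                       for k in range(p) for l in range(k + 1, p))
    return w


# Hypothetical slopes combined with the standard deviations quoted for
# figure 1.5 (s_x1 = 0.5, s_x2 = 2, s_y = 1), entered here as variances
weight = mls_weight([2.0, -1.0], [0.25, 4.0], 1.0)
```

With a single predictor the expression reduces to the BLS weighting factor of eq. 1.36.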


The estimate of the experimental error is obtained by dividing the sum of weighted residual distances $S$ (eq. 1.48) by the appropriate number of degrees of freedom:

$$s^2 = \frac{S}{n - p} \tag{1.50}$$

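Minimizing the sum of squared weighted residuals $S$ with respect to the regression coefficients ($b_k$, $k = 1, \ldots, p$) generates $p$ non-linear equations, which in matrix notation can be expressed as:

$$\mathbf{D} \mathbf{b} = \mathbf{g} \tag{1.51}$$

$$\begin{bmatrix} \sum \dfrac{1}{s_{e_i}^2} & \sum \dfrac{x_{2i}}{s_{e_i}^2} & \sum \dfrac{x_{3i}}{s_{e_i}^2} & \cdots & \sum \dfrac{x_{pi}}{s_{e_i}^2} \\ \sum \dfrac{x_{2i}}{s_{e_i}^2} & \sum \dfrac{x_{2i}^2}{s_{e_i}^2} & \sum \dfrac{x_{2i} x_{3i}}{s_{e_i}^2} & \cdots & \sum \dfrac{x_{2i} x_{pi}}{s_{e_i}^2} \\ \sum \dfrac{x_{3i}}{s_{e_i}^2} & \sum \dfrac{x_{3i} x_{2i}}{s_{e_i}^2} & \sum \dfrac{x_{3i}^2}{s_{e_i}^2} & \cdots & \sum \dfrac{x_{3i} x_{pi}}{s_{e_i}^2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sum \dfrac{x_{pi}}{s_{e_i}^2} & \sum \dfrac{x_{pi} x_{2i}}{s_{e_i}^2} & \sum \dfrac{x_{pi} x_{3i}}{s_{e_i}^2} & \cdots & \sum \dfrac{x_{pi}^2}{s_{e_i}^2} \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_p \end{bmatrix} = \begin{bmatrix} \sum \left( \dfrac{y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_1} \right) \\ \sum \left( \dfrac{x_{2i} y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_2} \right) \\ \sum \left( \dfrac{x_{3i} y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_3} \right) \\ \vdots \\ \sum \left( \dfrac{x_{pi} y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial b_p} \right) \end{bmatrix} \tag{1.52}$$

where all sums run from $i = 1$ to $n$. In equations 1.51 and 1.52 the coefficients are numbered $b_1, \ldots, b_p$ and $b_1$ multiplies no predictor, so it plays the role of the intercept (written $b_0$ in eq. 1.45); $p$ here therefore counts all the estimated coefficients, in agreement with the $n - p$ degrees of freedom of equation 1.50.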

The vector $\mathbf{b}$ containing the regression coefficients can be estimated through an iterative procedure following the expression:

$$\mathbf{b} = \mathbf{D}^{-1} \mathbf{g} \tag{1.53}$$

With this method, the variance-covariance matrix of the regression coefficients can be obtained by multiplying the final matrix $\mathbf{D}^{-1}$ by the estimate of the experimental error $s^2$ (eq. 1.50). When the variances of the errors in the response variable are constant for all points and those of all the predictor variables are zero, the weighting factor $s_{e_i}^2$ (eq. 1.49) is constant and the estimates of the regression coefficients are equal to those obtained by MLR. It should be noted that, as with the BLS method, the regression hyperplane estimated by MLS is the same when the axes are interchanged.

As with the BLS regression method, good estimates of the variances of the random errors made in the experimental measurements are also needed for the coefficients of the regression hyperplane estimated by MLS to be correct. For this reason, the comments at the end of section 1.4.1.2 on estimating the variances of the experimental errors also apply to the MLS regression method.

1.5 Hypothesis tests on the regression coefficients

By applying hypothesis tests to the regression coefficients it is possible to detect, for a significance level $\alpha$, significant systematic errors in the values of the regression coefficients with respect to established reference values [3]. To do this, two hypotheses are first postulated:

1. The null hypothesis ($H_0$) assumes that the estimated regression coefficients belong to a distribution centered on a reference value, that is, that there are no significant differences between the estimated regression coefficients and the reference values for a significance level $\alpha$.

2. The alternative hypothesis ($H_1$) assumes that the regression coefficients belong to a distribution centered on a biased value and, therefore, that there are significant differences, for a significance level $\alpha$, between the estimated regression coefficients and the reference values postulated by $H_0$.

If the alternative hypothesis is accepted, the analytical procedure will have to be reviewed to identify the source of error affecting the measurement of the experimental values. Although these statistical tests are widely applied in linear calibration, it is important to bear in mind that a series of conditions must be met for the conclusions drawn about the hypotheses to be statistically meaningful.

1.5.1 Conditions of application

Chemical measurement procedures, unlike physical measurements, usually consist of several steps that are normally independent. Therefore, most results obtained by chemical analysis incorporate errors arising from the different steps of the analytical procedure. According to the central limit theorem [7], regardless of the distribution of the errors made in measuring the variables $x$ and $y$ in the individual steps of the procedure, the sum of these errors will tend to follow a normal distribution. This supports the assumption that the true residuals $\varepsilon_i$ (eqs. 1.4, 1.5 and 1.44), like the regression coefficients, are normally distributed [1]. For this reason the hypothesis tests applied to the regression coefficients, and to the other regression parameters, assume normality.

For regression methods that consider errors in all axes, and in particular for the method developed by Lisý and co-workers, Kalantar [64] showed that the regression coefficients do not follow a normal distribution. For this reason the third chapter studies how far the real distribution of the BLS regression coefficients deviates from a normal distribution. Depending on the degree of deviation, it will be determined whether it is reasonable to assume normality of the BLS regression coefficients in order to apply hypothesis tests.

1.5.2 Importance of lack of fit

In statistical tests based on linear regression it is important to ensure that the experimental values are adequately described by the regression model. If the experimental points do not fit the regression line well enough, the linear model may not be valid. In that case the experimental error s² will be overestimated and will not correctly measure the random errors present in the experimental values [65]. To guard against this, the correlation coefficient r (or its square, the coefficient of determination) is commonly used with the least-squares method. This coefficient, however, is not an informative indicator of the quality of the fit of the experimental points to the line, since it is not a statistical test [66]. Although residual plots make lack-of-fit detection based on this coefficient more reliable, the most correct way to detect lack of fit, albeit an experimentally costly one, is a test based on the analysis of variance (ANOVA) [1].

One of the clearest consequences of lack of fit appears when hypothesis tests are applied to the regression coefficients. The size of the confidence intervals calculated to detect significant differences from the theoretical values is a direct function of the experimental error s². Consequently, if there is lack of fit between the experimental points and the regression line, s² will be overestimated and the confidence intervals will be wider [67]. Under these circumstances there is a greater probability of mistaking systematic errors made in measuring the samples for random errors.

1.5.3 Probabilities of errors of the first and second kind

In hypothesis testing, accepting or rejecting the null hypothesis (H0) requires fixing a significance level, which sets the probability of rejecting H0 when it is in fact true. This error is known as an error of the first kind, or α error. Conversely, accepting the null hypothesis (H0) when the alternative hypothesis (H1) is in fact the correct one is an error of the second kind, also known as a β error [3]. Figure 1.6 shows the probabilities of committing α and β errors for an individual test on the intercept of the regression line.

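[Figure 1.6: distributions of the intercept b0 under H0: b0 = 0 and under H1: b0 = 0 + Δ, each with standard deviation s_b0, showing the bias Δ, the two tail areas corresponding to the α error probability and the area corresponding to the β error probability.]

Figure 1.6. Probabilities of committing α and β errors.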

The bias Δ is the minimum difference (set by the user) between the coefficient values postulated under H0 and H1 that is to be detected as a systematic error. Table 1.1 summarises the situations in which these two types of error are committed in hypothesis tests.


                      Conclusion from the test
Real situation        H0 true        H0 false
H0 true               Correct        α error
H0 false              β error        Correct

Table 1.1. Situations in which α and β errors can be committed.

Although more importance has traditionally been given to the α error probability, there are cases in which a low β error probability must be ensured. One such case is verifying the traceability [68] of an analytical method, where a high β error probability means that a method giving biased results is likely to be wrongly declared traceable to the reference method. Depending on the type of samples to be analysed, it may be preferable to invest more experimental effort and carry out more replicates when verifying traceability rather than risk using a method that may give biased results. An example is the bioequivalence study of two drugs, where it is normally more important not to accept wrongly that the activities of the two drugs are similar. This means ensuring that the error of the second kind is low [69].

On the other hand, there are also cases in which it is the α error probability that should be kept low. In pharmacological studies, for example, the risk of wrongly concluding that a substance acts as a drug should be minimal, to avoid using substances that have no therapeutic effect [70]. For this reason it has been suggested that the null and alternative hypotheses be postulated according to the type of error that has to be controlled [70].


It should also be stressed that the probabilities of committing errors of the first and second kind are related. If the α error probability is increased, the probability of wrongly accepting the null hypothesis, and hence the β error probability, decreases. As figure 1.6 shows, the β error probability also depends on the distance (bias, Δ) between the reference value (0 in this case) and the biased value (0+Δ in this case), and on the standard deviation of the regression coefficient (s_b0). Thus, for a fixed α error probability and a fixed bias, a third variable links the two error probabilities: the number of calibration samples. Increasing the number of samples can reduce the β error probability, because it makes it possible to reduce the standard deviation of the regression coefficient.
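The relationship between α, Δ, s_b0 and β described above can be made concrete with a short calculation. The sketch below is an approximation, not the procedure used in the thesis: it assumes a two-sided test of H0: b0 = 0, treats s_b0 as known and uses the normal distribution instead of Student's t. The function name `beta_probability` and the numerical values are hypothetical.

```python
from statistics import NormalDist


def beta_probability(alpha, bias, s_b0):
    """Approximate beta error for a two-sided test of H0: b0 = 0.

    Assumes the estimated intercept is normally distributed with a known
    standard deviation s_b0 and a true mean equal to `bias` (H1 true).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # half-width of the acceptance region
    shift = bias / s_b0                 # bias expressed in standard deviations
    # Probability that the standardised intercept still falls inside the
    # acceptance region of H0 although the true value is biased.
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


# A smaller s_b0 (for instance from more calibration samples) lowers beta.
for s_b0 in (1.0, 0.5, 0.25):
    print(s_b0, round(beta_probability(alpha=0.05, bias=1.0, s_b0=s_b0), 3))
```

With a zero bias the function returns 1 - α, and for a fixed bias β falls both when α is raised and when s_b0 is reduced, which is the trade-off the paragraph describes.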

1.6 Applications of linear regression considering errors in all axes

Although the most common use of linear regression in chemical analysis is the calibration of analytical methods, that is, establishing the mathematical relationship between instrumental responses and analyte concentration, there are other important applications in chemistry.

1.6.1 Calibration of analytical methods

This is probably the best-known use of linear regression in chemical analysis. In most cases the variances due to errors in the instrumental measurement are larger than those arising from the preparation of the calibration samples. In some cases, however, such as X-ray fluorescence applied to geological samples [25], the complexity of real samples means that calibration standards are replaced by certified reference materials. The concentration values of these materials carry variances due to the errors made in analysing the corresponding sample. Another case in which the variances due to errors in the concentrations of the calibration standards must be considered is radiocarbon dating, because the calibration standards are highly unstable over time.

1.6.2 Comparison of analytical methods

Two or more analytical methods can be compared at several concentration levels by means of linear regression. The aim is to check whether the coefficients of the regression line differ significantly from the theoretical values that would be found if the two methods gave identical results (unit slope and zero intercept). The results of the methods being compared normally have variances, due to measurement errors, of the same order of magnitude. For this reason, regression techniques that consider the variances of the errors made in measuring the individual samples are the most suitable [71].

1.6.3 Prediction

In linear calibration the prediction step is very important, since it is used to find the concentration of unknown samples from their instrumental response. In method comparison, prediction also matters because it can sometimes be of interest to know the value and uncertainty of a sample analysed by a new method from the values obtained with the established method.

1.7 References

1.- Draper N., Smith H., Applied Regression Analysis, 2nd ed., John Wiley &

Sons: New York, 1981.


2.- Galton Sir Francis, Journal of the Anthropological Institute, 15 (1885) 246-

263.

3.- Massart D.L., Vandeginste B.M.G., Buydens L.M.C., de Jong S., Lewi P.J., Smeyers-Verbeke J., Handbook of Chemometrics and Qualimetrics: Part A, Elsevier: Amsterdam, 1997.

4.- Martens H., Næs T., Multivariate Calibration, Wiley: Chichester, 1989.

5.- Rawlings J.O., Applied Regression Analysis: A Research Tool, Wadsworth &

Brooks/Cole Advanced Books & Software: Belmont, 1988.

6.- Beebe K.R., Kowalski B.R., Analytical Chemistry, 59 (1987) 1007A-1017A.

7.- Fuller W.A., Measurement Error Models, John Wiley & Sons: New York,

1987.

8.- Cheng C.L., Van Ness J.W., Statistical Regression with Measurement Error,

Kendall’s Library of Statistics 6, Arnold: London, 1999.

9.- Cheng C.L., Van Ness J.W., Journal of the Royal Statistical Society, Series B,

56 (1994) 167.

10.- Cheng C.L., Schneeweiss H., Journal of the Royal Statistical Society, Series

B, 60 (1998) 189.

11.- Chan L.K., Mak T.K., Journal of the Royal Statistical Society, Series B, 41

(1979) 263.

12.- Huwang L., Journal of Multivariate Analysis, 55 (1995) 230.

13.- Czapkiewicz A., Applicationes Mathematicae, 25 (1999) 401.

14.- Isogawa Y., Journal of the Royal Statistical Society, Series B, 47 (1985) 211.

15.- Edland S.D., Biometrics, 52 (1996) 243.

16.- Schaalje G.B., Butts R.A., Biometrics, 49 (1993) 1262.

17.- Sprent P., Models in Regression and related topics, Methuen & Co. Ltd.:

London, 1969.

18.- Mood A.M., Graybill F.A., Introduction to the Theory of Statistics,

McGraw-Hill: New York, 1963.

19.- Plackett R.L., Biometrika, 59 (1972) 239-251.

20.- Eisenhart C., Journal of the Washington Academy of Sciences, 54 (1964) 24.

21.- Myers R.H., Classical and Modern Regression with Applications, 2nd ed.,

Duxbury Press: Belmont, 1989.

22.- Irvin J.A., Quickenden T.I., Journal of Chemical Education, 60 (1983) 711-

712.

23.- Meloun M., Militký J., Forina M., Chemometrics for Analytical Chemistry

Volume 2. PC-aided Regression and Related Methods, Ellis Horwood: London,

1994.

24.- Spiegelman C.H., Waters R.L., Hungwu L., Chemometrics and Intelligent

Laboratory Systems, 11 (1991) 121.

25.- Bennett H., Olivier G., XRF Analysis of Ceramics, Minerals and Allied

Materials, John Wiley & Sons: New York, 1992.

26.- Clark R.M., Journal of the Royal Statistical Society, Series A, 142 (1979) 47.

27.- Clark R.M., Journal of the Royal Statistical Society, Series A, 143 (1980) 177.

28.- Adcock R.J., Analyst, 4 (1877) 183-184.

29.- Adcock R.J., Analyst, 5 (1878) 53-54.

30.- Kummel C.H., Analyst, 6 (1879) 97-105.

31.- Anderson R.L., Practical Statistics for Analytical Chemists, Van Nostrand

Reinhold: New York, 1987.

32.- Creasy M.A., Journal of the Royal Statistical Society, Series B, 18 (1956) 65-

69.

33.- Mandel J., Journal of Quality Technology, 16 (1984) 1-14.

34.- Hartmann C., Smeyers-Verbeke J., Penninckx W., Massart D.L.,

Analytica Chimica Acta, 338 (1997) 19-40.

35.- Van Huffel S., Vandewalle J., The Total Least Squares Problem.

Computational Aspects and Analysis, Siam: Philadelphia, 1991.

36.- Dolby G.R., Biometrika, 63 (1976) 39.

37.- Lindley D.V., Journal of the Royal Statistical Society / Series B, 9 (1947)

218-244.

38.- Schafer D.W., Puddy K.G., Biometrika, 83 (1996) 813-824.

39.- York D., Canadian Journal of Physics, 60 (1966) 1079.

40.- Reed B.C., American Journal of Physics, 60 (1992) 59.


41.- Reed B.C., American Journal of Physics, 57 (1989) 642.

42.- Williamson J.A., Canadian Journal of Physics, 46 (1968) 1845.

43.- Asuero A.G., González A.G., Microchemical Journal, 40 (1989) 216.

44.- Ogren P.J., Norton J.R., Journal of Chemical Education, 69 (1992) A130.

45.- González A.G., Márquez A., Fernández J., Computers & Chemistry, 16

(1992) 25.

46.- Neri F., Saitta G., Chiofalo S., Journal of Physics E, Scientific Instruments,

22 (1989) 215.

47.- Brooks C., Went I., Harre W., Journal of Geophysical Research, 73 (1968)

6071.

48.- Lwin T., Spiegelman C.H., Journal of the Royal Statistical Society C, 35

(1986) 256.

49.- Lybanon M., American Journal of Physics, 52 (1984) 22.

50.- Jefferys W.H., Astronomical Journal, 85 (1980) 177.

51.- Jefferys W.H., Astronomical Journal, 86 (1981) 149.

52.- Britt H.I., Luecke R.H., Technometrics, 15 (1973) 233.

53.- Powell D.R., MacDonald J.R., Computers Journal, 15 (1972) 148.

54.- Powell D.R., MacDonald J.R., Computers Journal, 16 (1973) 51.

55.- Cumming G.L., Rollett J.S., Rossotti F.J.C., Whewell R.J., Journal of the

Chemical Society, Dalton Transactions, 23 (1972) 2652.

56.- Press W.H., Teukolsky S.A., Computational Physics, 6 (1992) 274.

57.- Clutton-Brock M., Technometrics, 9 (1967) 261.

58.- Barker D.R., Diana L.M., American Journal of Physics, 42 (1974) 224.

59.- Orear J., American Journal of Physics, 52 (1984) 278.

60.- Lisý J.M., Cholvadová A., Kutej J., Computers & Chemistry, 14 (1990) 189-192.

61.- Riu J., Rius F.X., Journal of Chemometrics, 9 (1995) 343-362.

62.- Carroll R.J., Ruppert D., American Statistician, 50 (1996) 1-6.

63.- Personal communication from Professor C.L. Cheng, Institute of Statistics, Academia Sinica, Taipei, Taiwan, Republic of China.

64.- Kalantar A.H., Gelb R.I., Alper J.S., Talanta, 42 (1995) 597-603.



65.- Analytical Methods Committee, Analyst, 119 (1994) 2363-2366.

66.- Hunter J.S., Journal Association of Official Analytical Chemists, 64 (1996)

574.

67.- Hahn G.J., Meeker W.Q., Statistical Intervals, a Guide for Practitioners,

John Wiley & Sons: New York, 1991.

68.- Günzler H., Accreditation and Quality Assurance in Analytical Chemistry,

Springer-Verlag: Heidelberg, 1996.

69.- Steinijans V.W., Hauschke D., Clinical Research Regulatory Affairs, 10

(1993) 203.

70.- Hartmann C., Smeyers-Verbeke J., Penninckx W., Vander-Heyden Y.,

Vankeerberghen P., Massart D.L., Analytical Chemistry, 67 (1995) 4491.

71.- Riu J., Rius F.X., Analytical Chemistry, 68 (1996) 1851-1857.

CHAPTER 2

Lack of fit of the experimental points to the regression line that considers errors in both axes

2.1 Aim of the chapter


As indicated in the previous chapter, when the experimental points show lack of fit to the regression line, the estimated linear model may not be valid and the experimental error s² may be overestimated. The confidence intervals constructed to detect significant errors in the regression coefficients will therefore be wider [1]. As a result, the probability of failing to detect significant differences between the regression coefficients and the theoretical values (the probability of committing a β error) increases. In method comparison this can mean accepting as correct the results of an alternative analytical method that is not actually traceable to the reference method.

These reasons justify developing a statistical test to detect lack of fit of the experimental points to the BLS regression line, which takes into account the uncertainties arising from errors in measuring the predictor and response variables. Applying this test before any hypothesis test on the BLS regression coefficients gives an idea of how reliable the conclusions of those subsequent tests can be expected to be. As indicated in section 2.2, lack of fit is not as easy to detect with BLS regression as it can be with OLS. This is because in BLS lack of fit depends not only on the distance of the experimental points from the regression line but also on the size of the uncertainties generated by the errors in measuring the two variables. Points with larger uncertainties tend to fit the regression line worse, because the BLS line gives them less weight. Nevertheless, such points are unlikely to contribute much to the final lack-of-fit index. Conversely, experimental points with very small individual uncertainties tend to fit the regression line better, because BLS gives more weight to values whose measurement errors are assumed to be smaller. For these points, even a small distance from the regression line can increase the lack-of-fit index significantly. The wide variety of situations that can arise in real data sets, in terms of the distribution of the experimental values or the structure of their individual uncertainties, makes identifying lack of fit in BLS regression complex.

Section 2.2 reviews the approaches most commonly used to detect lack of fit of the experimental points to the OLS regression line. It also presents the results obtained by applying the χ² and ANOVA tests developed for the BLS method to simulated data sets. Section 2.3 presents the core of the work covered in this chapter, as part of the article Lack of fit in linear regression considering errors in both axes, published in the journal Chemometrics and Intelligent Laboratory Systems. Finally, section 2.4 presents the conclusions of the chapter.

2.2 Possible approaches for detecting lack of fit

The approaches for detecting lack of fit of the experimental values to the regression line include the coefficient of determination r² [2], the quality coefficient (QC) [3], the analysis of variance (ANOVA) [4] and the χ² test [5], the first being the most widely used. The coefficient of determination r² can be expressed as [2]:

$$ r^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{2.1} $$

where ȳ is the mean of the n values y_i. This coefficient measures the variation of the n values y_i about their mean ȳ that is explained by the regression line. Therefore r² can take values between 0 and 1, provided that replicate values are not considered when estimating the regression coefficients [2]. Otherwise r² cannot be equal to 1, since no linear model, however well it fits the experimental data, can explain the variation in the measurements due to experimental error [2]. Moreover, several authors [6-8] advise against using the coefficient of determination r², because it is a numerical index with no statistical meaning that, in addition, does not take into account the experimental errors made in measuring the individual replicates y_ij.

The quality coefficient also measures the degree of fit of the experimental values to the regression line, according to the expression:

$$ QC = 100 \times \sqrt{\frac{\sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{\bar{y}} \right)^2}{n-1}} \tag{2.2} $$

This coefficient measures the error that can be made when the regression line is used to predict the measured concentrations y_i. The worse the fit of the points to the regression line, the larger QC. This measure of lack of fit is preferable to the coefficient of determination r², since it gives a better idea of the scatter of the experimental values and indicates the error that can be made in predicting concentrations. It can also be used with more complex models by replacing the denominator of equation 2.2 with n-p, where p is the number of coefficients in the model [4]. Even so, this approach to detecting lack of fit still has no statistical basis and, when replicates are available, still ignores the experimental error made in measuring them.
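As a hedged illustration of the two indices just described, the following sketch fits an OLS line and evaluates the coefficient of determination (explained variation over total variation about the mean) and the quality coefficient (relative residuals scaled by the mean response, with n-1 in the denominator). The function names and the small data set are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt


def ols_line(x, y):
    """Ordinary least-squares intercept b0 and slope b1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def r_squared(y, y_hat):
    """Variation explained by the line divided by total variation about the mean."""
    y_mean = sum(y) / len(y)
    explained = sum((yh - y_mean) ** 2 for yh in y_hat)
    total = sum((yi - y_mean) ** 2 for yi in y)
    return explained / total


def quality_coefficient(y, y_hat):
    """Quality coefficient in percent, using the mean response as scale."""
    n = len(y)
    y_mean = sum(y) / n
    rel = sum(((yi - yh) / y_mean) ** 2 for yi, yh in zip(y, y_hat))
    return 100.0 * sqrt(rel / (n - 1))


# Hypothetical calibration data (concentration, response).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]
print(round(r_squared(y, y_hat), 4), round(quality_coefficient(y, y_hat), 2))
```

Neither number comes with a significance level, which is the limitation the text points out: they describe scatter but do not test whether it exceeds the experimental error.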

As for the test based on the analysis of variance, applying it to OLS regression requires replicates y_ij of the response variable at each value x_i of the predictor variable, with 1 ≤ i ≤ n and 1 ≤ j ≤ p_i, where p_i is the number of replicates at x_i. This approach considers the residual sum of squares SS_r to be made up of two parts [2,4]:

$$ SS_r = \sum_{i=1}^{n}\sum_{j=1}^{p_i} (y_{ij} - \hat{y}_i)^2 = \sum_{i=1}^{n}\sum_{j=1}^{p_i} (y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{n} p_i (\bar{y}_i - \hat{y}_i)^2 \tag{2.3} $$

The first term (the sum of squares due to pure error, SS_ε) measures the variation of the p_i replicates made on sample i (y_ij) about their mean ȳ_i. It therefore gives an idea of the uncertainty associated with the mean values of the response variable ȳ_i, due to the experimental errors made in measuring the replicates. The second term (the sum of squares due to lack of fit, SS_lof) measures the variation of the mean values of the response variable ȳ_i about the regression line. Figure 2.1 represents these two terms graphically.

[Figure 2.1: plot of y against x with the regression line ŷ = b0 + b1x and predictor values x_1, ..., x_i, ..., x_n. At x_i, the replicates y_i1, y_i2 and y_i3, their mean ȳ_i and the predicted value ŷ_i are marked, together with the distances y_ij - ȳ_i (SS_ε) and ȳ_i - ŷ_i (SS_lof).]

Figure 2.1. Decomposition of the residual distances between the replicates y_ij, their mean ȳ_i and the predicted value ŷ_i.

Dividing SS_ε and SS_lof by their respective degrees of freedom gives the corresponding mean squares MS_ε and MS_lof:

$$ MS_{lof} = \frac{SS_{lof}}{n-2} \tag{2.4} $$

$$ MS_{\varepsilon} = \frac{SS_{\varepsilon}}{\sum_{i=1}^{n} (p_i - 1)} \tag{2.5} $$

To detect lack of fit, the ratio F_cal = MS_lof / MS_ε is calculated and compared, at a significance level α, with the tabulated value of the F distribution with n-2 and Σ(p_i - 1) degrees of freedom, F_{1-α, n-2, Σ(p_i-1)}. If F_cal is greater than the tabulated value, it can be concluded that there is lack of fit, because the variation of the mean values of the response variable ȳ_i about the regression line (MS_lof) cannot be explained by the variation due to the pure experimental error (MS_ε) made in measuring the replicates.
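The OLS lack-of-fit test described above can be sketched in a few lines. This is a minimal illustration, not the BLS test developed later in the chapter: the function names are hypothetical, the line is fitted by OLS to all individual replicates, and the data are invented (a quadratic response with small replicate scatter) so that lack of fit is evident. The critical F value is not computed here and would be taken from tables, as in the text.

```python
def ols_line(x, y):
    """Ordinary least-squares intercept b0 and slope b1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def lack_of_fit_anova(x, replicates):
    """Return (F_cal, df_lof, df_pe) for replicates[i] = list of y_ij at x[i]."""
    xs = [xi for xi, reps in zip(x, replicates) for _ in reps]
    ys = [yij for reps in replicates for yij in reps]
    b0, b1 = ols_line(xs, ys)
    means = [sum(reps) / len(reps) for reps in replicates]
    # Pure-error sum of squares: replicates about their own mean.
    ss_pe = sum((yij - m) ** 2
                for reps, m in zip(replicates, means) for yij in reps)
    # Lack-of-fit sum of squares: means about the fitted line, weighted by p_i.
    ss_lof = sum(len(reps) * (m - (b0 + b1 * xi)) ** 2
                 for xi, reps, m in zip(x, replicates, means))
    df_lof = len(x) - 2
    df_pe = sum(len(reps) - 1 for reps in replicates)
    return (ss_lof / df_lof) / (ss_pe / df_pe), df_lof, df_pe


# Hypothetical curved data: y is close to x squared, two replicates per level.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
replicates = [[0.9, 1.1], [3.9, 4.1], [8.9, 9.1], [15.9, 16.1], [24.9, 25.1]]
f_cal, df_lof, df_pe = lack_of_fit_anova(x, replicates)
print(round(f_cal, 1), df_lof, df_pe)
```

For these invented data F_cal is several hundred with 3 and 5 degrees of freedom, far above the tabulated 95 % value of roughly 5.4, so the straight-line model would be rejected.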

For WLS regression, the Analytical Methods Committee [9] developed a lack-of-fit test that is also based on the analysis of variance. It is analogous to the one described for OLS, since it uses the same equations but weights the distances shown in figure 2.1 by the individual variances of the response variable (s²_yi). As for BLS regression, given its equivalence to OLS and WLS under the appropriate conditions (section 1.4.1.2 of the Introduction), section 2.3 of this chapter presents the expressions developed to detect lack of fit of the experimental values to the regression line on the basis of the analysis of variance.

Lack of fit of the experimental values to a regression line obtained by OLS, WLS or BLS can also be detected with a χ² test. This test assumes that the estimate of the experimental error s² is a random variable that follows a χ² distribution with n-2 degrees of freedom [5,9]. When the value of the experimental error exceeds the tabulated value for a significance level α, χ²_{1-α, n-2}, it is concluded that the points show lack of fit to the regression line. However, lack-of-fit detection with this test is regarded as approximate, because the assumption that the experimental error follows a χ² distribution is only correct when the number of experimental points is large [10], a condition rarely met in linear regression.

To check how well the χ² and ANOVA tests adapted to BLS regression detect lack of fit in different types of data, simulated data sets are used. These are generated with the Monte Carlo method [11,12], as explained in the Validation Process section of section 2.3 of this chapter. Both statistical tests are applied to each of these data sets at a given significance level α. If lack of fit is detected in α% of the data sets generated from an initial set that simulates a perfect fit of the experimental data to the regression line, it can be concluded that the theoretical expressions developed are correct. Figure 2.2 illustrates the validation process for a heteroscedastic initial data set simulating a perfect fit of the experimental values to the regression line.

[Figure 2.2: flow chart. From an initial data set (x, y), the Monte Carlo method generates 100,000 simulated data sets (1, 2, 3, ..., 100,000). The F test and the χ² test are applied to each simulated set at significance level α, each giving a yes/no answer to the question of lack of fit. The percentage of sets in which each test detects lack of fit is then compared with α.]

Figure 2.2. Validation process using the Monte Carlo method.
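The validation logic can be sketched with a much simpler stand-in for the tests of this chapter. The code below is an assumption-laden illustration, not the BLS procedure: it uses WLS with errors in y only, treats the individual standard deviations as known (the analogue of the '∞' replicates case), and applies the χ² test to the weighted residual sum of squares. The seven-point data set, the standard deviations and the function names are hypothetical; the critical value 11.0705 is the tabulated χ² value for 1-α = 0.95 and n-2 = 5 degrees of freedom.

```python
import random

CHI2_CRIT_95_DF5 = 11.0705  # tabulated chi-square value, 1 - alpha = 0.95, 5 d.o.f.


def wls_weighted_rss(x, y, s_y):
    """Weighted residual sum of squares of a WLS straight-line fit."""
    w = [1.0 / s ** 2 for s in s_y]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))


def false_lack_of_fit_rate(n_sets=20_000, seed=1):
    """Fraction of simulated perfect-fit data sets flagged as lack of fit."""
    rng = random.Random(seed)
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    s_y = [0.2, 0.5, 0.3, 0.8, 0.4, 0.6, 0.25]   # assumed heteroscedastic errors
    y_true = [1.0 + 2.0 * xi for xi in x]        # initial set lies on a line
    flagged = 0
    for _ in range(n_sets):
        y = [rng.gauss(yt, s) for yt, s in zip(y_true, s_y)]
        if wls_weighted_rss(x, y, s_y) > CHI2_CRIT_95_DF5:
            flagged += 1
    return flagged / n_sets


print(false_lack_of_fit_rate())
```

When the variances are known, the flagged fraction should stay close to the nominal 5 %; replacing the known standard deviations with estimates from a few replicates is what inflates it in the results reported below.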

This validation process was applied to the different types of initial data sets described in the Experimental Section of section 2.3 of this chapter. Table 1 shows the lack-of-fit results obtained with the expressions developed for BLS regression and with the existing ones for WLS regression [9]. For clarity, table 1 only presents the results for the seven-point data sets in the most complex case, that is, initial sets with random heteroscedasticity. These sets are shown in figure 1 of section 2.3. The 'Replicates' column gives the number of replicates generated for each point of the initial set, from which the new mean values of the points (x̄_i, ȳ_i) and their standard deviations s_xi and s_yi were calculated. To study the effect of a large number of replicates (denoted by the symbol '∞'), data sets were simulated in which the standard deviations of each new point (x̄_i, ȳ_i) were equal to the standard deviations associated with the points in the initial data set.

α (%)   Replicates   F test (BLS)   χ² test (BLS)   F test (WLS)   χ² test (WLS)
10      3            19.5           28.9            35.1           44.6
10      6            14.7           17.8            21.0           24.9
10      ∞             9.9           10.1            10.1           10.1
5       3            13.0           21.7            26.8           36.9
5       6             8.4           11.0            13.4           17.2
5       ∞             4.9            4.9             4.9            5.0
1       3             5.1           11.8            13.7           25.8
1       6             2.5            4.1             5.3            8.4
1       ∞             0.9            1.1             1.0            1.1

Table 1. Percentage of the 100,000 simulated data sets in which lack of fit is detected by the two statistical tests.

As table 1 shows, for the lowest numbers of replicates the percentage of simulated data sets in which lack of fit was detected, both with the tests developed for BLS and with those already known for WLS [9], is higher than the significance level α set in each case, because the number of replicates is insufficient to give correct estimates of the uncertainties of the initial set. Thus, for points whose estimated uncertainties (variation due to pure experimental error) are smaller than the initial (true) uncertainties, it is harder for the F test to explain the variation of the mean values (x̄_i, ȳ_i) about the regression line. These points with underestimated uncertainties (which are more likely the fewer the replicates) contribute heavily to F_cal and therefore increase the probability of detecting lack of fit. This is why, as table 1 reflects, the number of replicates generated to estimate the new simulated points and their individual uncertainties is crucial for detecting lack of fit correctly. The table shows that WLS is more sensitive to this effect than BLS. The reason is that with WLS the replicates are used to estimate only one standard deviation (s_yi) per point, whereas with BLS two are estimated (s_xi and s_yi). A single estimated standard deviation (WLS) is therefore more likely to be underestimated than two at the same time (BLS).

It should also be noted that the χ² test results in table 1 are never lower than those of the F test at any significance level α, and are clearly higher when the number of replicates is small. This shows that the χ² test wrongly detects lack of fit more often than the F test when the variances of the new simulated points are estimated from a limited number of replicates. As mentioned above, this is because the assumption that the experimental error s² follows a χ² distribution is only correct when the number of experimental points is large enough [10]. In addition, as with the F test, the χ² results for WLS regression are higher than those for BLS whenever the number of replicates is finite. As explained above, WLS requires only one standard deviation to be estimated per point, which is more likely to be underestimated than two at once in BLS. For WLS, therefore, the weighted residual sum of squares, and hence the experimental error s², is more likely to be overestimated. It is thus more likely that the critical value χ²_{1-α, n-2} for a significance level α is exceeded with WLS and, consequently, that lack of fit is detected.

Finally, when the new points are assigned the same variances as the points of the initial set (simulating a large number of replicates, symbol '∞'), the percentage of cases in which lack of fit is detected is approximately equal to the significance level set in each case. This shows that, given good estimates of the experimental values (x̄_i, ȳ_i) and of the individual uncertainties, the expressions developed to detect lack of fit for BLS, like those already known for WLS [9], are correct.

2.3 Chemom. Intell. Lab. Syst., 54 (2000) 61-73


2.3 Lack of fit in linear regression considering errors in both

axes (Chemometrics and Intelligent Laboratory Systems, 54

(2000) 61-73).

Àngel Martínez*, Jordi Riu, F. Xavier Rius

Department of Analytical and Organic Chemistry.

Institute of Advanced Studies. Universitat Rovira i Virgili.

Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.

ABSTRACT

Testing for lack of fit of the experimental points to the regression

line is an important step in linear regression. When lack of fit exists,

standard deviations for both regression line coefficients are overestimated

and this gives rise, for instance, to confidence intervals that are too large. If

these confidence intervals are then used in hypothesis tests, bias may not be

detected so there is a greater probability of committing a β error. In this

paper we present a statistical test which analyses the variance of the

residuals from the regression line whenever the data to be handled have

errors in both axes. The theoretical expressions developed were validated

by applying the Monte Carlo simulation method to two real and nine

simulated data sets. Two other real data sets were used to provide

examples of application.

INTRODUCTION

Linear regression has two fundamental uses in analytical chemistry:

it relates the instrumental responses to the analyte concentration (i.e. it

establishes the calibration line within the quantitative analytical process)

and compares analytical methodologies over a set concentration range. For



some analytical methods, such as X-ray fluorescence (XRF), certified

reference materials (CRM) are often used as calibration standards because

real samples (i.e. geological materials) [1] are too complex. For this reason

uncertainties are associated with both the CRM concentration values and the instrumental responses (predictor and response variables), which are placed on the x and y axes respectively. In method comparison studies, replicate measurements of a set of samples containing the analyte of interest at different concentration levels are carried out by the two methods to be

compared. Results can be placed in both axes with their respective

uncertainties and regressed on each other. In this way, bias in the method

being tested can be detected, for instance, by using the joint confidence

interval test for the slope and the intercept of the regression line which was

obtained considering the errors in both axes [2].

Ordinary least-squares (OLS), or weighted least-squares (WLS)

which considers heteroscedasticity in the response variable, are probably

the most widely used regression techniques. However, they are of limited

scope because they consider that the x axis is free of error. For this reason,

OLS and WLS should not be applied in the cases described above since the

uncertainties associated with the results in both axes are usually of the same order of magnitude. An alternative may be errors-in-variables regression [3], also called the constant variance ratio (CVR) approach [4-6]. This

regression method considers the errors in both axes but does not take into

account the individual uncertainties of each experimental point. It also assumes that the ratio of the variances of the response and predictor variables is constant for every experimental point ($\lambda = s_y^2/s_x^2$). A particular case is the orthogonal regression method (OR) [7], in which the errors are of the same order of magnitude in the response and in the predictor variables (i.e. λ=1). Bivariate least squares (BLS) regression techniques [8,9] are another option because they take into account individual non-constant errors in both axes to calculate the regression coefficients.
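For comparison with BLS, the CVR approach has a closed-form solution. The sketch below uses the standard Deming-regression slope formula, which is not given in the text; `lam` stands for λ as defined above, and the function name `cvr_fit` and the sample values are illustrative assumptions.

```python
import math

def cvr_fit(x, y, lam=1.0):
    """Fit y = b0 + b1*x assuming a constant error-variance ratio.

    lam is the ratio s_y^2 / s_x^2 (lam = 1 gives orthogonal regression).
    Individual uncertainties of the points are NOT used, which is the
    limitation that motivates BLS. Returns (b0, b1).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope is undefined")
    d = syy - lam * sxx
    b1 = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    b0 = my - b1 * mx
    return b0, b1

# Illustrative values only (not taken from the paper)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = cvr_fit(x, y, lam=1.0)
print(b0, b1)
```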



It is essential to check whether lack of fit exists before a statistical

hypothesis test is applied to the regression line coefficients. Confidence

intervals calculated with data which contain lack of fit lead to oversized

regions [10]. With the BLS regression method, using these confidence regions in a statistical hypothesis test may prevent constant or proportional bias from being detected in the calibration line [11] or, in method comparison studies, may lead to the results of an alternative analytical method being wrongly considered unbiased (i.e. a β error) [12]. To

prevent these misleading situations in OLS and WLS regression techniques,

residual plots can be used together with statistical tests to detect lack of fit

[13,14]. For this reason, in this paper we present a statistical test which is

adapted to detect lack of fit under BLS regression conditions. This test is

based on the analysis of the variance of the residuals from the regression

line (Anova) obtained when errors in both axes are taken into account.

Two real data sets were used in the validation process to check if the

conclusions reached about the existence of lack of fit under OLS, WLS and

BLS regression conditions were similar. Simulated data sets randomly

generated from nine different initial data sets using the Monte Carlo

method [15,16], were also used to validate the theoretical expressions

proposed. In addition, two more real data sets were considered to provide

real examples of detecting lack of fit in experimental data which have

errors in both axes, using the test based on the analysis of the variance of

the residuals.



BACKGROUND AND THEORY

Notation

In general, the true values of the variables used throughout this

study are represented with Greek characters, and their estimates are

represented with Latin ones. Thus the true values of the BLS regression

coefficients are written β0 (intercept) and β1 (slope), while their respective

estimates are written as b0 and b1. The estimates of the standard deviation

of the intercept and the slope for the BLS regression line are written as $s_{b_0}$ and $s_{b_1}$ respectively. The true experimental error (residual mean square error), expressed in terms of variance for the n experimental data pairs (xi,yi), is referred to as $\sigma^2$, while its estimate is $s^2$. Values xi and yi of each experimental data pair are the mean values of the pi replicate measurements of the ith sample, xij and yij (1 ≤ j ≤ pi), by both methods. Predictions of the experimental mean values xi and yi are symbolised as $\hat{x}_i$ and $\hat{y}_i$.

Bivariate Least-Squares Regression (BLS)

Of all the least squares approaches for calculating the regression

coefficients when there are errors in both axes, Lisý’s method [8] (referred

to as BLS) was found to be the most suitable [9]. This technique assumes

the true linear model to be:

$$\eta_i = \beta_0 + \beta_1 \xi_i \qquad (1)$$

The true variables ξi and ηi are unobservable. Only the experimental

variables can be observed:


$$x_i = \xi_i + \delta_i \qquad (2)$$

$$y_i = \eta_i + \gamma_i \qquad (3)$$

The random errors committed in the measurement of variables xi and yi are represented by variables δi and γi, where $\delta_i \sim N(0, \sigma_{x_i}^2)$ and $\gamma_i \sim N(0, \sigma_{y_i}^2)$. In this way, when eqs. 2 and 3 are introduced in eq. 1 and the variable yi is isolated, the following expression is obtained:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \qquad (4)$$

The term εi is the ith true residual error with $\varepsilon_i \sim N(0, \sigma_{\varepsilon_i}^2)$ [17] and can be expressed as a function of δi, γi and β1:

$$\varepsilon_i = \gamma_i - \beta_1 \delta_i \qquad (5)$$
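The substitution behind eqs. 4 and 5 can be written out step by step. The last line, which gives the variance of εi, additionally assumes that δi and γi are independent; this is an assumption made here for illustration, and a non-zero covariance is handled by the extra term in eq. 7.

```latex
\begin{aligned}
y_i &= \eta_i + \gamma_i = \beta_0 + \beta_1 \xi_i + \gamma_i \\
    &= \beta_0 + \beta_1 (x_i - \delta_i) + \gamma_i \\
    &= \beta_0 + \beta_1 x_i + \underbrace{(\gamma_i - \beta_1 \delta_i)}_{\varepsilon_i} \\[4pt]
\sigma_{\varepsilon_i}^2 &= \sigma_{y_i}^2 + \beta_1^2 \sigma_{x_i}^2
\end{aligned}
```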

To estimate the regression line coefficients whenever there are

errors in both variables, several authors have developed procedures based

on a maximum likelihood approach [3,18-20]. In most cases these methods

need the true predictor variable to be carefully modelled [18]. This is not

usually possible in chemical analysis, where the true predictor variables $\xi_i$ are not often randomly distributed (i.e. functional models are assumed). Moreover, there are cases in which the experimental data are heteroscedastic and estimates of measurement errors are only available through replicate measurements (i.e. the ratio between $\sigma_{x_i}$ and $\sigma_{y_i}$ can be non-constant or unknown).

These conditions, common in chemical data, make it very difficult to

rigorously apply the principle of maximum likelihood to the estimation of

the regression line coefficients. On the other hand, there is a method to

estimate the regression coefficients using a maximum likelihood approach



even when a functional model is assumed [17]. This method is not

rigorously applicable when individual heteroscedastic measurement errors

are considered. It has been shown that when the ratio between $\sigma_{x_i}$ and $\sigma_{y_i}$ is assumed to be a constant λ for any i, least squares methods provide the same estimates of the

regression coefficients as maximum likelihood estimation approaches [21].

For these reasons, we have chosen an iterative least squares method (i.e. the

BLS method) that can be applied to any group of ordered pairs of

observations with no assumptions about the probability distributions [21].

This means that this method can be applied to real chemical data when

individual heteroscedastic errors in both axes are considered. In this way,

the BLS regression method relates the observed variables xi and yi as

follows [22]:

$$y_i = b_0 + b_1 x_i + e_i \qquad (6)$$

The term ei is the observed ith residual error. The variance of ei is $s_{e_i}^2$ and will be referred to as the weighting factor. This parameter takes into consideration the experimental variances of any individual point in both axes ($s_{x_i}^2$ and $s_{y_i}^2$) obtained from replicate analysis. The covariance between the variables for each (xi,yi) data pair, which is normally assumed to be zero, is also taken into account:

$$s_{e_i}^2 = \mathrm{var}(y_i - b_0 - b_1 x_i) = s_{y_i}^2 + b_1^2 s_{x_i}^2 - 2 b_1 \,\mathrm{cov}(x_i, y_i) \qquad (7)$$

The BLS regression method finds the estimates of the regression line coefficients by minimising the sum of the weighted residuals, S, which is known to follow a χ2 distribution with n-2 degrees of freedom [23]:

$$S = \sum_{i=1}^{n} \left[ \frac{(y_i - \hat{y}_i)^2}{s_{y_i}^2} + \frac{(x_i - \hat{x}_i)^2}{s_{x_i}^2} \right] = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = (n-2)\,s^2 \qquad (8)$$

where s2 is the estimate of the residual mean squared error, also known as experimental error. Therefore the BLS regression technique assigns less importance to those data pairs with larger $s_{x_i}^2$ and $s_{y_i}^2$ values, that is to say, the most imprecise data pairs. By minimising the sum of the weighted residuals (eq. 8), two non-linear equations are obtained, from which the regression coefficients b0 and b1 can be estimated by means of an iterative process [2].
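One way to carry out such an iteration is sketched below. It is obtained by setting the partial derivatives of S with respect to b0 and b1 to zero and solving the resulting 2×2 system repeatedly with the weighting factors of the previous step; it is not necessarily the exact procedure of ref. [2]. The names `bls_fit` and `weighted_sum` are hypothetical, the covariance term of eq. 7 is taken as zero, and every data pair is assumed to have a non-zero variance in at least one axis.

```python
def weighted_sum(b0, b1, x, y, sx2, sy2):
    """Sum of weighted squared residuals S (eq. 8) with cov(x_i, y_i) = 0."""
    return sum((yi - b0 - b1 * xi) ** 2 / (vy + b1 * b1 * vx)
               for xi, yi, vx, vy in zip(x, y, sx2, sy2))

def bls_fit(x, y, sx2, sy2, tol=1e-10, max_iter=200):
    """Iteratively estimate b0 and b1 by minimising S.

    x, y     : mean values of each data pair
    sx2, sy2 : variances of each data pair on the x and y axes
    Returns (b0, b1, S).
    """
    n = len(x)
    # Starting values: ordinary least squares
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    b0 = my - b1 * mx
    for _ in range(max_iter):
        w = [vy + b1 * b1 * vx for vx, vy in zip(sx2, sy2)]   # eq. 7
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]       # residuals
        # Normal equations from dS/db0 = 0 and dS/db1 = 0
        a11 = sum(1.0 / wi for wi in w)
        a12 = sum(xi / wi for xi, wi in zip(x, w))
        a22 = sum(xi * xi / wi for xi, wi in zip(x, w))
        g1 = sum(yi / wi for yi, wi in zip(y, w))
        # The second term comes from the dependence of the weights on b1
        g2 = sum(xi * yi / wi + b1 * vx * (ei / wi) ** 2
                 for xi, yi, wi, vx, ei in zip(x, y, w, sx2, e))
        det = a11 * a22 - a12 * a12
        new_b0 = (g1 * a22 - g2 * a12) / det
        new_b1 = (a11 * g2 - a12 * g1) / det
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1, weighted_sum(b0, b1, x, y, sx2, sy2)
```

Dividing the returned S by n-2 gives the estimate s2 of eq. 8.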

It should be noted that the BLS regression method is equivalent to

the WLS and OLS methods in the appropriate regression conditions [2].

Thus, when uncertainty is only available for the experimental values on the

y-axis, estimates of the BLS regression line coefficients are the same as those

estimated with the WLS regression technique. This is because the BLS

weighting factor (eq. 7) reduces to the one assumed by the WLS method,

that is $s_{e_i}^2 = s_{y_i}^2$ when the uncertainties in the x-axis are zero [14]. On the

other hand, when null and constant uncertainties (homoscedasticity) are

considered in the x and y axes respectively (OLS regression conditions), the

BLS weighting factor in eq. 7 becomes a constant, which means that the

sum of the weighted residuals (eq. 8) minimized by BLS becomes identical

to the one assumed by the OLS regression method. For this reason, the

estimates of the BLS regression line coefficients under homoscedastic

conditions are the same as the ones from the OLS regression method.
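Both reductions follow directly from eqs. 7 and 8, as the short derivation below shows. The symbol $s_y^2$ for the common variance in the homoscedastic case is introduced here for illustration only.

```latex
% WLS conditions: s_{x_i}^2 = 0 and cov(x_i, y_i) = 0
s_{e_i}^2 = s_{y_i}^2 + b_1^2 \cdot 0 - 2 b_1 \cdot 0 = s_{y_i}^2
\quad\Longrightarrow\quad
S = \sum_{i=1}^{n} \frac{(y_i - b_0 - b_1 x_i)^2}{s_{y_i}^2}

% OLS conditions: additionally s_{y_i}^2 = s_y^2 for every i
S = \frac{1}{s_y^2} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2
```

A constant factor $1/s_y^2$ does not move the position of the minimum, so the same b0 and b1 minimise both S and the ordinary unweighted sum of squared residuals.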

Lack of fit

Lack of fit of the experimental points to the regression line under

BLS regression conditions may not be as easy to recognise as it may be



under other regression methods that do not account for individual

uncertainties in the experimental data (e.g. OLS). This is because under BLS

conditions lack of fit not only depends on the distance of the experimental

points from the regression line but also on the magnitude of the

uncertainties on both axes for each individual data pair. So, data pairs with

larger uncertainties will tend to be further from the BLS regression line

since this regression method does not give much importance to low-

precision data pairs. This kind of situation, however, may not make an

important contribution to the overall lack of fit index. On the contrary, data

pairs with low individual uncertainties should lie near the BLS regression

line. This is because the BLS method gives more importance to high-

precision data pairs, that are supposed to have lower measurement errors.

In these cases, small deviations from the regression line may make an important contribution to the overall lack of fit index. The wide variety of situations that can arise when real heteroscedastic data are used makes it very difficult to observe the existence of lack of fit and identify its possible

causes (dispersion of the data, outliers or non-linearities). This means that

a lack of fit test is required that can be used under BLS regression

conditions.

When no lack of fit exists in the BLS regression line, the observed

linear model (eq. 6) can be assumed to be correct and the weighted

residuals can be assumed to follow a normal distribution with mean 0 and

standard deviation $\sigma_{\varepsilon_i}$. If, however, lack of fit is present, the regression model may not be correct and the residual mean square error s2 (eq. 8) will tend to be overestimated and may not provide a correct measure of the

random variation present in the experimental data pairs [12]. In a work by

Williamson [23], goodness of fit was tested when errors were considered to

be present in both axes by applying a χ2 test on the residual mean square

error estimate s2. This is a random variable that can be approximated by a

χ2 distribution with n-2 degrees of freedom. However, this is regarded as a



rough test for detecting lack of fit, and the test based on the analysis of the

residual variance [13] is habitually preferred. This is because a chi-squared

distribution is justified by the asymptotic theory only in large samples [24].

This condition is not usually met in linear regression, where the number of

samples is limited.

As well as using a statistical test for detecting lack of fit, it is also

advisable to take a look at the plot of the weighted residuals from the BLS

regression line [25]. This plot provides a view of the individual residual of

each experimental point corrected by the corresponding weighting factor

(eq. 7). So it gives a better view of the data structure and hence of the

possible causes of lack of fit of the experimental points to the regression

line (low-precision data, outliers or non-linearities).
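The values shown in such a plot could be computed as in the minimal sketch below. It assumes that "corrected by the weighting factor" means dividing each residual by the square root of eq. 7 with zero covariance; this reading, and the function name `weighted_residuals`, are assumptions rather than something the text states.

```python
import math

def weighted_residuals(x, y, sx2, sy2, b0, b1):
    """Residuals from the line y = b0 + b1*x, each divided by s_e_i.

    sx2, sy2 are the individual variances of each data pair; the
    covariance term of eq. 7 is taken as zero.
    """
    out = []
    for xi, yi, vx, vy in zip(x, y, sx2, sy2):
        w = vy + b1 * b1 * vx                 # weighting factor, eq. 7
        out.append((yi - b0 - b1 * xi) / math.sqrt(w))
    return out
```

Plotting these values against xi gives a picture in which a precise point lying slightly off the line stands out as clearly as an imprecise point lying far from it.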

Test for detecting lack of fit under BLS conditions

This lack of fit test is based on an analysis of the variance of the

residuals (Anova). Given the equivalence between the BLS, WLS and OLS

regression methods under the appropriate experimental conditions, the

expressions for the lack of fit test developed for the BLS method should be

analogous to those for the OLS [14] and WLS [13] regression techniques. In

this way, the variation of the n group means xi and yi (1 ≤ i ≤ n) around the

regression line (sum of squares due to lack of fit, SSlof) is compared with the

variance of the n data pairs due to pure experimental uncertainty (sum of

squares due to pure error, SSε), generated by pi replicate measurements on

each sample. These two sources of variation are included in the residual

sum of squares from regression, or total sum of squares (SSr) [14]. The new

expressions take into account the residuals in both x and y axes to evaluate

SSr according to the next equation:


$$SS_r = \sum_{i=1}^{n} \sum_{j=1}^{p_i} \left[ \frac{(y_{ij} - \hat{y}_i)^2}{s_{y_i}^2} + \frac{(x_{ij} - \hat{x}_i)^2}{s_{x_i}^2} \right] \qquad (9)$$

This expression provides the sum of the weighted squared distances

between each single replicate measurement $x_{ij}$ or $y_{ij}$ and the corresponding predicted mean value $\hat{x}_i$ or $\hat{y}_i$. Analogously to the sum of weighted residuals S, variable SSr can also be assumed to follow a χ2 distribution with $\sum_{i=1}^{n} p_i - 2$ degrees of freedom, since the only difference between equations 8 and 9 is the use of the replicate measurements $x_{ij}$ and $y_{ij}$, which can be assumed to be normally distributed around the mean values $x_i$ and $y_i$ [26]. The residual sum of squares from regression accounts for both the

lack of fit of the experimental mean values xi and yi around the BLS

regression line and the dispersion of the pi replicate measurements around

their respective experimental mean values in both axes. Because these

distances are divided by the individual variances, less importance is

assigned to those data pairs with higher uncertainties in both axes (i.e. the

most imprecise data pairs) and vice versa.

The sum of squares due to lack of fit (SSlof) included in the total sum

of squares in eq. 9 can be expressed as:

$$SS_{lof} = \sum_{i=1}^{n} p_i \left[ \frac{(y_i - \hat{y}_i)^2}{s_{y_i}^2} + \frac{(x_i - \hat{x}_i)^2}{s_{x_i}^2} \right] \qquad (10)$$

This equation gives the sum of the weighted squared distances between the

experimental and the predicted mean values in both axes, which is due to

lack of fit of the data pairs around the BLS regression line. Since eq. 10 is

very similar to eq. 8, it is clear that the variable SSlof, like the sum of



weighted residuals S, follows a χ2 distribution, with n-2 degrees of

freedom. It should be noted that a data pair with a high number of

replicates is likely to have lower individual uncertainties, and thus show a

better fit to the BLS regression line than another data pair with the same

experimental mean value obtained with a lower number of replicates. To

offset this effect, the term pi in eq. 10 gives greater importance to the

residuals of those data pairs with a higher number of replicates. Finally, the

sum of squares due to pure error (SSε) included in eq. 9 can be calculated by subtracting the sum of squares due to lack of fit (eq. 10) from SSr:

SSε = SSr - SSlof (11)

Because variables SSr and SSlof follow a χ2 distribution, it is clear from eq. 11 that the sum of squares due to pure error SSε also has a χ2 distribution with $\sum_{i=1}^{n} (p_i - 1)$ degrees of freedom. An F-test can be used to compare the sums of squares SSlof and SSε because they follow χ2 distributions with n-2 and $\sum_{i=1}^{n} (p_i - 1)$ degrees of freedom respectively [26]. To apply the F-test these two variables (eqs. 10 and 11) first have to be divided by the appropriate degrees of freedom:

$$MS_{lof} = \frac{SS_{lof}}{n-2} \qquad (12)$$

$$MS_{\varepsilon} = \frac{SS_{\varepsilon}}{\sum_{i=1}^{n} (p_i - 1)} \qquad (13)$$

The F-ratio is therefore given by:


$$F_{cal} = \frac{MS_{lof}}{MS_{\varepsilon}} \qquad (14)$$

If no lack of fit exists, Fcal can be expected to be a random variable drawn from an $F_{n-2,\,\sum_{i=1}^{n}(p_i-1)}$ distribution. In this case, Fcal will be lower than the corresponding $F_{1-\alpha,\,n-2,\,\sum_{i=1}^{n}(p_i-1)}$ tabulated value for a given level of significance α. It should be pointed out that correctly estimating the individual uncertainties $s_{x_i}^2$ and $s_{y_i}^2$ is very important. If the uncertainties of the points are extremely low, the regression line will tend to fit these points perfectly. However, very slight deviations from the regression line may cause lack of fit to be detected in the data set. This is because the terms $s_{x_i}^2$ and $s_{y_i}^2$ that appear in the denominator of eqs. 9 and 10 make the Fcal value in eq. 14 very sensitive to small deviations of abnormally high-precision data pairs from the regression line. Although these situations are not frequent in experimental data, one should pay special attention to the fact that repeated measurements should comprise all the experimental variability of the measurement.
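Eqs. 9 to 14 can be transcribed literally as in the sketch below. Two points are assumptions rather than statements from the text: the predicted point on the line is taken as the weighted projection of the mean (xi, yi) onto the line, and the covariance term of eq. 7 is zero. The function name `lack_of_fit` is hypothetical, all variances must be strictly positive, and the tabulated F value has to be supplied by the user.

```python
def lack_of_fit(xrep, yrep, sx2, sy2, b0, b1):
    """Lack of fit Anova for a fitted line y = b0 + b1*x.

    xrep, yrep : one list of replicate measurements per sample
    sx2, sy2   : individual variances on each axis (all > 0)
    Returns a dict with the sums of squares, the degrees of freedom
    (numerator, denominator) and the calculated F ratio.
    """
    n = len(xrep)
    ss_r = ss_lof = 0.0
    dof_pe = 0
    for xs, ys, vx, vy in zip(xrep, yrep, sx2, sy2):
        p = len(xs)
        xm, ym = sum(xs) / p, sum(ys) / p      # experimental means x_i, y_i
        w = vy + b1 * b1 * vx                  # weighting factor, eq. 7
        e = ym - b0 - b1 * xm                  # residual of the mean
        xh = xm + b1 * vx * e / w              # assumed predicted x
        yh = b0 + b1 * xh                      # predicted y lies on the line
        ss_r += sum((yj - yh) ** 2 / vy + (xj - xh) ** 2 / vx
                    for xj, yj in zip(xs, ys))                     # eq. 9
        ss_lof += p * ((ym - yh) ** 2 / vy + (xm - xh) ** 2 / vx)  # eq. 10
        dof_pe += p - 1
    ss_pe = ss_r - ss_lof                                          # eq. 11
    ms_lof = ss_lof / (n - 2)                                      # eq. 12
    ms_pe = ss_pe / dof_pe                                         # eq. 13
    return {"SS_r": ss_r, "SS_lof": ss_lof, "SS_pe": ss_pe,
            "dof": (n - 2, dof_pe), "F_cal": ms_lof / ms_pe}       # eq. 14
```

Lack of fit would be reported when the returned `F_cal` exceeds the tabulated F value for the returned degrees of freedom. With this choice of predicted point, SSε obtained by subtraction equals the weighted sum of squared deviations of the replicates around their own means.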

Validation Process

Since the theoretical expressions 9-14 are a result of adapting

expressions obtained from OLS and WLS regression methods, they need to

be validated. For this reason, a validation process was designed to prove

that correct results are provided by the theoretical expressions for the lack

of fit test considering BLS regression conditions (i.e. lack of fit is detected

when it exists and not detected when it does not). Two strategies were

followed to carry out this validation.



The first one was designed to validate the theoretical expressions

under OLS, WLS and BLS regression conditions using two real data sets.

Uncertainties in both axes for both data sets were modified so that

individual uncertainties in the y axis were approximately constant (i.e.

homoscedasticity) and much higher than the ones in the x axis. This

uncertainty structure, imposed for the BLS regression method, is similar to

the one assumed by WLS (heteroscedastic and null uncertainties in the y

and x axes respectively) and OLS (constant and null uncertainties in the y

and x axes respectively) methods. In this way, if eqs. 9-14 were correct,

conclusions about the existence of lack of fit reached using the lack of fit

test considering uncertainties in both axes should be similar to the ones

reached under OLS or WLS conditions.

The second strategy checked whether lack of fit could be correctly

detected with the expressions developed under BLS conditions using

simulated data. Nine initial simulated data sets in two different groups

were considered; in six of them all the data pairs perfectly fitted a straight

line (simulating no lack of fit) and in the remaining three they fitted a curve

(simulating lack of fit due to a non-linearity). From these two groups of

initial simulated data sets, and using the Monte Carlo method [15,16],

100,000 new simulated data sets were generated. This simulation method

adds a random error to each initial data pair based on the individual

uncertainties present in both axes. In this way, replicates are randomly

generated for each data pair in every new simulated data set. The new xi

and yi values for each data pair were calculated from the mean value of the

pi simulated replicates xij and yij. The individual uncertainties in both axes

for each new data pair (i.e. $s_{x_i}^2$ and $s_{y_i}^2$) were considered equal to the true uncertainties from the data pairs in the initial simulated data sets. This ensured that possible errors in the detection of lack of fit from the validation process could only be due to the theoretical expressions and not to an inaccurate estimation of the individual uncertainties in both axes (as could happen if they were calculated from the corresponding pi simulated replicates).
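The data-generation step described above can be sketched as follows. Only the generation of one simulated data set is covered, not the regression or the test itself. The function name `simulate_data_set`, the number of replicates `p`, the seed and the initial values are illustrative assumptions; the text does not give them.

```python
import random

def simulate_data_set(x_true, y_true, sx, sy, p, rng):
    """Generate one simulated data set from an initial data set.

    Each replicate is the initial value plus a normally distributed
    random error whose standard deviation is the individual
    uncertainty (sx, sy) of that data pair.
    Returns (x_means, y_means, x_replicates, y_replicates).
    """
    xrep, yrep = [], []
    for xt, yt, sxi, syi in zip(x_true, y_true, sx, sy):
        xrep.append([rng.gauss(xt, sxi) for _ in range(p)])
        yrep.append([rng.gauss(yt, syi) for _ in range(p)])
    x_means = [sum(r) / p for r in xrep]
    y_means = [sum(r) / p for r in yrep]
    return x_means, y_means, xrep, yrep

# Hypothetical initial data set lying exactly on the line y = x
rng = random.Random(0)
x0 = [0.0, 25.0, 50.0, 75.0, 100.0]
y0 = list(x0)
sd = [2.0] * len(x0)          # homoscedastic pattern, illustrative value
xm, ym, xr, yr = simulate_data_set(x0, y0, sd, sd, p=5, rng=rng)
```

Repeating the call many times and applying the lack of fit test to each generated set, with the true variances as weights, reproduces the structure of the validation described here.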

When lack of fit was detected in approximately an α% of the data

sets generated from an initial data set which simulated no lack of fit for a

level of significance α, it could be concluded that the theoretical

expressions provided correct results (i.e. lack of fit is only detected in an

α% of the cases, when it does not exist). This conclusion would be

confirmed if lack of fit were to be detected in most of the data sets

generated from an initial data set which simulated lack of fit.

EXPERIMENTAL SECTION

Data sets and software

Nine initial simulated data sets with different characteristics (such

as number of data pairs or uncertainty patterns, Figs. 1a-1i) and two real

data sets (Figs. 2a and 2b) were used for the validation process. In six of the

simulated data sets goodness of fit was simulated by perfectly fitting all the

data pairs to a straight line (Figs. 1a-1f), whereas in the other three all the

data pairs followed a non-linear pattern simulating lack of fit (Figs. 1g-1i).

Moreover two supplementary real data sets (Figs. 2c and 2d) were also

used as application examples of the Anova test for detecting lack of fit

under BLS regression conditions.

Simulated data sets

Number of data pairs: Three data sets were composed of seven data

pairs (Figs. 1a-1c) and the other six contained twenty-one data pairs each

(Figs. 1d-1i). In all cases the data pairs were randomly distributed within a

linear range from 0 to 100 units.



Uncertainties: Homoscedastic data sets were composed of data pairs

with constant standard deviations (Figs. 1a, 1d and 1g). The heteroscedastic

data sets were divided into two different groups: those with standard

deviations which increased by 10% (Figs. 1b, 1e and 1h) for each individual

xi and yi value and those with random standard deviations (Figs. 1c, 1f and

1i) that were never higher than 10% of each individual xi and yi value.

[Figure 1: nine scatter plots, panels a) to i), each showing Y Data against X Data.]

Figure 1. Initial data sets used to generate the simulated data sets in the validation process. Crosses around the data pairs represent the standard deviations associated with each individual mean value in both axes.



For each of the nine simulated data sets, three levels of significance

were considered; 10%, 5% and 1%.

Real data sets

Two real data sets were used to validate the expressions proposed

for the lack of fit test under BLS regression conditions. In these data sets,

the individual uncertainties were modified so that they had a structure

similar to the ones assumed by OLS or WLS regression methods. In

addition two other supplementary real data sets were used to provide real

application examples of the test for detecting lack of fit when errors were

considered in both axes.

Data Set 1 [27]. This data set was composed of eight data pairs

generated from the determination of eight polycyclic aromatic

hydrocarbons in various environmental matrices at different concentration

levels through a stepwise interlaboratory study approach. The two

analytical methods being compared are GC/MS-SIM (on the x axis) and

GC-ECD (on the y axis). The linear range is between zero and 6 µg/g (Fig.

2a).

Data Set 2 [28]. Comparative study of two multiresidue methods for

the determination of organochlorine insecticides and polychlorinated

biphenyl congeners in fatty processed foods. Eight data pairs represent the

results from the analysis of α-endosulfan using HPLC (results on the x axis) and GC (results on the y axis). The linear interval is between eighty

and a hundred and ten units presented as percentage of recovery (Fig. 2b).


[Figure 2, panels a) and b): a) GC-ECD against GC-MS/SIM; b) GC against HPLC.]

Figure 2. Regression lines for the two real data sets used in the validation process under BLS (solid line), WLS (dashed line) and OLS (dotted line) conditions; (a) data set 1 and (b) data set 2. Crosses around the data pairs represent the standard deviations associated to each individual mean value in both axes.

Data Set 3 [29]. Twelve data pairs distributed between eighty and a

hundred and five units presented as percentage of recovery. Results are

obtained by analysing two synthetic pyrethroid residues (permethrin and

cypermethrin) in fruits, vegetables and grains at six concentration levels for

each analyte using two chromatographic methods based on GC-ECD with

an acetonitrile extraction system and two different types of columns; wide

bore (on the x axis) and narrow bore (on the y axis). Uncertainties in both

axes for each data pair are the result of a six measurement analysis. Units

are expressed as a percentage of recovery (Fig. 2c).

Data Set 4 [30]. The composition of this data set is similar to the one

described for data set 3. In this case, however, the solvent used in the

extraction system is acetone and the number of synthetic pyrethroid

residues analyzed at different concentration levels is increased to eight.

This produces thirty-three data pairs distributed between eighty and a

hundred units expressed as a percentage of recovery (Fig. 2d).


[Figure 2 (cont.), panels c) and d): GC-ECD narrow bore against GC-ECD wide bore.]

Figure 2 (cont.). BLS Regression lines for real data sets 3 (c) and 4 (d). Crosses around the data pairs represent the standard deviations associated to each individual mean value in both axes.

All the computational work was performed with home-made

Matlab subroutines (Matlab for Microsoft Windows ver. 4.0, The

Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Validation Process

Table 1 shows the results concerning the existence of lack of fit in

the two real data sets used in the validation process (data sets 1 and 2). In

all the cases the level of significance α was set at 5%. The calculated and

tabulated values of the F parameter are in the ‘Fcal’ and ‘Ftab’ columns for the

three regression methods (‘Reg.’ column). In this way lack of fit was

detected when the Fcal values were higher than the tabulated values for the

given level of significance. The number of degrees of freedom are

summarised in the ‘d.o.f.’ column; the first and second values correspond

to the number of degrees of freedom of the numerator and the denominator

in eq. 14 respectively. The columns ‘n’ and ‘p’ present the number of



samples and measurements per sample (considered constant in the cases

studied) respectively.

Data Set   n    p    d.o.f.    Reg.   Fcal   Ftab
1          8    11   6, 80     BLS    1.15   2.21
                               WLS    0.96
                               OLS    0.69
2          8    3    6, 16     BLS    3.40   2.74
                               WLS    2.82
                               OLS    2.83
3          12   6    10, 60    BLS    0.33   1.99
4          33   6    31, 165   BLS    2.02   1.42

Table 1. Results of applying the lack of fit test to the four real data sets for a level of significance of 5%.
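The degrees of freedom and the decisions in Table 1 can be checked with a few lines. The sketch below uses only the BLS rows of the table and assumes, as stated for the cases studied, a constant number of replicates p per sample; the function name `lof_dof` is hypothetical.

```python
def lof_dof(n, p):
    """Degrees of freedom of the F ratio in eq. 14 for constant p."""
    return n - 2, n * (p - 1)

# (data set, n, p, Fcal, Ftab) for the BLS rows of Table 1
table1_bls = [
    (1, 8, 11, 1.15, 2.21),
    (2, 8, 3, 3.40, 2.74),
    (3, 12, 6, 0.33, 1.99),
    (4, 33, 6, 2.02, 1.42),
]

for data_set, n, p, f_cal, f_tab in table1_bls:
    verdict = "lack of fit" if f_cal > f_tab else "no lack of fit"
    print(data_set, lof_dof(n, p), verdict)
```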

Data Set 1. As can be seen in Table 1, the application of the lack of fit

test in this data set under BLS, WLS and OLS regression conditions showed

that no lack of fit was detected in any case. The weighted residual plots for

the three regression methods show considerable similarity (Fig. 3). This is

because the structure of the uncertainties in both axes (under BLS

conditions) is very similar to the structure assumed by WLS and OLS

methods. In the three cases the fifth data pair is the only one between the

warning and the action limits (twice and three times the sum of the

residuals, S in eq. 8, respectively). Figure 2a shows that this data pair is not

only the furthest from the regression line, but also has one of the lowest

individual uncertainties and thus is one of the most precise data pairs.

This forces most of the data pairs to be placed above the regression line.

The weighted residual of this data pair is higher than the others and this

suggests that this point might be considered as an outlier. For this reason, a

test for detecting outliers under BLS regression conditions is needed [25].


[Figure 3: residuals plotted against Concentration Level, with the warning and action limits marked.]

Figure 3. Residual plots for data set 1 under BLS, WLS and OLS regression conditions. Residuals were weighted for BLS and WLS according to the different regression conditions.

Data Set 2. In this example, lack of fit was detected by the lack of fit

test under the three regression conditions (Table 1). As in the previous

example the weighted residual plots (Fig. 4) show considerable similarity.

In the three cases, there are two data pairs (second and third) that are near

the warning limits and might be the cause of lack of fit. Figure 2b shows

that because of the high degree of homoscedasticity in the data set, the data

pairs with the highest weighted residuals are those which are further from

the regression line. As in the previous example, they are near the warning

limits in the weighted residual plot (Fig. 4) and need not, therefore, be

immediately eliminated but it would be interesting to check if they can be

considered as outliers under each of the three regression conditions.


[Figure 4: residuals plotted against Concentration Level, with the warning and action limits marked.]

Figure 4. Residual plots for data set 2 under BLS, WLS and OLS regression conditions. Residuals were weighted for BLS and WLS according to the different regression conditions.

These two examples demonstrate that the conclusions reached

about the existence of lack of fit are similar to the conclusions reached

under WLS and OLS conditions, when the structure of the uncertainties in

both axes is similar to the structure assumed by the WLS and OLS

regression methods. This suggests that the expressions developed for the

lack of fit test which consider errors in both axes provide results which are

consistent with those obtained under OLS or WLS conditions.

Table 2 summarises the percentages of the 100,000 simulated data

sets generated using the Monte Carlo method in which lack of fit was

detected (‘l.o.f.’ column) for the nine initial simulated data sets at the three

levels of significance. The three different uncertainty patterns considered

(homoscedasticity and constant and random heteroscedasticity) are

summarized in the ‘Uncertainty’ column.


n     Uncertainty     α%    l.o.f.
7     homo.           10    9.92
                      5     5.13
                      1     0.94
      hetero.         10    9.87
                      5     4.96
                      1     0.95
      hetero. rnd.    10    9.52
                      5     4.93
                      1     0.92
21    homo.           10    9.95
                      5     4.84
                      1     0.92
      hetero.         10    9.86
                      5     4.84
                      1     1.12
      hetero. rnd.    10    9.68
                      5     4.57
                      1     1.12
21*   homo.           10    96.23
                      5     92.55
                      1     86.27
      hetero.         10    97.85
                      5     94.25
                      1     88.46
      hetero. rnd.    10    98.93
                      5     96.32
                      1     90.53

Table 2. Percentages of detection of lack of fit in simulated data sets with homoscedasticity (homo.), proportional heteroscedasticity (hetero.) and random heteroscedasticity (hetero. rnd.) during the validation process. The symbol (*) denotes the existence of lack of fit in the initial simulated data set.

A paired t-test [31] (with α=5%) was applied to the results obtained

for the different uncertainty patterns of the data sets which simulated no

lack of fit in Table 2. The differences between the lack of fit percentages and

the percentages of α were not significant. So, the percentages of cases in

which lack of fit was wrongly detected do not significantly differ from the

different levels of significance α that set the probability of wrongly

detecting lack of fit. For this reason, it can be concluded that the theoretical

expressions adapted to perform the lack of fit test whenever errors in both

axes are present provide correct results. Moreover, results were best in

those simulated data sets with homoscedastic uncertainties, followed by

those from data sets with constant and random heteroscedasticity.
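As a rough illustration of this comparison, the sketch below computes a paired t statistic between the observed percentages and the nominal α levels for each uncertainty pattern, using the values read from Table 2 for the data sets without lack of fit (n = 7 and n = 21). The grouping of the values by pattern, the pooling of both sample sizes and the function name `paired_t` are assumptions of this sketch, not a description of how the test was actually run. The critical value 2.571 is the tabulated two-sided 5% Student's t value for 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Nominal significance levels (%) paired with each observed percentage.
ALPHA = [10, 5, 1, 10, 5, 1]

# Observed lack-of-fit percentages as read from Table 2 (n = 7, then n = 21).
# The assignment of values to patterns is this sketch's reading of the table.
OBSERVED = {
    "homo.": [9.92, 5.13, 0.94, 9.95, 4.84, 0.92],
    "hetero.": [9.87, 4.96, 0.95, 9.86, 4.84, 1.12],
    "hetero. rnd.": [9.52, 4.93, 0.92, 9.68, 4.57, 1.12],
}

T_CRIT = 2.571  # tabulated t for alpha = 5% (two-sided), 5 degrees of freedom


def paired_t(observed, nominal):
    """Return the paired t statistic for the differences observed - nominal."""
    diffs = [o - a for o, a in zip(observed, nominal)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


results = {name: paired_t(obs, ALPHA) for name, obs in OBSERVED.items()}
for name, t in results.items():
    verdict = "not significant" if abs(t) < T_CRIT else "significant"
    print(f"{name:13s} t = {t:6.2f} ({verdict})")
```

Under these assumptions none of the three statistics exceeds the critical value, and the largest deviation from the nominal levels appears for random heteroscedasticity.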



The capability of the test to correctly detect lack of fit was also

checked. Table 2 presents the results of applying the lack of fit test to the

simulated data sets generated using the Monte Carlo method on the three

initial data sets with data pairs following non-linear patterns with different

kinds of uncertainties (i.e. showing an evident lack of fit to the regression

line, Figs. 1g-1i). Lack of fit was detected in a higher percentage than the set

level of significance α in the three cases (i.e. lack of fit was correctly

detected when it existed). Table 2 also shows that lack of fit was detected

most in those data sets with random and constant heteroscedasticity

respectively.

Application of the lack of fit test to real data sets under BLS conditions

Data Set 3. Results of applying the lack of fit test to this real data set

are reflected in Table 1. It can be seen that no lack of fit was detected under

BLS regression conditions, as the Fcal value is lower than the tabulated one

for 10 and 60 degrees of freedom and a level of significance of 5%. This

conclusion seems to be consistent with the data structure observed in the

weighted residual plot (Fig. 5a), in which the dispersion of the data appears

to be moderate except for the 6th data pair. As Figure 2c shows, the high

residual for the 6th data pair is because it is furthest from the regression

line, although it is the most imprecise data pair (i.e. the one with highest

individual uncertainties). The weighted residual for this data pair in Figure

5a appears between the warning and the action limits and thus, a test for

detecting outliers is necessary here.

Data Set 4. In this example lack of fit was detected considering the

uncertainties in both axes, because the Fcal value is higher than the

tabulated one for 31 and 165 degrees of freedom and a level of significance

of 5% (Table 1). The weighted residual plot (Fig. 5b), suggests that there is a

non-linear trend at the higher concentration levels. This could not be



observed in the plot of the BLS regression line (Fig. 2d) as the large number

of data pairs with their respective uncertainties provides an unclear image,

typical in relatively large data sets. This example shows the advantages of

using the weighted residual plot, particularly when working with medium

and large data sets. The possible non-linear pattern observed might

therefore cause the dispersion of the data observed in Fig. 2d which causes

the detection of lack of fit. From this conclusion the experimenter should

review the GC-ECD methodology being tested to search for the causes of

non-linear responses at higher concentration levels.

Figure 5. Weighted residual plots (weighted residuals against concentration level) under BLS regression conditions for data set 3 (a) and data set 4 (b).

CONCLUSIONS

In this paper we have proposed and validated a statistical test

which detects lack of fit of the BLS regression line to data with errors in

both axes, based on the analysis of the variance (Anova) of the residuals.

When the uncertainty structure in both axes considered by the BLS

technique is similar to the one assumed by OLS and WLS, conclusions

about lack of fit were similar in all three cases. The test has also provided

correct results when detecting lack of fit in simulated data.


The fact that BLS requires replicate measurements of each sample to

perform the lack of fit test does not represent an additional analytical effort

since the BLS method needs replicate measurement data to find the

coefficients of the regression line. This is not so for OLS which needs an

additional analytical effort to be made to provide the replicate

measurement values so that the analogous test can be applied. Despite the

fact that our work suggests that the lack of fit test is suitable when data

have errors in both axes, we recommend that the plot of the weighted

residuals be used to complement the statistical analysis. In this way, the

data structure can be visualised and the reasons for a hypothetical lack of

fit explained.

Finally, if lack of fit is detected in the experimental data, the linear

model may not be correct and the experimental error estimate s2 will no

longer provide an accurate estimate of the random error present in the

experimental data. If lack of fit is caused by the existence of outliers, these

experimental points should be tested and removed if necessary. Depending

on the magnitude of the individual uncertainties, lack of fit might also be

caused by different degrees of dispersion of experimental data pairs

around the regression line. In such cases, if no outliers are identified, the

analytical methodology should be revised to search for unexpected
measurement errors in those problematic data pairs, and new analyses
should be performed for each sample. Special attention should be paid to

the presence of data pairs with abnormally low individual uncertainties,

since very slight deviations from the BLS regression line may cause lack of

fit to be detected in the data set. If, however, lack of fit is due to non-linear

data and the causes for non-linear responses are neither known nor

identified, polynomial regression considering errors in both axes [32]

should be considered.



ACKNOWLEDGMENTS

The authors thank the DGICyT (project no. BP96-1008) for financial

support, and the University Rovira i Virgili for providing a doctoral

fellowship to A. Martínez.

BIBLIOGRAPHY

1.- K. Govindaraju, I. Roelandts, 1988 Compilation report on trace elements

in Six ANRT rock Reference Samples: Diorite DR-N, Serpentine UB-N,

Bauxite BX-N, Disthene DT-N, Granite GS-N and Potash Feldspar FK-N,

Geostandards Newsletter, 13 (1989) 1 5-67.

2.- J. Riu, F.X. Rius, Assessing the accuracy of analytical methods using

linear regression with errors in both axes, Anal. Chem. 68 (1996) 1851-1857.

3.- W.A. Fuller, Measurement Error Models, John Wiley & Sons, New York,

1987.

4.- R.L. Anderson, Practical Statistics for Analytical Chemists, Van

Nostrand Reinhold, New York, 1987.

5.- M.A. Creasy, Confidence limits for the gradient in the linear

functional relationship, J. Roy. Stat. Soc. B 18 (1956) 65-69.

6.- J. Mandel, Fitting straight lines when both variables are subject to error,

J. Qual. Tech. 16 (1984) 1-14.

7.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, D.L. Massart,

Detection of bias in method comparison by regression analysis, Anal. Chim.

Acta 338 (1997) 19-40.

8.- J.M. Lisý, A. Cholvadová, J. Kutej, Multiple straight-line least-squares

analysis with uncertainties in all variables, Comput. Chem. 14 (1990) 189-

192.

9.- J. Riu, F.X. Rius, Univariate regression models with errors in both axes, J.

Chemom. 9 (1995) 343-362.



10.- G.J. Hahn, W. Q. Meeker. Statistical Intervals, a guide for practitioners,

John Wiley & Sons, New York, 1991.

11.- A. Martínez, J. del Río, J. Riu and F. X. Rius, Chemometrics Intell. Lab.

Syst. 49 (1999) 179-193.

12.- A. Martínez, J. Riu and F. X. Rius submitted for publication.

13.- Analytical Methods Committee, Is my calibration linear?, Analyst 119

(1994) 2363-2366.

14.- Draper, N.; Smith, H. Applied Regression Analysis, 2nd ed.: John Wiley &

Sons: New York, 1981;pp 5-128.

15.- P. C. Meier and R. E. Zund, Statistical Methods in Analytical Chemistry,

John Wiley & Sons, New York, 145-150, 1993.

16.- O. Güell and J.A. Holcombe, Analytical applications of Monte Carlo

techniques, Analytical Chemistry, 60 (1990) 529A - 542A.

17.- P. Sprent, Models in Regression and related topics, Methuen & Co. Ltd.:

London, 1969.

18.- D. W. Schafer and K. G. Purdy, Likelihood analysis for errors-in-variables
regression with replicate measurements, Biometrika 83 (1996) 813-824.

19.- K. C. Lai and T. K. Mak, Maximum likelihood estimation of a linear

structural relationship with replication, J. R. Statist. Soc. B 41 (1979) 263-268.

20.- C. L. Cheng and J. W. van Ness, On estimating linear relationships

when both variables are subject to error, J. R. Statist. Soc. B 56 (1994) 167-

183.

21.- D. V. Lindley, Regression lines and the linear functional relationship, J.

R. Statist. Soc./ London Suppl. Series B, 9 (1947) 218-244.

22.- G. A. F. Seber, Linear regression analysis, John Wiley & Sons: New York,

1977; pp. 160-211.

23.- J. H. Williamson, Least-squares fitting of a straight line, Can. J. Phys. 46

(1968) 1845-1847.

24.- P. Bentler and D. G. Bonett, Significance tests and goodness of fit in the

analysis of covariance structures, Psychological Bulletin 88 (1980) 588-606.



25.- J. del Río, J. Riu, F. X. Rius, in preparation.

26.- A. M. Mood, F. A. Graybill, Introduction to the Theory of Statistics,

McGraw-Hill, New York (1963).

27.- P. de Vogt, J. Hinschberger, E. A. Maier, B. Griepink, H. Muntau, J.

Jacob, Improvements in the determination of eight polycyclic aromatic

hydrocarbons through a stepwise interlaboratory study approach,

Fresenius J. Anal. Chem., 356 (1996) 41-48.

28.- A. Sannino, P. Mambriani, M. Bandini and Luciana Bolzoni,

Multiresidue method for determination of organochlorine insecticides and

polychlorinated biphenyl congeners in fatty processed foods, J. AOAC Int.

79 (1996) 1434-1446.

29.- G. F. Pang, Y. Z. Chao, C. L. Fan, J. J. Zhang and X. M. Li, Modification

of AOAC multiresidue method for determination of synthetic pyrethroid

residues, vegetables, and grains. Part I: Acetonitrile extraction system and

optimization of florisil cleanup and gas chromatography, J. AOAC Int. 78

(1995) 1481-1488.

30.- G. F. Pang, Y. Z. Chao, C. L. Fan, J. J. Zhang, X. M. Li and Y. M. Liu,

Modification of AOAC multiresidue method for determination of synthetic

pyrethroid residues, vegetables, and grains. Part II: Acetone extraction

system, J. AOAC Int. 78 (1995) 1489-1495.

31.- D. L. Massart, B.M.G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.

Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics:

Part A, Elsevier, Amsterdam, 1997.

32.- J.M. Lisý, A. Cholvadová and B. Dbroná, Polynomial (linear in

parameters) least squares analysis when all experimental data are subject to

random errors, Comput. Chem. 15 (1991) 135-141.

2.4 Conclusions

Apart from the conclusions drawn from the article presented in Section 2.3,
others can be drawn from the results obtained with simulated data. It was
found that enough replicates must be carried out to estimate correctly both
the experimental values ($x_i$, $y_i$) and their individual variances
($s_{x_i}^2$, $s_{y_i}^2$). These results are of limited applicability under
real analysis conditions, since a high number of replicates per sample would
drastically increase both the time and the cost of the analysis. Even so, they
clearly show the strengths and limitations of the two tests studied for
detecting lack of fit of the experimental points to the regression lines
obtained by the WLS or BLS methods. Knowing these limitations is essential
when establishing the experimental design, since it may sometimes be
advisable to reduce the number of samples of different concentrations to be
analysed and to increase the number of replicates.

It has also been shown that when the number of replicates carried out to
estimate the experimental values and their individual uncertainties is
limited, the ability of the F test under BLS regression conditions to detect
lack of fit correctly is greater than that of the χ² test. Another point worth
noting is that applying the F test in the BLS regression method, unlike in the
OLS method, does not entail greater experimental effort, since the BLS method
already uses the individual uncertainties to estimate the regression
coefficients. However, as will be seen in Section 5.4 of Chapter 5, there are
cases in which the replicate values from which the points ($x_i$, $y_i$) and
their individual uncertainties are estimated cannot be known. In these cases
the only way to detect lack of fit of the experimental points to the BLS
regression line is the χ² test, even though it is less rigorous than the
F test.


2.5 References



2.- Draper N., Smith H., Applied Regression Analysis, 2nd ed., John Wiley &

Sons: New York, 1981.

3.- Vankeerberghen P., Smeyers-Verbeke J., Chemometrics and Intelligent

Laboratory Systems, 15 (1992) 195-202.




5.- Williamson J.H., Canadian Journal of Physics, 46 (1968) 1845-1847.

6.- Analytical Methods Committee, Analyst, 113 (1988) 1469.

7.- Miller J.N., Spectroscopy International, 3 (1991) 41-43.

8.- Hunter J.S., Journal Association of Official Analytical Chemists, 64 (1981)

574.


10.- Bentler P., Bonett D. G., Psychological Bulletin, 88 (1980) 588-606.

11.- Meier P.C., Zund R.E., Statistical Methods in Analytical Chemistry, John

Wiley & Sons: New York, 1993.

12.- Güell O., Holcombe J.A., Analytical Chemistry, 60 (1990) 529A - 542A.

CHAPTER 3

Probabilities of type I and type II errors in the individual tests on the intercept and the slope in linear regression considering errors in both axes


Having established in the previous chapter the importance of detecting lack
of fit of the experimental points to the regression line obtained by the BLS
method, in this chapter we focus on different aspects of the statistical tests
based on individual confidence intervals applied to the BLS regression
coefficients. In method comparison studies, these tests make it possible to
detect constant or proportional errors in the results of the new method
relative to those of the reference method. In linear calibration, applying
individual confidence intervals to the intercept and the slope makes it
possible to determine whether blank corrections are needed (by checking
whether the intercept of the regression line is significantly different from
an established value) or to assess the efficiency of recovery processes
(through the individual confidence interval for the slope).

One of the most important aspects to take into account when applying an
individual test to one of the BLS regression coefficients is the possibility
of committing a β error. As discussed in Section 1.5.3 of the Introduction,
its importance can be very great depending on the analytical problem at hand.
When individual tests are applied in method comparison, a high probability of
β error could lead to the acceptance of a new analytical methodology that
gives results with significant proportional or constant errors with respect
to those of the reference method. In calibration processes, a high
probability of β error could lead to blank corrections not being applied when
they are in fact necessary, or to failing to detect that the recovery level
is significantly different from a given value. For these reasons, Sections
3.2 and 3.4 present the expressions needed to estimate the probability of β
error when individual tests are applied to the regression coefficients
calculated by the BLS method. Section 3.3 presents a procedure for
determining the number of samples needed to build the calibration line so as
to have fixed probabilities of committing α and β errors when detecting a
given bias (∆) in the BLS regression coefficients.

Section 3.4 contains the bulk of the work covered in this chapter, as part of
the article Detecting proportional and constant bias in method comparison
studies by using linear regression with errors in both axes, published in the
journal Chemometrics and Intelligent Laboratory Systems. Finally, Section 3.5
presents the conclusions drawn from this chapter.

3.2 Estimation of the probability of type II error in the application of individual tests on the regression coefficients

To apply the individual tests based on Student's t distribution to the
regression coefficients obtained by the BLS method, it must be checked that
they follow a normal distribution. It is known, however, that the regression
coefficients estimated by the BLS method do not follow a normal distribution
[1]. For this reason, the Background and Theory section of Section 3.4
studies the degree of deviation from normality of these regression
coefficients. As that study shows, the error made by not considering the
uncertainties in the measurement of the predictor variable (the case of the
OLS and WLS methods, in which the regression coefficients do follow a normal
distribution) is greater than the error made by assuming that the BLS
regression coefficients are normally distributed. The Results and Discussion
section of Section 3.4 shows that the deviation from normality of the
regression coefficients estimated by the BLS method is low enough to accept
the hypothesis that these coefficients are normally distributed. These
results justify the development of individual tests for the BLS regression
coefficients under the assumption of normality.

Since these individual tests are hypothesis tests, the theoretical values of
the regression coefficients from which the null (H0) and alternative (H1)
hypotheses are postulated, and against which the estimated regression
coefficients will be compared, must first be set, as described in Section 1.5
of the Introduction. Throughout this work it has been considered that the
distributions followed by the theoretical values in the case of the slope
($b_{1H_0}$ and $b_{1H_1}$) have standard deviations different from that of
the estimated regression coefficients. This is because, unlike in the OLS and
WLS methods, in the BLS regression method the standard deviation of the slope
depends directly on the weighting factor $s_{e_i}^2$ (see equations 8 and 9
in the Background and Theory section of Section 3.4), which in turn depends
directly on the value of the slope (eq. 1.36). Therefore, a larger slope
value has an associated distribution with a higher standard deviation, and
vice versa. It should be noted that this distribution is a Student's t, since
the number of points on the regression line is usually low [2]. The
confidence interval is built around the value from which the null hypothesis
is postulated, for a significance level α, according to the expression:

$$b_{0H_0} \pm t_{\alpha/2,\,n-2} \cdot s_{b_{0H_0}} \tag{3.1}$$

where $b_{0H_0}$ is the theoretical value of the intercept for which H0 is
postulated (in this case set to 0) and $s_{b_{0H_0}}$ is its standard
deviation, which is equal to that estimated for the intercept ($s_{b_0}$),
since the weighting factor $s_{e_i}^2$ is independent of the value of the
intercept. For the slope the analogous expression is:

$$b_{1H_0} \pm t_{\alpha/2,\,n-2} \cdot s_{b_{1H_0}} \tag{3.2}$$

In this case the value of $b_{1H_0}$ for which H0 is postulated is 1. If the
value of the estimated regression coefficient falls within the corresponding
confidence interval, H0 is accepted and H1 is rejected. There is then the
possibility of committing a β error, since H0 is being accepted when the
correct hypothesis may actually be H1. This situation is represented
graphically as the area of the distribution corresponding to H1 that lies
within the confidence interval around the reference value considered to
postulate H0:

Figure 3.1. Representation of the α and β error probabilities in the application of the individual test for the slope, considering different distributions for H0 and H1. The β error probability is calculated from a fixed significance level α.

This area represents the probability that a regression coefficient with a
value equal to that postulated by H1 (with a bias ∆ previously defined by the
experimenter) could be wrongly considered equal to the reference value
established by H0, for a significance level α, because of the random errors
made in the experimental measurements. This probability can be estimated from
equations 6 and 7 presented in the Background and Theory section of
Section 3.4.
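The area described above can be sketched numerically. The example below is a minimal illustration under a normal approximation (Student's t replaced by the standard normal distribution). It is not a transcription of equations 6 and 7 of Section 3.4. The function name `beta_probability`, its parameters and the numerical values are hypothetical.

```python
from statistics import NormalDist


def beta_probability(b_h0, b_h1, s_h0, s_h1, alpha=0.05):
    """Approximate P(accept H0 | H1 true) for a two-sided individual test.

    b_h0, s_h0: coefficient value postulated by H0 and its standard deviation.
    b_h1, s_h1: biased value postulated by H1 and its standard deviation.
    Normal quantiles are used in place of Student's t (assumption).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = b_h0 - z * s_h0  # limits of the acceptance interval around H0
    upper = b_h0 + z * s_h0
    h1 = NormalDist(mu=b_h1, sigma=s_h1)
    # Area of the H1 distribution that falls inside the acceptance interval.
    return h1.cdf(upper) - h1.cdf(lower)


# Hypothetical slope test: H0 slope = 1, bias of 0.1, different spreads.
beta = beta_probability(b_h0=1.0, b_h1=0.9, s_h0=0.05, s_h1=0.045)
print(f"approximate beta = {beta:.3f}")
```

With no bias and equal standard deviations the function returns 1 − α, as expected, since the two distributions then coincide.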

Another way of estimating the probability of committing a β error is to
consider the maximum significance level for which H0 would be accepted
(provided that this value is acceptable to the analyst, e.g. higher than 5%)
instead of the α value fixed initially. In this way, the probabilities of β
error are lower for those estimates of the regression coefficients that are
closest to the reference value established by H0, since these are the ones
most likely to belong to the distribution associated with H0. The following
figure shows this second approach to estimating the β error probabilities for
individual tests on the slope.

Figure 3.2. Representation of the α and β error probabilities in the application of the individual test for the slope, considering different distributions for H0 and H1. The β error probability is calculated from the maximum significance level α for which no bias is detected.
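This second approach can be sketched in the same way. In the example below the acceptance interval is made just wide enough to reach the estimated coefficient, which gives the maximum α for which H0 is still accepted, and β is the area of the H1 distribution inside that interval. The normal approximation, the function name `beta_at_max_alpha` and all numerical values are assumptions made for illustration.

```python
from statistics import NormalDist


def beta_at_max_alpha(b_est, b_h0, b_h1, s_h0, s_h1):
    """Return (alpha_max, beta) for an estimated coefficient b_est.

    alpha_max is the largest two-sided significance level at which b_est
    still falls inside the interval around b_h0 (normal approximation).
    """
    half_width = abs(b_est - b_h0)  # interval limit placed on the estimate
    alpha_max = 2 * (1 - NormalDist().cdf(half_width / s_h0))
    h1 = NormalDist(mu=b_h1, sigma=s_h1)
    beta = h1.cdf(b_h0 + half_width) - h1.cdf(b_h0 - half_width)
    return alpha_max, beta


# Two hypothetical slope estimates tested against H0: slope = 1.
for b_est in (0.99, 0.95):
    alpha_max, beta = beta_at_max_alpha(b_est, 1.0, 0.9, 0.05, 0.045)
    print(f"b1 = {b_est}: alpha_max = {alpha_max:.3f}, beta = {beta:.3f}")
```

The estimate closer to the reference value yields the larger maximum α and the smaller β, which is the behaviour described in the text.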



It should be borne in mind that when the number of calibration samples is
low, the estimates of the experimental error s² tend to be too high [3]. As a
result, the standard deviations of the BLS regression coefficients (eqs. 8
and 9 in Section 3.4) and, therefore, the distributions associated with both
H0 and H1 (when the individual test is applied to the slope) are larger than
they should be, and the probability of β error is overestimated. To
compensate for this effect, given the direct dependence between the standard
deviations and the slope of the BLS line, it is advisable to consider biased
values of $b_{1H_1}$ lower than 1. This makes the standard deviation of the
slope $s_{b_{1H_1}}$ and, therefore, the spread of the distribution
associated with H1 smaller, which makes the estimate of the β error
probability more accurate.

3.3 Relationship between the probabilities of type I and type II errors and the number of calibration samples

Because of the consequences that committing type I or type II errors may have
in the individual tests on the intercept and the slope of the regression
line, controlling the α and β error probabilities may be of interest
depending on the analytical problem at hand. This is possible thanks to the
relationship between the α and β error probabilities, the bias ∆ to be
detected in the estimated regression coefficient and the number of samples
used to build the regression line. For the OLS method, there is an expression
that relates these variables [4]:

$$n = \left( \frac{(t_{\alpha/2} + t_{\beta})\, s}{\Delta} \right)^{2} \tag{3.3}$$

This expression gives the minimum number of samples needed to detect a bias ∆
in the estimate of the regression coefficient, with given probabilities of
committing α and β errors. The term s corresponds to the experimental error
in standard deviation units. This value is unique since, unlike in the BLS
method, for OLS the experimental error s² does not change with the
theoretical slope values chosen to establish H0 and H1. For this reason, the
distributions associated with H0 and H1 for the OLS method are identical.
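A quick numerical reading of this kind of expression is sketched below. It assumes the form n = ((t_{α/2} + t_β)·s/∆)² and replaces the Student's t values by standard normal quantiles, so it only gives a first approximation that does not depend on n. The function name `min_samples` and the numerical values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def min_samples(s, delta, alpha=0.05, beta=0.10):
    """First approximation to the minimum number of samples.

    s: experimental error in standard deviation units.
    delta: bias to be detected in the regression coefficient.
    Normal quantiles stand in for t_{alpha/2} and t_beta (assumption).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * s / delta) ** 2)


# Halving the bias to be detected roughly quadruples the required samples.
print(min_samples(s=1.0, delta=1.0))
print(min_samples(s=1.0, delta=0.5))
```

Because n grows with (s/∆)², detecting small biases with low α and β probabilities quickly becomes expensive in terms of the number of samples.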

In the BLS regression method, estimating the number of points on the
regression line needed to detect a bias ∆ in the corresponding regression
coefficient, with given probabilities of committing α and β errors, is more
complicated. The number of samples is predicted from the standard deviations
of the intercept or the slope, following expressions 6 and 7 in the
Background and Theory section of Section 3.4. To relate the number of points
needed to build the regression line to the standard deviations of the
intercept and the slope, these expressions (eqs. 8 and 9 in Section 3.4) must
be decomposed into different factors. This decomposition, however, is only
possible if the weighting factors $s_{e_i}^2$ are considered constant, that
is, if the uncertainties generated by the errors made in the experimental
measurement are considered constant (homoscedasticity).

To estimate the number of points needed to build the regression line, an
iterative procedure is used, outlined in the following figure.


Figure 3.3. Scheme of the iterative procedure for estimating the minimum number of samples needed to detect a bias ∆ in the regression coefficient estimated by the BLS method, with given probabilities of committing α and β errors.

When no prior knowledge is available, the best possible estimates of both the
experimental error s² and the sums of the predictor variable (eqs. 12 and 13
in Section 3.4) are obtained from an initial data set. With the initial
estimates of these variables, the number of points on the regression line can
be calculated by an iterative procedure, since the variables $t_{\alpha/2}$
and $t_{\beta}$ in equations 12 and 13 of Section 3.4 depend on the number of
points to be estimated. The next step is to measure new experimental values
until the estimated number of points for building the BLS regression line is
reached. As experimental points are added, the estimates of the experimental
error and of the sums of the predictor variable become more accurate. When
the estimated number of points equals the number already in the data set, the
process is finished. It should be noted that this procedure is very sensitive
to the initial estimates of the experimental error and the sums of the
predictor variable. As described in the Results and Discussion section of
Section 3.4, in the first stages of the procedure, when the number of points
is still low, the estimate may even be negative. This is because the
estimates of the experimental error and the sums of the predictor variable
are usually not very accurate when the number of points in the data set is
small. Although this procedure assumes that the uncertainties associated with
the experimental values are homoscedastic, the estimates of the number of
points needed to build the BLS regression line under the above-mentioned
conditions are satisfactory, even in data sets with moderate
heteroscedasticity (see the Results and Discussion section of Section 3.4).
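The inner iteration of such a procedure, in which the t values depend on the number of points being estimated, might look like the sketch below. It uses the OLS-type expression n = ((t_{α/2} + t_β)·s/∆)² as a stand-in, because equations 12 and 13 of Section 3.4 are not reproduced here. It also keeps s fixed instead of re-estimating it as points are added. The t quantiles come from a two-term Cornish-Fisher expansion rather than from tables, and all names and values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def t_quantile(p, df):
    """Approximate Student's t quantile (two-term Cornish-Fisher expansion)."""
    z = NormalDist().inv_cdf(p)
    return (z + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))


def iterate_sample_size(s, delta, alpha=0.05, beta=0.10, n_start=10, max_iter=50):
    """Iterate n -> n' until the estimated number of points stops changing."""
    n = n_start
    for _ in range(max_iter):
        df = n - 2  # degrees of freedom of the regression line
        t_sum = t_quantile(1 - alpha / 2, df) + t_quantile(1 - beta, df)
        n_new = max(ceil((t_sum * s / delta) ** 2), 3)  # keep df >= 1
        if n_new == n:
            return n
        n = n_new
    return n  # no fixed point within max_iter; last estimate is returned


print(iterate_sample_size(s=1.0, delta=1.0))
```

In practice the estimates of s and of the sums of the predictor variable would be updated each time new points are measured, so this loop would be rerun with the improved estimates.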



3.4 Detecting proportional and constant bias in method

comparison studies by using linear regression with errors in

both axes. (Chemometrics and Intelligent Laboratory Systems,

49 (1999) 179-193)

Àngel Martínez*, F. Javier del Río, Jordi Riu, F. Xavier Rius




ABSTRACT

Constant or proportional bias in method comparison studies using

linear regression can be detected by an individual test on the intercept or

the slope of the line regressed from the results of the two methods to be

compared. Since there are errors in both methods, a regression technique

that takes into account the individual errors in both axes (bivariate least

squares, BLS) should be used. In this paper we demonstrate that the errors

made in estimating the regression coefficients by the BLS method are fewer

than with the OLS or WLS regression techniques and that the coefficients

can be considered normally distributed. We also present expressions for

calculating the probability of committing a β error in individual tests under

BLS conditions and theoretical procedures for estimating the sample size in

order to obtain the desired probabilities of α and β errors made when

testing each of the BLS regression coefficients individually. Simulated data

were used for the validation process. Examples for the application of the

theoretical expressions developed are given using real data sets.


INTRODUCTION

Linear regression is widely used in the validation of analytical

methodologies. In method comparison studies, for example, a set of

samples of different concentration levels are analysed by the two methods

to be compared, and the results are regressed on each other. Ordinary least-

squares (OLS), or weighted least-squares (WLS), which considers

heteroscedasticity in the response variable, are the most widely used

regression techniques. However, these techniques have a limited scope,

since they consider the x-axis to be free of error. OLS and WLS should not

usually be applied, for instance, in method comparison studies, since the

uncertainties associated with the methods to be compared are usually of

the same order of magnitude. An alternative is the errors-in-variables

regression [1], also called CVR approach [2-4], which considers the errors in

both axes. It does not take into account the individual uncertainties of each

experimental point but considers the ratio of the variances of the response

to predictor variables to be constant for every experimental point

(λ=sy2/sx2). A particular case of the CVR approach is the orthogonal

regression (OR) [5], in which the errors are of the same order of magnitude

in the response and predictor variable (i.e. λ=1). Another option is a

bivariate least squares (BLS) regression technique [6,7], which takes into

account individual non-constant errors in both axes to calculate the

regression coefficients.
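To make the role of λ concrete, the sketch below fits a straight line with the standard closed-form estimator for a constant variance ratio (often called Deming regression), taking λ = sy2/sx2 as defined above, so that λ = 1 gives orthogonal regression. This formula comes from the general statistics literature rather than from this paper, it is not the BLS method, and the function name `cvr_line` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def cvr_line(x, y, lam=1.0):
    """Fit y = intercept + slope * x assuming a constant variance ratio.

    lam is the ratio of the error variance in y to the error variance in x;
    lam = 1 corresponds to orthogonal regression.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # must be non-zero
    diff = syy - lam * sxx
    slope = (diff + sqrt(diff**2 + 4 * lam * sxy**2)) / (2 * sxy)
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical results of two methods on the same five samples.
x = [1.0, 2.1, 2.9, 4.2, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
intercept, slope = cvr_line(x, y, lam=1.0)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Unlike BLS, this estimator uses a single λ for the whole data set and therefore cannot reflect uncertainties that differ from one data pair to another.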

Despite the recent development of a joint confidence interval test for

the BLS regression method [8], no statistical test to individually assess the

presence of bias in the regression coefficients which takes into account the

individual uncertainties in every experimental point has yet been

described. For this reason, we present expressions for the application of the

individual tests which take into account individual errors in both axes.

Although the distributions of the BLS slope and intercept have been



reported to be nongaussian [9], in this paper we show that the results of

applying statistical tests based on the assumption of normality of the BLS

regression coefficients do not lead to significant errors and that these errors

are smaller than those obtained with the OLS or WLS regression techniques.

Of the two types of error associated with the statistical tests (α and

β), the β error, i.e. the probability of not detecting an existing

proportional or constant bias, is seldom considered. However, the

theoretical background and the expressions which enable its calculation in

the individual tests which use the OLS method have already been

developed [5]. In this paper we describe the expressions for estimating the

probability of β error when performing an individual test on one of the

regression coefficients to detect a set proportional or constant bias based on

the BLS regression technique. These expressions take into account the

different distributions that may be associated with the reference and with the

selected biased regression coefficient values. These estimates are compared

with the ones from the OLS and the WLS techniques for several real data

sets. Finally, we describe the procedure for estimating the sample size, i.e.

the number of experimental data pairs necessary for detecting the specific

selected bias when performing an individual test with set probabilities of

making α and β errors when the BLS regression method is used. Simulated

data sets have been used to validate the theoretical expressions.


Notation

In general, the true values of the different variables used in this work are represented with Greek characters, while their estimates are denoted with Latin letters. In this way, the true values of the BLS regression coefficients are represented by β0 (intercept) and β1 (slope), while their respective estimates are denoted as a and b. The estimates of the standard deviation of the slope and the intercept for the BLS regression line are symbolised as $s_b$ and $s_a$ respectively. The experimental error, expressed in terms of variance for the n experimental data pairs (xi,yi), is referred to as $\sigma^2$, while its estimate is $s^2$. By analogy, $\hat{y}_i$ represents the estimated value for the predicted yi. The estimated variance-covariance matrix of the regression coefficients related to the BLS regression technique is denoted as B.

In the individual tests, the terms $a_{H_0}$, $a_{H_1}$, $b_{H_0}$ and $b_{H_1}$ represent the values of the theoretical regression coefficients from which the null (H0) and the alternative hypothesis (H1) are assumed. The distance between $a_{H_0}$ and $a_{H_1}$ or between $b_{H_0}$ and $b_{H_1}$, known as bias, is denoted by ∆ and represents the value of the systematic error that the experimenter wants to check. By analogy, the values of the standard deviations of the theoretical regression coefficients defining H0 and H1 are denoted as $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$).


BLS is the generic name given to a set of regression techniques

applied to data which contain errors in both axes. From all the different

existing approaches for calculating the regression coefficients, Lisý’s

method [6] was found to be the most suitable [7]. This technique assumes

the true linear model to be:

$$\eta_i = \beta_0 + \beta_1 \xi_i \tag{1}$$

The true variables ξi and ηi are unobservable and instead, one can only

observe the experimental variables:


$$x_i = \xi_i + \delta_i \quad \text{and} \quad y_i = \eta_i + \gamma_i \tag{2}$$

Variables δi and γi are random errors committed in the measurement of variables xi and yi respectively, where $\delta_i \sim N(0, \sigma_{x_i}^2)$ and $\gamma_i \sim N(0, \sigma_{y_i}^2)$. In this way, the observed variables xi and yi are related as follows:

$$y_i = a + b x_i + \varepsilon_i \tag{3}$$

where εi is the ith residual error. The BLS regression method finds the

estimates of the regression line coefficients by minimising the sum of the

weighted residuals, S, expressed in eq. (4):

$$S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{\varepsilon_i}^2} = (n-2)\, s^2 \tag{4}$$

The weighting factor $s_{\varepsilon_i}^2$ is expressed as the variance of the ith residual $\varepsilon_i$ and takes into consideration the variances of any individual point in both axes ($s_{x_i}^2$ and $s_{y_i}^2$) obtained from the replicate analysis of each sample by both methods. The covariance between the variables for each (xi,yi) data pair, which is normally assumed to be zero, is also taken into account:

$$s_{\varepsilon_i}^2 = \operatorname{var}(\varepsilon_i) = \operatorname{var}(y_i - a - b x_i) = s_{y_i}^2 + b^2 s_{x_i}^2 - 2b \operatorname{cov}(x_i, y_i) \tag{5}$$

For this reason, the BLS regression technique assigns lower weights to those data pairs with larger $s_{x_i}^2$ and $s_{y_i}^2$ values, i.e. the most imprecise data pairs, because the weighting factor appears in the denominator of eq. (4). By minimising the sum of the weighted residuals (eq. (4)), two non-linear equations are obtained, from which the regression coefficients a and b can be estimated by an iterative process [8].
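The iterative estimation can be sketched in Python as follows. This is only an illustration under stated assumptions: it sets the derivatives of S (eq. (4)) with respect to a and b to zero, assumes zero covariance between xi and yi, and solves the resulting pair of equations by a simple fixed-point iteration started from the OLS line. It is not necessarily the algorithm of refs. [6-8]; the function name `bls_fit` and the example data are hypothetical.

```python
def bls_fit(x, y, sx, sy, tol=1e-10, max_iter=500):
    """Minimise S = sum((y_i - a - b*x_i)^2 / w_i) with
    w_i = sy_i^2 + b^2 * sx_i^2 (eq. (5), zero covariance assumed).
    Returns (a, b, s2, w), where s2 = S / (n - 2) as in eq. (4)."""
    n = len(x)
    # Starting guess: ordinary least squares
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    for _ in range(max_iter):
        w = [syi ** 2 + b ** 2 * sxi ** 2 for sxi, syi in zip(sx, sy)]
        e = [yi - a - b * xi for xi, yi in zip(x, y)]
        s0 = sum(1.0 / wi for wi in w)
        s1 = sum(xi / wi for xi, wi in zip(x, w))
        s2x = sum(xi * xi / wi for xi, wi in zip(x, w))
        g0 = sum(yi / wi for yi, wi in zip(y, w))
        # The second term arises because the weights depend on the slope
        g1 = sum(xi * yi / wi + b * sxi ** 2 * ei ** 2 / wi ** 2
                 for xi, yi, sxi, ei, wi in zip(x, y, sx, e, w))
        det = s0 * s2x - s1 ** 2
        a_new = (s2x * g0 - s1 * g1) / det
        b_new = (s0 * g1 - s1 * g0) / det
        done = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if done:
            break
    w = [syi ** 2 + b ** 2 * sxi ** 2 for sxi, syi in zip(sx, sy)]
    s2 = sum((yi - a - b * xi) ** 2 / wi
             for xi, yi, wi in zip(x, y, w)) / (n - 2)
    return a, b, s2, w

# Hypothetical method-comparison data with uncertainties in both axes
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.3, 2.8, 4.2, 4.9, 6.3]
sx = [0.10, 0.10, 0.15, 0.20, 0.20, 0.30]
sy = [0.10, 0.15, 0.15, 0.20, 0.30, 0.30]
a, b, s2, w = bls_fit(x, y, sx, sy)
print(f"a = {a:.4f}, b = {b:.4f}, s2 = {s2:.4f}")
```

When all sx values are zero and sy is constant, the weights no longer depend on the slope and the sketch returns the OLS line, which is a convenient sanity check.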



Characterisation of the distribution of the BLS regression coefficients

The distribution functions of the regression coefficients a and b

found by the BLS regression technique have been reported to be

nongaussian [9]. This influences the individual tests on the regression

coefficients, since they are usually performed under the assumption of

normality. To determine the degree of non-normality of the distributions of

the BLS coefficients, three different statistical tests were used: Cetama [10]

(which also allows the actual probability function to be characterised), the

Kolmogorov test [11] and the normal probability plot (or Rankit test) [12].

These tests were applied to different types of real data sets to find a

relationship between their structure and the degree of non-normality.

Furthermore, to characterise their distribution, the real distributions and

some theoretical distributions were compared. These comparisons were

carried out with the quantile-quantile graphic method (Q-Q plot) [12].
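As an illustration of the Q-Q comparison against a normal reference, the sketch below pairs the ordered values of a sample (for example, simulated slopes) with the corresponding quantiles of a normal distribution that has the same mean and standard deviation. The function name `normal_qq_points`, the plotting positions (i - 0.5)/n and the sample values are assumptions made for this example, not details given in ref. [12].

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a
    normal Q-Q plot; points near the line y = x suggest normality."""
    data = sorted(sample)
    n = len(data)
    reference = NormalDist(mean(data), stdev(data))
    # Plotting positions (i - 0.5) / n are one common convention
    return [(reference.inv_cdf((i - 0.5) / n), observed)
            for i, observed in enumerate(data, start=1)]

# Hypothetical slopes obtained from simulated data sets
slopes = [0.97, 1.02, 1.00, 0.99, 1.05, 1.01, 0.98, 1.03]
for theoretical, observed in normal_qq_points(slopes):
    print(f"{theoretical:.4f}  {observed:.4f}")
```

The same pairing works for any other reference distribution once its quantile function is available.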

β error in the individual tests for the BLS regression coefficients

According to the theory of hypothesis testing, when an individual test is applied on a regression coefficient, the null hypothesis H0 is usually defined as the one that considers the estimated regression coefficient to belong to the distribution of a hypothetical regression coefficient ($a_{H_0}$ or $b_{H_0}$) equal to the reference value, or in other words, that there are no proportional or constant systematic errors in the method being tested. On the other hand, the alternative hypothesis H1 considers that the estimated regression coefficient belongs to the distribution of a hypothetical regression coefficient ($a_{H_1}$ or $b_{H_1}$) with a given value. This value, which has to be set by the experimenter according to the systematic error one wants to detect in the analytical method being tested, defines the distance between $a_{H_0}$ (or $b_{H_0}$) and $a_{H_1}$ (or $b_{H_1}$), or in other words the so-called bias [13]. The standard deviations $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$) can be calculated for a given data set with the values of $a_{H_0}$ (or $b_{H_0}$) and $a_{H_1}$ (or $b_{H_1}$).

The expressions developed for estimating the probability of

committing a β error in the application of an individual test to one of the

regression coefficients calculated by using the OLS regression technique are

established [5]. Analogous expressions can be adapted for the BLS

technique by considering the appropriate standard deviation values:

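$$\Delta = t_{\alpha/2} \cdot s_{b_{H_0}} + t_{\beta} \cdot s_{b_{H_1}} \qquad t_{\beta} = \frac{\Delta - t_{\alpha/2} \cdot s_{b_{H_0}}}{s_{b_{H_1}}} \tag{6}$$

$$\Delta = t_{\alpha/2} \cdot s_{a_{H_0}} + t_{\beta} \cdot s_{a_{H_1}} \qquad t_{\beta} = \frac{\Delta - t_{\alpha/2} \cdot s_{a_{H_0}}}{s_{a_{H_1}}} \tag{7}$$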
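A minimal Python sketch of how eqs. (6-7) can be turned into a β estimate is given below. It assumes that β is read as the one-sided upper-tail probability of $t_\beta$ in a Student's t distribution with n-2 degrees of freedom, and it evaluates that distribution numerically with the standard library only. The helper names (`t_cdf`, `t_quantile`, `beta_error`) are hypothetical, and the example takes $a_{H_0}$ = 0, which is an assumption of this illustration.

```python
import math

def t_cdf(t, df):
    """Student's t CDF, by Simpson integration of the density on [0, |t|]."""
    c = (math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
         / math.sqrt(df * math.pi))

    def pdf(u):
        return c * (1.0 + u * u / df) ** (-(df + 1) / 2)

    panels = 2 * max(100, int(abs(t) * 100))  # even number of panels
    h = abs(t) / panels
    total = pdf(0.0) + pdf(abs(t))
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3.0
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df):
    """Inverse of t_cdf for 0 < p < 1, by bracketing and bisection."""
    if p < 0.5:
        return -t_quantile(1.0 - p, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def beta_error(delta, s_h0, s_h1, n, alpha):
    """Probability of beta error from eq. (6) or (7) for a bias delta."""
    df = n - 2
    t_alpha = t_quantile(1.0 - alpha / 2.0, df)
    t_beta = (abs(delta) - t_alpha * s_h0) / s_h1
    return 1.0 - t_cdf(t_beta, df)

# Intercept test, Table 3 row n=30, homoscedastic, alpha=5%:
# a_H1 = 1 and s_aH0 = s_aH1 = 0.262, with a_H0 = 0 assumed here
beta = beta_error(delta=1.0, s_h0=0.262, s_h1=0.262, n=30, alpha=0.05)
print(f"predicted beta = {100 * beta:.2f} %")
```

Under these assumptions the example should land close to the value tabulated as βpred for that row (4.43%); small differences are expected from the rounding of the tabulated standard deviation.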

The probability of committing a β error under the assumption of normality is finally obtained from the Student's t distribution with n-2 degrees of freedom, as the probability associated with the $t_{\beta}$ value calculated for a fixed level of significance α. The standard deviations $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$) can be estimated in a similar way to the standard deviations of the intercept and the slope, and are easily obtained from the B variance-covariance matrix [8] calculated while estimating the regression coefficients with the BLS technique:

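$$s_a = \sqrt{\frac{\displaystyle\sum_{i=1}^{n} \frac{x_i^2}{s_{\varepsilon_i}^2}}{\displaystyle\sum_{i=1}^{n} \frac{1}{s_{\varepsilon_i}^2} \sum_{i=1}^{n} \frac{x_i^2}{s_{\varepsilon_i}^2} - \left(\sum_{i=1}^{n} \frac{x_i}{s_{\varepsilon_i}^2}\right)^2}} \times s \tag{8}$$

$$s_b = \sqrt{\frac{\displaystyle\sum_{i=1}^{n} \frac{1}{s_{\varepsilon_i}^2}}{\displaystyle\sum_{i=1}^{n} \frac{1}{s_{\varepsilon_i}^2} \sum_{i=1}^{n} \frac{x_i^2}{s_{\varepsilon_i}^2} - \left(\sum_{i=1}^{n} \frac{x_i}{s_{\varepsilon_i}^2}\right)^2}} \times s \tag{9}$$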

To calculate the values of $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$) it is only necessary to recalculate the value of the weighting factor (eq. (5)) according to the new slope value. Since the weighting factor depends on the slope but not on the intercept, the values of $s_{a_{H_0}}$ and $s_{a_{H_1}}$ will be equal to the standard deviation obtained for the estimated regression coefficient ($s_a = s_{a_{H_0}} = s_{a_{H_1}}$), which is not true for the slope. The experimental error $s^2$ remains unchanged.
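The recalculation described above can be sketched in Python as follows: the weighting factors of eq. (5) are rebuilt for a chosen slope value and inserted into eqs. (8-9). Zero covariance between xi and yi is assumed, and the function name `bls_coefficient_sd`, the data and the hypothesised slopes are illustrative assumptions only.

```python
import math

def bls_coefficient_sd(x, sx, sy, slope, s2):
    """Standard deviations of intercept and slope from eqs. (8)-(9),
    using weights w_i = sy_i^2 + slope^2 * sx_i^2 (eq. (5))."""
    w = [syi ** 2 + slope ** 2 * sxi ** 2 for sxi, syi in zip(sx, sy)]
    sum_w = sum(1.0 / wi for wi in w)
    sum_xw = sum(xi / wi for xi, wi in zip(x, w))
    sum_x2w = sum(xi * xi / wi for xi, wi in zip(x, w))
    det = sum_w * sum_x2w - sum_xw ** 2
    s_a = math.sqrt(sum_x2w / det * s2)
    s_b = math.sqrt(sum_w / det * s2)
    return s_a, s_b

# Hypothetical data: estimated slope b = 1.0, experimental error s2 = 1.1
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sx = [0.10, 0.10, 0.15, 0.20, 0.20, 0.30]
sy = [0.10, 0.15, 0.15, 0.20, 0.30, 0.30]
s2 = 1.1

# Slope test: the weights are rebuilt with b_H0 = 1.0 and with b_H1 = 1.2
_, s_b_h0 = bls_coefficient_sd(x, sx, sy, slope=1.0, s2=s2)
_, s_b_h1 = bls_coefficient_sd(x, sx, sy, slope=1.2, s2=s2)
# Intercept test: the slope (and hence the weights) stays at its estimate
s_a, _ = bls_coefficient_sd(x, sx, sy, slope=1.0, s2=s2)
print(f"s_bH0 = {s_b_h0:.4f}, s_bH1 = {s_b_h1:.4f}, s_a = {s_a:.4f}")
```

Because a larger hypothesised slope inflates every weighting factor whenever the x uncertainties are non-zero, the two slope standard deviations differ, whereas the intercept value is shared by H0 and H1.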

Estimating the sample size

Relating eqs. (8-9) with the number of data pairs n it is possible to estimate the number of data pairs required to detect a certain bias with set probabilities of committing α and β errors. This can only be achieved if the individual uncertainties, and hence the weighting factors, are considered constant for all the data pairs ($s_{\varepsilon,a_{H_0}}^2$, $s_{\varepsilon,b_{H_0}}^2$ or $s_{\varepsilon,b_{H_1}}^2$ = constant):

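$$s_{a_{H_0}} = \sqrt{\frac{s_{\varepsilon,a_{H_0}}^2 \cdot \displaystyle\sum_{i=1}^{n} x_i^2}{n_a \cdot \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}} \cdot s \tag{10}$$

$$s_{b_{H_0}} = \sqrt{\frac{n_b \cdot s_{\varepsilon,b_{H_0}}^2}{n_b \cdot \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}} \cdot s \quad \text{or} \quad s_{b_{H_1}} = \sqrt{\frac{n_b \cdot s_{\varepsilon,b_{H_1}}^2}{n_b \cdot \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}} \cdot s \tag{11}$$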

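Introducing these two expressions in eqs. (6-7) respectively it is possible to isolate n in terms of the desired variables α, β and ∆:

$$n_a = \frac{(t_{\alpha/2} + t_{\beta})^2 \cdot s_{\varepsilon,a_{H_0}}^2 \cdot s^2}{\Delta^2} + \frac{\left(\displaystyle\sum_{i=1}^{n} x_i\right)^2}{\displaystyle\sum_{i=1}^{n} x_i^2} \tag{12}$$

$$n_b = \frac{\Delta^2 \cdot \left(\displaystyle\sum_{i=1}^{n} x_i\right)^2}{\Delta^2 \cdot \displaystyle\sum_{i=1}^{n} x_i^2 - \left(t_{\alpha/2} \cdot s_{\varepsilon,b_{H_0}} + t_{\beta} \cdot s_{\varepsilon,b_{H_1}}\right)^2 \cdot s^2} \tag{13}$$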

Initial estimates of the terms $s_{\varepsilon,a_{H_0}}^2$ or $s_{\varepsilon,b_{H_0}}^2$ and $s_{\varepsilon,b_{H_1}}^2$, $s^2$ and both sums involving x data coordinates can be set from an initial data set containing few data pairs. After an iterative calculation (due to the dependence of the $t_{\alpha/2}$ and $t_{\beta}$ values on the number of data pairs) an estimate of $n_a$ or $n_b$ is obtained. It is then important to recalculate the sample size adding more data to the initial data set, as the estimates of the terms mentioned in eqs. (12-13) are likely to change. In this way a new estimate of $n_a$ or $n_b$ is obtained. The estimation process ends when the differences between two consecutive $n_a$ or $n_b$ values are below a set threshold value.
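The inner iterative calculation for the intercept can be sketched in Python as below. It applies eq. (12) repeatedly, refreshing the t values (taken here as the 1-α/2 and 1-β quantiles with n-2 degrees of freedom) until two consecutive estimates agree within a tolerance. The sums over x are taken from the initial data set, as the text suggests. The helper names, the tolerance and the example numbers are assumptions of this illustration, and the t quantile is computed numerically with the standard library rather than read from tables.

```python
import math

def t_cdf(t, df):
    """Student's t CDF, by Simpson integration of the density on [0, |t|]."""
    c = (math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
         / math.sqrt(df * math.pi))
    panels = 2 * max(100, int(abs(t) * 100))  # even number of panels
    h = abs(t) / panels
    total = 0.0
    for k in range(panels + 1):
        weight = 1 if k in (0, panels) else (4 if k % 2 else 2)
        total += weight * c * (1.0 + (k * h) ** 2 / df) ** (-(df + 1) / 2)
    area = total * h / 3.0
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df):
    """Inverse of t_cdf for 0 < p < 1, by bracketing and bisection."""
    if p < 0.5:
        return -t_quantile(1.0 - p, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if t_cdf(mid, df) < p else (lo, mid)
    return 0.5 * (lo + hi)

def sample_size_intercept(x, s_eps2, s2, delta, alpha, beta,
                          tol=0.5, max_iter=50):
    """Iterate eq. (12) until consecutive n_a estimates differ by < tol."""
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)
    n = float(len(x))
    for _ in range(max_iter):
        df = max(1, int(round(n)) - 2)
        t_a = t_quantile(1.0 - alpha / 2.0, df)
        t_b = t_quantile(1.0 - beta, df)
        n_new = ((t_a + t_b) ** 2 * s_eps2 * s2 / delta ** 2
                 + sum_x ** 2 / sum_x2)
        if abs(n_new - n) < tol:
            return n_new
        n = n_new
    return n

# Hypothetical initial data set of five points
x0 = [1.0, 3.0, 5.0, 7.0, 9.0]
n_a = sample_size_intercept(x0, s_eps2=0.3, s2=1.0, delta=1.0,
                            alpha=0.05, beta=0.05)
print(f"estimated n_a = {n_a:.1f} (round up to {math.ceil(n_a)})")
```

In practice the outer loop described in the text would follow: more data pairs are measured, the variance terms and sums are re-estimated, and the calculation is repeated until the estimate stabilises.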



Validation

The objective of the validation process is twofold. Firstly, to show

that, despite the non-normal distribution of the BLS regression line

coefficients, the confidence interval computed using the t-distribution can

generally be accepted without committing relevant errors. Secondly, to

assess whether the theoretical estimates of both the β error and the number

of data pairs required to perform the individual tests, based on BLS under

defined statistical conditions, provide correct results.

To show the degree of non-normality of the intercept and the slope

distributions under real regression conditions, six real data sets with errors

in both axes were studied. The Monte Carlo method [14] was applied to

generate 200,000 data sets from each of the six initial ones (Figure 1).

[Figure 1: initial data set → Monte Carlo → n straight lines (1, 2, 3, ..., n) → n pairs of coefficients (a1, b1), (a2, b2), (a3, b3), ..., (an, bn) → tests of normality on a and b.]

Figure 1. Scheme of the procedure followed to check the normality of the BLS regression coefficients using the Monte Carlo simulation method and the three selected tests for checking normality.

This method adds a random error to every data pair based on the

individual uncertainties in both axes. In this way, 200,000 simulated data



sets were randomly generated. This gave rise to 200,000 regression lines, to

which the three selected tests for assessing the normality of the

distributions were applied. The error made in estimating the BLS

regression coefficients when their respective distributions were assumed to

be normal (when in fact they are not) was quantified and compared with

the error made in estimating the regression coefficients by OLS and WLS

techniques. Figure 2 illustrates the comparison procedure. Once the

distribution of the regression coefficients corresponding to the real data set

is obtained by the Cetama method, we can determine its left (xlr) and right

(xrr) limits for a chosen level of significance α. The shaded areas in Figure 2

represent the errors made by estimating the regression coefficients with

each of the three regression techniques studied.
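The Monte Carlo perturbation step described above can be sketched in Python as follows. Each simulated data set is obtained by adding to every (xi,yi) pair a normally distributed random error whose standard deviation is the individual uncertainty of that point in each axis. The function name `monte_carlo_data_sets`, the seed handling and the example data are assumptions of this illustration; ref. [14] may implement the simulation differently.

```python
import random

def monte_carlo_data_sets(x, y, sx, sy, n_sets, seed=None):
    """Yield n_sets simulated (x, y) data sets by adding N(0, sx_i) and
    N(0, sy_i) random errors to every point of the initial data set."""
    rng = random.Random(seed)
    for _ in range(n_sets):
        x_sim = [rng.gauss(xi, sxi) for xi, sxi in zip(x, sx)]
        y_sim = [rng.gauss(yi, syi) for yi, syi in zip(y, sy)]
        yield x_sim, y_sim

# Hypothetical initial data set with individual uncertainties
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.0, 3.2, 3.9, 5.1]
sx = [0.10, 0.10, 0.20, 0.20, 0.30]
sy = [0.10, 0.20, 0.20, 0.30, 0.30]

for x_sim, y_sim in monte_carlo_data_sets(x, y, sx, sy, n_sets=3, seed=1):
    print([round(v, 3) for v in x_sim], [round(v, 3) for v in y_sim])
```

Each simulated data set would then be fitted by the regression technique under study, and the resulting collection of intercepts and slopes passed to the normality tests.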

[Figure 2: real distribution with limits xlr and xrr, compared with the intervals given by BLS (xlbls, xrbls), WLS (xlwls, xrwls) and OLS (xlols, xrols).]

Figure 2. Error made in estimating the BLS regression coefficients assuming normal distributions. Comparison with errors made using OLS and WLS regression techniques.



To validate the expressions for the estimation of the probability of β

error, 24 initial simulated data sets were used, with all the data pairs

perfectly fitted to a straight line with either biased slope or intercept values.

From each of these initial data sets, 100,000 simulated new ones were

randomly generated by adding a random error to every individual data

pair (xi,yi) in the initial data set with the Monte Carlo method. An

individual test was then applied on one of the regression coefficients for

every one of these 100,000 data sets to check whether H0 could be accepted

in each case for a fixed level of significance α. Every time H0 was

accepted, a β error was being committed: the data set had been

generated from an initial biased one, but the random errors added

by the Monte Carlo method prevented the bias from being detected.

The value of the bias was chosen to provide a probability of β error similar

to the level of significance α in each of the four cases. In this way, if the

estimate of the probability of β error from the theoretical expressions was

similar to the one from the simulation process, we may conclude that the

stated expressions provide correct results.

Once the estimates of the probability of β error were proved to be correct, the expressions to estimate the sample size were validated. The probabilities of β error estimated for the different levels of significance α, the calculated standard deviations and the experimental error from the iterative process (terms $t_{\beta}$, $t_{\alpha/2}$, $s_{\varepsilon,a_{H_0}}^2$, $s_{\varepsilon,b_{H_0}}$ or $s_{\varepsilon,b_{H_1}}$ and $s^2$ respectively) for each of the initial data sets in the validation process were introduced in expressions 12 and 13. If the estimated sample size required to achieve the chosen probabilities of α and β error was similar to the number of data pairs in each data set, results were considered correct. To show the applicability of the procedure, a real data set was used as a case study.





Six real data sets with different characteristics (such as number of

data pairs, heteroscedasticity or position within the experimental domain)

were used to check the distribution of the BLS regression coefficients.

Twenty-four different simulated data sets were considered to validate the

expressions for the estimates of the probability of β error (eqs. (6-7)).

Finally, one of the six former real data sets was used to show the different

estimates of the probability of β error between BLS, OLS and WLS

regression techniques and provide an example of the sample size

estimation procedure using data with errors in both axes.

Data Set 1 [15]. Data set obtained from the study of the supercritical

fluid extraction (SFE) recoveries of polycyclic aromatic hydrocarbons

(PAHs) from railroad bed soil using two different modifiers: CO2 (on the x-axis)

and a mixture of CO2 with 10% of toluene (on the y-axis). The data set

is composed of seven data pairs. The standard deviations ($s_{x_i}$ and $s_{y_i}$)

were the result of a triplicate supercritical fluid extraction at each level of

concentration. The units are expressed in terms of µg/g of soil. The data set

and the regression lines obtained by the OLS, WLS and BLS regression

techniques are shown in Figure 3a.


[Figure 3: six panels, each plotting one real data set with its regression lines. Axes (x vs. y): (a) CO2 vs. CO2 / 10% toluene; (b) 1 amalgamation vs. 2 amalgamations; (c) AAS vs. SIA; (d) AAS / selective reduction vs. AES / cold trapping; (e) ∆kPa vs. ∆mV; (f) solvent vs. solvent / soil.]

Figure 3. OLS (dashed line), WLS (dotted line) and BLS (solid line) regression lines obtained for the six real data sets.

Data Set 2 [16]. Comparative study of mercury determination using

gas chromatography coupled to a cold vapour atomic fluorescence

spectrometer following derivatization with sodium tetraethylborate. One

(x-axis) and two (y-axis) amalgamation steps were used to obtain five data

pairs with their respective uncertainties ($s_{x_i}$ and $s_{y_i}$) generated from six

replicates performed at each point. Units are expressed in terms of pg of

recovered mercury. The data set and the regression lines generated by the

three regression techniques are shown in Figure 3b.

Data Set 3 [17]. Twenty-seven data pairs obtained from a method

comparison study which analysed Ca(II) in water by atomic absorption

spectroscopy (AAS), taken as the reference method (x-axis), and sequential

injection analysis (SIA), taken as the tested method (y-axis). The data set

and the regression lines generated by OLS, WLS and BLS regression



techniques are shown in Figure 3c. Units are expressed in mg/l. The

uncertainties associated with the AAS method were derived from the

analytical procedure, including the linear calibration step [18]. The

uncertainties of the SIA results were calculated with a multivariate

regression model and the PLS technique using the Unscrambler program

(Unscrambler-Ext, ver. 4.0, Camo A/S, Trondheim, Norway).

Data Set 4 [19]. Comparative study for determining arsenic in

natural waters from two techniques: continuous selective reduction and

atomic absorption spectrometry (AAS) as the reference method (x-axis) and

non-selective reduction, cold trapping and atomic emission spectrometry

(AES) as the tested method (y-axis). Thirty experimental data pairs were

obtained with three replicates per data pair. The units are expressed in

terms of µg/l. The data set and the regression lines obtained using all three

regression techniques are shown in Figure 3d.

Data Set 5 [20]. Data set obtained by measuring the CO2 Joule-Thomson

coefficient. The data were acquired from thermocouple-measured

voltage differences (∆mV, on the y-axis) as a function of pressure

increments (∆kPa, on the x-axis). Eleven equally-distributed data pairs

were obtained with estimated unity x-axis uncertainties. The y-axis

uncertainties were estimated to be between one and two units. The data set

and the three regression lines found by using the stated regression

techniques are shown in Figure 3e.

Data Set 6 [21]. Comparative study of the average recoveries for

organochlorine pesticides present in solvent (on the x-axis) or in

solvent/soil suspension (on the y-axis) after microwave-assisted extraction

(MAE) analysis. Twenty-one data pairs were used in the analysis. The

uncertainties were obtained from triplicate MAE analysis at each point. The



data set and the straight lines regressed by the three regression techniques

are shown in Figure 3f.

To validate the estimates of the probability of β error, twenty-four

different initial data sets showing different values of bias in the intercept or

in the slope were built to cover several analytical situations; different linear

ranges, number of data pairs and uncertainty patterns.

Linear Ranges: Two linear ranges were considered during validation,

a short one for values from 0 to 10 units, and a large one for values from 0

to 100 units.

Number of data pairs: Data sets containing five, fifteen, thirty and a

hundred data pairs were selected. In all cases the data pairs were randomly

distributed throughout the two different linear ranges.

Uncertainties: Homoscedastic and heteroscedastic data sets were considered. The homoscedastic data sets comprised data pairs with constant standard deviations on both x and y values. In the short linear range the standard deviations were 0.5 units, whereas in the large linear range they were 1 unit. The heteroscedastic data sets were divided into two further types: those with increasing standard deviations and those with random standard deviations. In both cases, however, the standard deviation values were never higher than 10% of each individual xi and yi value.

For every one of the twenty-four different simulated data sets, four levels of significance α were considered: 10, 5, 1 and 0.1%. Depending on the regression coefficient being tested and on the level of significance, the slope ($b_{H_1}$) or the intercept value ($a_{H_1}$) of the selected bias changed in such a way that the probabilities of β error from the iterative process were similar to the specified α values. In this way the accuracy of estimates of different magnitudes from eqs. (6-7) was also tested.

All the computational work was performed with home-made




Distribution of the regression coefficients

The results of studying the distributions of the slope (b) and the

intercept (a) using the three tests to check normality are summarised in

Table 1. The variation in the number of iterations needed to achieve non-

normality can be used to identify the degree of normality. The more

iterations needed to achieve non-normality (if finally achieved) the more

normal the distribution is.

                      Cetama         Kolmogorov                       Rankit plot
                                     α=1%       α=5%       α=10%
Data set  Iterations  a      b       a    b     a    b     a    b     a    b
1         10,000      NSNL   NSLRL   N    NN    NN   NN    NN   NN    NN   NN
          30,000      NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
          50,000      NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
          100,000     NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
          200,000     NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
2         10,000      N      NSNL    N    N     N    N     N    N     NN   NN
          30,000      N      NSLRL   N    N     N    N     N    N     N    N
          50,000      N      NSNL    N    N     N    N     N    N     N    N
          100,000     NSNL   NSNL    N    N     N    N     N    N     N    N
          200,000     NSLRL  NSLL    N    N     N    N     N    N     N    N
3         10,000      NSNL   NSLRL   N    N     N    N     N    N     NN   NN
          30,000      NSNL   NSLRL   N    N     N    N     N    N     NN   N
          50,000      NSNL   NSNL    NN   N     NN   N     NN   N     NN   N
          100,000     NSLRL  NSLRL   NN   N     NN   N     NN   N     NN   N
          200,000     NSNL   NSNL    NN   N     NN   N     NN   N     NN   N
4         10,000      N      NSNL    N    N     N    N     N    N     N    N
          30,000      N      NSNL    N    NN    N    NN    N    NN    N    NN
          50,000      N      NSNL    N    NN    N    NN    N    NN    N    NN
          100,000     N      NSNL    N    NN    N    NN    N    NN    N    NN
          200,000     N      NSNL    N    NN    N    NN    N    NN    N    NN
5         10,000      N      N       N    N     N    N     N    N     N    N
          30,000      N      N       N    N     N    N     N    N     N    N
          50,000      N      N       N    N     N    N     N    N     N    N
          100,000     N      N       N    N     N    N     N    N     N    N
          200,000     N      N       N    N     N    N     N    N     N    N
6         10,000      NSNL   NSNL    N    NN    N    NN    N    NN    NN   NN
          30,000      NSNL   NSNL    N    NN    NN   NN    NN   NN    NN   NN
          50,000      NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
          100,000     NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN
          200,000     NSNL   NSNL    NN   NN    NN   NN    NN   NN    NN   NN

N: Normal distribution. NN: Non-normal distribution. NSNL: Non-symmetric and non-limited. NSLRL: Non-symmetric and left and right limited. NSLL: Non-symmetric and left limited.

Table 1. Normality study results for the BLS regression coefficients.

Data set 1 presents non-normal distributions mainly due to the high

lack of fit of the data pairs to the regression line. Data sets 2 and 5 present

the best goodness of fit of all the sets, which helps the distribution of the

regression coefficients to be normal. In data set 3, the data structure and the

errors in both axes make the regression line mainly change the intercept

value, which leaves the slope almost unmodified. In this way the intercept

value shows a major uncertainty which leads to a non-normal distribution,

whereas a much lower uncertainty is associated to the slope value. In data

set 4, the slope of the regression line does not follow a normal distribution

since the remarkable heteroscedasticity along the experimental range

causes the regression line to move along a conical-shaped region when



considering errors in both axes. This varies the slope and leaves the

intercept almost unmodified. Finally, data set 5 has normal distributions

and data set 6 presents non-normal ones due to the irregular disposition of

the points in the space and the high heteroscedasticity. The more similar

the error pattern to OLS conditions (i.e. larger errors in the y axis than in

the x axis, homoscedasticity) and the better the goodness of fit, the more

normal the distribution is. It has to be pointed out that the Cetama method

was the most sensitive in detecting deviations from normality.

Table 2 shows the quantification of the error made in estimating the

BLS regression coefficients when normality in their distributions is

assumed, and the comparison with the analogous results from OLS and

WLS regression techniques. The error is calculated according to the shaded

areas in Figure 2 (where the error is considered to be the part that belongs

to the OLS, WLS or BLS distribution for a fixed α level and which does not

belong to the real distribution, and the part that does not belong to the

OLS, WLS or BLS distribution for the same α level and belongs to the real

one). This table shows that the error made from assuming normality for the BLS regression technique is low, and lower than the ones obtained for the OLS and WLS regression methods in all cases except the intercept of data set 6, where the WLS error (2.31%) is slightly lower than the BLS one (2.48%). The data sets that present BLS regression coefficients as normally distributed have errors equal to zero. We can also see that the error committed when using the WLS method is usually lower than when using OLS.


                        % Error
Data set  Coefficient   BLS     WLS     OLS
1         a             4.69    26.84   58.29
          b             4.46    14.59   16.43
2         a             0       9.81    44.35
          b             0       5.51    3.66
3         a             0.53    1.37    11.42
          b             0.58    6.20    11.03
4         a             0       5.11    88.50
          b             2.79    14.97   25.28
5         a             0       0.26    0.62
          b             0       0.25    3.28
6         a             2.48    2.31    6.60
          b             2.48    3.75    6.45

Table 2. Differences between the theoretical and estimated regression coefficients by the three regression techniques (normal distributions assumed).

Once the BLS regression coefficients have been found, in most cases,

to be non-normally distributed, their distributions were compared with

some theoretical ones (beta, binomial, chi-squared, exponential, F, gamma,

geometric, hypergeometric, normal, Poisson, t-Student, uniform, uniform

discrete and Weibull distributions) using the quantile-quantile plot graphic

method (Q-Q plot) [12]. As the results provided by the Cetama method

(Table 1) indicate that the regression coefficients that do not follow a

normal distribution are mainly non-symmetric and non-limited, it seems

reasonable to suppose that the regression coefficient distributions follow

some kind of constant pattern. However, the results given by the Q-Q plot

indicate that the theoretical distributions that are most similar to the real



ones are the chi-squared, normal and t-Student since their differences are

very difficult to appreciate.

β error and sample size validation

Tables 3 and 4 summarise the results from 100,000 iterations using the Monte Carlo method for the four levels of significance in the twenty-four simulated data sets. Columns $a_{H_1}$ and $b_{H_1}$ show the regression coefficient values which define the chosen bias (distance between H0 and H1). The values in the βexp column are those from the simulation process, whereas the values shown in the βpred column are the ones obtained with the theoretical expressions to be validated (eqs. (6-7)). Finally, the values in the column npred are the estimated sample sizes of the different simulated data sets for the different levels of significance.

n    Uncertainty  α (%)  a_H1     s_aH0     βexp.   βpred.  npred
5    homo.        10     2.4      0.641     9.97    12.91   5
                  5      3.2                5.02    8.39    5
                  1      5.2                2.22    5.38    5
                  0.1    10.5               0.13    2.03    5
     hetero.      10     0.7      0.189     10.11   13.67   5
                  5      0.95               4.32    8.26    5
                  1      1.5                2.75    6.53    5
                  0.1    3                  0.74    3.14    5
     heter. rnd.  10     1        0.261     8.36    11.77   5
                  5      1.3                4.80    8.48    5
                  1      2.1                2.23    5.71    5
                  0.1    4.3                0.11    2.59    5
15   homo.        10     1        0.341     13.24   13.34   15
                  5      1.3                5.73    6.14    15
                  1      1.9                0.93    1.19    15
                  0.1    2.6                0.10    0.24    15
     hetero.      10     5e-2     1.69e-2   12.02   12.99   15
                  5      6.5e-2             4.98    4.9     15
                  1      9.5e-2             0.57    1.11    15
                  0.1    0.125              0.10    0.28    15
     heter. rnd.  10     2.5e-2   8.79e-3   13.95   15.12   15
                  5      3.4e-2             4.39    5.56    15
                  1      4.5e-2             1.81    2.75    15
                  0.1    6.4e-2             0.13    0.45    15
30   homo.        10     0.75     0.262     12.93   12.82   30
                  5      1                  4.36    4.43    30
                  1      1.3                1.74    1.84    30
                  0.1    1.8                0.12    0.17    30
     hetero.      10     5.5e-3   1.92e-3   12.19   12.62   30
                  5      7e-3               5.53    5.99    30
                  1      9.5e-3             1.43    1.84    30
                  0.1    1.2e-2             0.54    0.76    30
     heter. rnd.  10     1.9e-2   6.48e-3   11.07   11.46   30
                  5      2.4e-2             4.97    5.47    30
                  1      3.2e-2             1.50    1.92    30
                  0.1    4.3e-2             0.16    0.31    30
100  homo.        10     0.4      0.142     12.78   12.68   100
                  5      0.5                6.61    6.51    100
                  1      0.68               1.77    1.70    100
                  0.1    0.88               0.35    0.32    100
     hetero.      10     1.5e-5   5.37e-6   12.89   12.98   100
                  5      1.9e-5             6.02    6.16    100
                  1      2.6e-5             1.41    1.45    100
                  0.1    3.4e-5             0.19    0.20    100
     heter. rnd.  10     1.9e-4   6.41e-5   9.49    9.76    100
                  5      2.4e-4             3.86    4.07    100
                  1      3e-4               1.91    2.13    100
                  0.1    4.2e-4             0.07    0.10    100

Table 3. Estimated and experimentally obtained probabilities of β error for individual tests on the intercept. Predicted sample size to achieve the α and β probabilities of error for each data set.


n    Uncertainty  α (%)  b_H1    s_bH0     s_bH1      βexp.   βpred.  npred.
5    homo.        10     1.45    0.118     0.147      10.39   16.44   5
                  5      1.6               0.157      5.87    12.60   5
                  1      2                 0.187      3.09    9.87    5
                  0.1    3.1               0.272      0.62    6.37    5
     hetero.      10     1.27    7.48e-2   8.55e-2    12.67   17.64   5
                  5      1.36              9.02e-2    4.56    10.70   5
                  1      1.65              0.102      1.11    6.42    5
                  0.1    2.3               0.132      0.22    4.36    5
     heter. rnd.  10     1.27    7.59e-2   9.07e-2    14.41   19.44   5
                  5      1.4               9.80e-2    3.76    10.24   5
                  1      1.67              0.113      1.19    6.99    5
                  0.1    2.35              0.153      0.26    4.78    5
15   homo.        10     0.8     6.92e-2   6.26e-2    10.84   11.91   15
                  5      0.75              6.11e-2    5.11    6.21    15
                  1      0.68              5.86e-2    3.75    2.14    15
                  0.1    0.55              5.58e-2    0.48    0.71    15
     hetero.      10     0.93    2.49e-2   2.41e-2    14.59   15.2    15
                  5      0.91              2.39e-2    6.98    7.73    15
                  1      0.87              2.34e-2    1.14    1.78    15
                  0.1    0.83              2.29e-2    0.35    0.72    15
     heter. rnd.  10     0.965   1.19e-2   1.16e-2    11.77   12.72   15
                  5      0.955             1.153e-2   5.07    5.98    15
                  1      0.94              1.19e-2    1.98    2.74    15
                  0.1    0.915             1.12e-2    0.15    0.42    15
30   homo.        10     1.12    4.27e-2   4.53e-2    14.92   15.22   30
                  5      1.16              4.62e-2    5.44    6.38    30
                  1      1.23              4.78e-2    0.99    1.32    30
                  0.1    1.32              4.99e-2    0.10    0.14    30
     hetero.      10     1.02    7.18e-3   7.25e-3    14.22   14.61   30
                  5      1.026             7.27e-3    5.92    6.59    30
                  1      1.036             7.31e-3    1.29    1.77    30
                  0.1    1.05              7.36e-3    0.082   0.17    30
     heter. rnd.  10     1.037   1.26e-2   1.28e-2    10.79   11.62   30
                  5      1.047             1.29e-2    4.82    5.48    30
                  1      1.065             1.30e-2    0.95    1.35    30
                  0.1    1.085             1.31e-2    0.14    0.31    30
100  homo.        10     0.93    2.41e-2   2.32e-2    10.39   9.94    100
                  5      0.951             2.30e-2    5.81    5.47    100
                  1      0.89              2.28e-2    2.35    2.13    100
                  0.1    0.85              2.23e-2    0.16    0.14    100
     hetero.      10     0.995   1.89e-3   1.88e-3    15.92   16.16   100
                  5      0.993             1.88e-3    4.17    4.31    100
                  1      0.991             1.87e-3    1.56    1.68    100
                  0.1    0.988             1.87e-3    0.16    0.18    100
     heter. rnd.  10     0.986   4.85e-3   4.82e-3    11.02   11.07   100
                  5      0.983             4.81e-3    6.45    6.48    100
                  1      0.979             4.80e-3    4.39    4.48    100
                  0.1    0.972             4.79e-3    0.81    0.90    100

Table 4. Estimated and experimentally obtained probabilities of β error for individual tests on the slope. Predicted sample size to achieve the α and β probabilities of error for each data set.

To detect significant differences between the estimated probabilities of β error and the values from the simulation process, paired t-tests [22] (with α=1%) were applied to the β error values obtained for each number of data points (since this is the most critical factor for achieving good predictions of probabilities of β error) at the same level of significance. In this way, significant differences between the values in the βexp and βpred columns were found only in the data sets with five data pairs, for the slope and intercept at the four levels of significance. The possible sources of error and some important observations concerning the results from the simulation process can be summarised as follows:

(i) In most cases the predicted probabilities of β error from eqs. (6-7)

are higher than the experimental values from the simulation process. This

overestimation may be due to a lack of information, since the

overestimation is higher in those data sets with fewer data pairs (where the

experimental error, and thus the uncertainty of the regression coefficient is

higher [23]), and lower in those data sets with a larger number of points. In

this latter case however, small disagreements still exist due to the

assumption of the normality of the regression coefficients. Figure 4 plots

the differences between the experimentally-obtained probabilities of β error

(from the simulation process) and the predicted probabilities against the

number of data pairs of each data set for the slope and intercept with a



level of significance of 5%. Only the results corresponding to the low range are shown in Figure 4, since the results for the high range were identical.

[Figure 4: two panels, %∆β for the slope and %∆β for the intercept, each plotted against the number of data pairs (0 to 100). Series: homoscedasticity, heteroscedasticity, random heteroscedasticity.]

Figure 4. Difference between the experimentally-obtained probabilities (simulation process) and the predicted probabilities of β error for the slope and the intercept (in percent) in relation to the number of data pairs for each data set.

(ii) Results for the intercept show better agreement than those for the slope (Figure 4). This may be because estimating the slope is more complex, since two different distributions have to be considered for $b_{H_0}$ and $b_{H_1}$, whereas only one is needed when the probabilities of β error are estimated for the intercept, as $s_{a_{H_0}} = s_{a_{H_1}}$.

(iii) There is no clear relationship between the uncertainty patterns

and the error made in predicting the β error (in percent) for the different

simulated data sets. As Figure 4 shows, the three lines depicting the three

patterns of uncertainty do not maintain a constant relative position as they

cross each other. Results for the intercept seem to follow a steadier pattern

for the different uncertainties. As previously stated, the number of data

pairs on the regression line is the key factor for obtaining a better estimate

of the β error.



(iv) Results from predicting the probabilities of β error (eqs. (6-7)) and the sample size (eqs. (12-13)) for data sets with a high linear range were identical to the ones with a low linear range. Results shown in Tables 3 and

4 correspond to the low linear range, while the ones from the high linear

range have been omitted. These results can be explained because the

distribution of the data pairs in data sets (for a given uncertainty and

number of data pairs) with different linear ranges is identical. So the only

difference between data sets with different linear ranges is that the values

of the individual data pairs and their respective uncertainties (taken as

standard deviations) are ten times higher in the high linear range than in

the low linear range. Only the standard deviation values for the intercept

were exactly ten times higher in the high linear range than the ones in the

low linear range. This is due to the direct dependence of the standard

deviation for the intercept on the sum of the x-axis values (eq. (8)).
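The scaling argument in point (iv) can be checked numerically. The sketch below is an assumption-laden stand-in: it uses ordinary least squares rather than BLS (eq. (8) is not reproduced here), hypothetical data, and a hypothetical helper `ols_with_errors`. It only shows that multiplying every x and y value by ten leaves the slope and its standard deviation unchanged while the standard deviation of the intercept grows tenfold.

```python
import math


def ols_with_errors(x, y):
    """Ordinary least squares fit; returns (a, b, s_a, s_b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Residual variance with n - 2 degrees of freedom.
    s2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    s_b = math.sqrt(s2 / sxx)
    s_a = math.sqrt(s2 * sum(xi ** 2 for xi in x) / (n * sxx))
    return a, b, s_a, s_b


# Hypothetical low-range data (not taken from the paper).
x_low = [1.0, 2.0, 3.0, 4.0, 5.0]
y_low = [1.1, 1.9, 3.2, 3.9, 5.1]
# High-range copy: every value is ten times larger.
x_high = [10 * v for v in x_low]
y_high = [10 * v for v in y_low]

_, b_low, sa_low, sb_low = ols_with_errors(x_low, y_low)
_, b_high, sa_high, sb_high = ols_with_errors(x_high, y_high)

ratio_sa = sa_high / sa_low  # close to 10
ratio_sb = sb_high / sb_low  # close to 1
```

Because the slope and its standard deviation are unaffected by the common scale factor, a test on the slope behaves identically in both ranges, which is consistent with the identical β probabilities reported above.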

If we look at the results of estimating the sample size in Tables 3

and 4 (npred columns), we can see that the predicted results in all cases

provide the correct number of data pairs of the different initial data sets

considered. From these results we can conclude that the expressions for

estimating the sample size provide correct results for the three kinds of

distribution of uncertainties considered.

Procedure for β error and estimation of sample size in a real data set

Table 5 summarises the results of estimating the probabilities of committing a β error in the individual tests for the BLS slope and intercept for a level of significance of 5% (β column, in percent) for data set 3. Columns $a_{H_0} - a$ and $b_{H_0} - b$ show the distance between the estimated regression coefficients and the reference values ($a_{H_0} = 0$ and $b_{H_0} = 1$). The columns $t \cdot s_{a_{H_0}}$ and $t \cdot s_{b_{H_0}}$ (α=5%) show the values of the confidence intervals associated with the reference values. Columns $a_{H_1}$ and $b_{H_1}$ represent the bias that the experimenter wants to check in the regression coefficient being tested. Bias is detected in the regression coefficient whenever the difference $a_{H_0} - a$ or $b_{H_0} - b$ is higher than its associated confidence interval. Probabilities of β error are not calculated if bias is detected.

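| Method | $a_{H_0} - a$ | $t \cdot s_{a_{H_0}}$ | $a_{H_1}$ | β (%) |
|---|---|---|---|---|
| BLS | 2.94 | 5.35 | 6 | 40.2 |
| WLS | 4.38 | 5.19 | 6 | 37.6 |
| OLS | 3.97 | 7.11 | 6 | 62.5 |

| Method | $b_{H_0} - b$ | $t \cdot s_{b_{H_0}}$ | $b_{H_1}$ | β (%) |
|---|---|---|---|---|
| BLS | 0.0364 | 0.0991 | 1.2 | 2.77 |
| WLS | 0.0571 | 0.100 | 1.2 | 2.60 |
| OLS | 0.0656 | 0.110 | 1.2 | 5.30 |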

Table 5. Results obtained in estimating the probability of β error in the individual tests for the intercept and the slope in data set 3.
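The decision rule behind Table 5 can be sketched in a few lines. The first function applies the rule stated in the text (bias is flagged when the distance to the reference value exceeds the confidence-interval half-width) to the intercept values of Table 5. The second function is only an assumption: it approximates the β probability with a single normal distribution centred on the biased value, which is not necessarily identical to eqs. (6-7), and it is checked here on a trivial no-bias case rather than against the tabulated β values. Both function names are hypothetical.

```python
from statistics import NormalDist


def bias_detected(distance, half_width):
    """Bias is flagged when the distance to the reference value
    exceeds the confidence-interval half-width t * s."""
    return abs(distance) > half_width


def beta_normal(ref, biased, half_width, s_biased):
    """Probability that an estimate distributed as N(biased, s_biased)
    falls inside ref ± half_width, i.e. that the bias goes unnoticed."""
    dist = NormalDist(mu=biased, sigma=s_biased)
    return dist.cdf(ref + half_width) - dist.cdf(ref - half_width)


# (distance, half-width) pairs for the intercept, copied from Table 5.
intercept_rows = {"BLS": (2.94, 5.35), "WLS": (4.38, 5.19), "OLS": (3.97, 7.11)}
flags = {m: bias_detected(d, w) for m, (d, w) in intercept_rows.items()}

# Sanity check of beta_normal: with no bias at all and a 95 % interval
# (half-width 1.96 * s), the acceptance probability is about 0.95.
beta_no_bias = beta_normal(ref=0.0, biased=0.0, half_width=1.96, s_biased=1.0)
```

None of the three regression techniques flags a constant bias, so a β probability is reported for each of them in Table 5.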

Table 5 shows that neither constant nor proportional bias is found

in the SIA methodology in the analysis of Ca(II) in water according to the

results from the three regression techniques. The highest probability of β

error is estimated at 62.5% for the OLS technique, due to the highest

standard deviation value. On the other hand, the probabilities of β error for

BLS and WLS are lower and similar to each other although the WLS

intercept value is nearer the upper confidence interval limit. This means

that the results are less reliable, although this is not reflected in the

estimated probabilities of β error. Results for the slope show that the

estimated probabilities of β error in the three cases are very similar, despite

the differences in the slope values from the three regression methods.

However, if we look at the slope values we can be more confident about the accuracy of the one estimated by the BLS method, as it is the closest to the reference value $b_{H_0}$.



The process for estimating the sample size to achieve the calculated probabilities of β error in the slope (2.77%) and intercept (40.2%) for a level of significance of 5% is shown in Table 6. For the intercept, starting with an initial data set of five data pairs ($n_{a0}$ column), thirteen iterations were needed to end up with twenty-seven data pairs. For the slope, twenty-six data pairs were needed to achieve convergence, and there was no estimate of the number of data pairs until 13 had been considered ($n_{b0}$ column) since, according to the denominator of eq. (13), high experimental errors may produce negative estimates of sample size for the slope (denoted by <0 in Table 6).

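| Iteration | $n_{b0}$ | $\hat{s}_{b_{H_0}}$ | $\hat{s}_{b_{H_1}}$ | $n_{bf}$ | $n_{a0}$ | $\hat{s}_{a_{H_0}}$ | $n_{af}$ |
|---|---|---|---|---|---|---|---|
| 1 | 5 | 0.0974 | 0.0992 | <0 | 5 | 6.369 | 9 |
| 2 | 9 | 0.131 | 0.134 | <0 | 9 | 3.694 | 11 |
| 3 | 13 | 0.0753 | 0.0769 | 18 | 11 | 3.511 | 13 |
| 4 | 18 | 0.0666 | 0.0678 | 22 | 13 | 3.728 | 16 |
| 5 | 22 | 0.0609 | 0.0622 | 24 | 16 | 3.403 | 18 |
| 6 | 24 | 0.0530 | 0.0542 | 25 | 18 | 3.391 | 20 |
| 7 | 25 | 0.0511 | 0.0522 | 26 | 20 | 3.199 | 22 |
| 8 | 26 | 0.0492 | 0.0502 | 26 | 22 | 3.103 | 23 |
| 9 | | | | | 23 | 3.103 | 24 |
| 10 | | | | | 24 | 2.954 | 25 |
| 11 | | | | | 25 | 2.887 | 26 |
| 12 | | | | | 26 | 2.838 | 27 |
| 13 | | | | | 27 | 2.657 | 27 |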

Table 6. Iterations during estimation of the sample size for a and b (data set 3).
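The stopping rule of the iterative procedure (stop when the estimated sample size equals the number of data pairs already used) can be sketched by replaying the intercept columns of Table 6. This is only an illustration: in the real procedure each new estimate comes from eq. (12) after the additional samples have been measured, whereas here a lookup table stands in for that step, and the function name `iterate_sample_size` is hypothetical. Negative estimates, as reported for the slope, are not handled.

```python
# n_a0 -> n_af pairs for the intercept, copied from Table 6. In the real
# procedure each n_af comes from eq. (12) after measuring the new samples;
# here the table is only replayed.
TABLE_6_INTERCEPT = {5: 9, 9: 11, 11: 13, 13: 16, 16: 18, 18: 20, 20: 22,
                     22: 23, 23: 24, 24: 25, 25: 26, 26: 27, 27: 27}


def iterate_sample_size(n_start, estimate, max_iter=50):
    """Repeat n -> estimate(n) until the estimate equals its input.
    Returns (final_n, iterations_used)."""
    n = n_start
    for iteration in range(1, max_iter + 1):
        n_new = estimate(n)
        if n_new == n:
            return n, iteration
        n = n_new
    raise RuntimeError("sample size did not converge")


final_n, iterations = iterate_sample_size(5, TABLE_6_INTERCEPT.__getitem__)
```

Replaying the table reproduces the thirteen iterations and the final twenty-seven data pairs quoted for the intercept.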

CONCLUSIONS

The results of this work show that, in spite of the non-normality of the distributions of the BLS regression coefficients, the errors made in calculating the confidence intervals for the BLS regression coefficients are lower than the ones made with OLS or WLS techniques for data with uncertainties in both axes. Thus, the probability of β error in the individual tests on the BLS regression coefficients can be estimated under the hypothesis of normality.

We have also demonstrated that the expressions for estimating the probability of committing a β error when testing an individual regression coefficient with the BLS regression technique, considering different distributions for the reference ($a_{H_0}$ or $b_{H_0}$) and for the biased ($a_{H_1}$ or $b_{H_1}$) regression coefficients, provide correct results. Some sources of error have also been detected and identified to explain the disagreements produced in validating the results. The number of data pairs of the regression line appears to be crucial for better estimating the probability of β error. In addition, results in real data show that in some cases it may be interesting to calculate the probability of β error not with the set α threshold value, but with the maximum level of significance α for which no bias is detected in the regression coefficient. One would then be more confident of the regression coefficient value being accurate than when it falls near one of the boundaries of the confidence interval (in this way the probabilities of α error would be higher but the probabilities of β error would be lower than in the usual way).

Finally, we found that it is advisable to estimate the sample size,

since it allows the experimenter to control the probabilities of committing α

and β errors that they consider reasonable for the analytical problem in

question. The iterative process for estimating the sample size guaranteed

the chosen probabilities of making α and β errors when an individual test is

applied to one of the estimated BLS coefficients and produced correct

results for those data sets with moderate heteroscedasticity, but not for

those with high heteroscedasticity. The experimenter also has to weigh up

the pros and cons of performing the discontinuous series of experiments

that this iterative procedure requires.



ACKNOWLEDGMENTS

We would like to thank the DGICyT (project no. BP96-1008) for

financial support, and the Rovira i Virgili University for providing a

doctoral fellowship to A. Martínez and F. J. del Río.

BIBLIOGRAPHY

1.- W.A. Fuller, Measurement Error Models, John Wiley & Sons, New York,

1987.

2.- R.L. Anderson, Practical Statistics for Analytical Chemists, Van

Nostrand Reinhold, New York, 1987.

3.- M.A. Creasy, Confidence limits for the gradient in the linear functional relationship, J. Roy. Stat. Soc. B 18 (1956) 65-69.

4.- J. Mandel, Fitting straight lines when both variables are subject to error, J. Qual. Tech. 16 (1984) 1-14.

5.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, D.L. Massart,

Detection of bias in method comparison by regression analysis, Anal.

Chim. Acta 338 (1997) 19-40.

6.- J.M. Lisý, A. Cholvadová, J. Kutej, Multiple straight-line least-squares

analysis with uncertainties in all variables, Comput. Chem. 14 (1990)

189-192.

7.- J. Riu, F.X. Rius, Univariate regression models with errors in both axes, J.

Chemom. 9 (1995) 343-362.

8.- J. Riu, F.X. Rius, Assessing the accuracy of analytical methods using linear regression with errors in both axes, Anal. Chem. 68 (1996) 1851-1857.

9.- A.H. Kalantar, R.I. Gelb, J.S. Alper, Biases in summary statistics of

slopes and intercepts in linear regression with errors in both variables,

Talanta 42 (1995) 597-603.

10.- Cetama, Statistique appliquée à l’exploitation des mesures, 2nd ed.,

Masson, Paris, 1986.



11.- G. Kateman and L. Buydens, Quality Control in Analytical Chemistry,

2nd ed., John Wiley & Sons, New York, 1993.

12.- M. Meloun, J. Militký and M. Forina, Chemometrics for Analytical

Chemistry. Volume 1: PC-aided statistical data analysis, Ellis Horwood

ltd., Chichester, 1992.

13.- M.R. Spiegel, Theory and Problems of Statistics; McGraw-Hill, New

York, 1988.

14.- O. Güell, J.A. Holcombe, Analytical applications of Monte Carlo techniques, Anal. Chem. 62 (1990) 529A-542A.

15.- J.J. Langenfeld, S.B. Hawthorne, D.J. Miller, J. Pawliszyn, Role of

modifiers for analytical-scale supercritical fluid extraction of

environmental samples, Anal. Chem. 66 (1994) 909-916.

16.- I. Saouter, B. Blattmann, Analyses of organic and inorganic mercury by

atomic fluorescence spectrometry using a semiautomatic analytical

system, Anal. Chem. 66 (1994) 2031-2037.

17.- I. Ruisánchez, A. Rius, M.S. Larrechi, M.P. Callao, F.X. Rius, Automatic

simultaneous determination of Ca and Mg in natural waters with no

interference separation, Chemom. Intell. Lab. Syst. 24 (1994) 55-63.

18.- R. Boqué, F.X. Rius, D.L. Massart, Straight line calibration: something

more than slopes, intercepts and correlation coefficients, J. Chem.

Educ. (Comput. Ser.) 71 (1994) 230-232.

19.- B.D. Ripley, M. Thompson, Regression techniques for the detection of analytical bias, Analyst 112 (1987) 377-383.

20.- P.J. Ogren, J.R. Norton, Applying a simple linear least-squares

algorithm to data with uncertainties in both variables, J. Chem. Educ.

69 (1992) 130-131.

21.- V. López-Ávila, R. Young, F.W. Beckert, Microwave-assisted extraction

of organic compounds from standard reference soils and sediments,

Anal. Chem. 66 (1994) 1097-1106.

22.- D. L. Massart, B.M.G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.

Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and

Qualimetrics: Part A, Elsevier, Amsterdam, 1997.



23.- G.J. Hahn, W. Q. Meeker. Statistical Intervals, a guide for practitioners,

John Wiley & Sons, New York, 1991.



3.5 Conclusions

Of the conclusions presented in section 3.4, two should be highlighted as the main objectives of this chapter. The first is the importance of considering the probability of committing a β error when an individual test is applied to one of the regression coefficients. As has been mentioned repeatedly, the consequences of accepting high probabilities of committing this type of error can, depending on the analytical problem, be quite serious.

The second is the advantage of calculating the number of samples needed to construct the regression line with the BLS method. This procedure makes it possible to estimate the number of samples that must be measured to construct the BLS regression line so that the risk of committing α and β errors when detecting a given bias in one of the regression coefficients with an individual test is kept under control. Despite the drawbacks of this iterative procedure (described in section 3.4), its use is recommended for those analytical problems in which the consequences of committing α and/or β errors may be particularly problematic.

3.6 References

1.- Kalantar A.H., Gelb R.I., Alper J.S., Talanta, 42 (1995) 597-603.

2.- Cetama, Statistique appliquée à l’exploitation des mesures 2nd ed., Masson:

Paris, 1986.






CHAPTER 4

Bias detection in analytical methods for the simultaneous determination of multiple analytes.

Probability of committing a β error




The previous chapter stressed the importance of estimating the probability of committing a β error when individual tests are applied to the coefficients of the BLS regression line. It also demonstrated the usefulness of estimating the number of points needed to construct the BLS regression line in order to detect a given bias ∆ in one of the regression coefficients by means of an individual test, with specified probabilities of committing α and β errors. This chapter continues to deal with the estimation of the probability of committing a β error and its consequences in chemical analysis, but in this case for the joint test on the intercept and the slope of the BLS regression line.

The joint test on the intercept and the slope of the BLS line makes it possible to detect significant errors in the results of analysing a single analyte at different concentration levels with a new analytical method, in comparison with those of a reference method.¹ However, there is a wide variety of methods capable of determining the concentration of several analytes at once (chromatographic methods, electrophoretic methods, etc.), which are of great importance because they are widely used in chemical analysis. For this reason, section 4.2 demonstrates the ability of the joint test for the intercept and the slope of the BLS line to detect significant errors in the results of analytical methods that determine several analytes simultaneously. It also studies the consequences of applying the joint confidence test to the results of each analyte separately or to the results of all the analytes together. In addition, sections 4.4 and 4.5 present the theoretical background and the mathematical expressions needed to understand and estimate the probability of committing a β error when the joint test is applied to the BLS regression coefficients.


Sections 4.3 and 4.5 present most of the work carried out for this chapter, as part of the articles Validation of multianalyte determination methods. Application to RP-HPLC derivatizing methodologies, published in the journal Analytica Chimica Acta, and Evaluating bias in method comparison studies using linear regression with errors in both axes, submitted to the Journal of Chemometrics (under review). Finally, section 4.6 presents the conclusions of the chapter.

4.2 Comparison of analytical methods

When two analytical methods (one of which is normally a reference method) are compared by linear regression, the results of analysing one analyte in a series of samples with different concentration levels by the two methods are plotted against each other. In general, the results of the reference method are placed on the ordinate axis and those obtained with the new method (the candidate method) on the abscissa axis. Since the uncertainties due to the errors made in measuring the different samples with the two methods are usually of the same order of magnitude, it is advisable to use the BLS regression method (see section 1.6 of the Introduction). In the hypothetical case that the results obtained with the two methods were identical, the intercept and the slope of the regression line would be equal to 0 and 1 respectively. In real data sets, the measurements may be affected by both random and systematic errors, which make the regression coefficients differ from the theoretical values. To determine whether the differences between the coefficients of the BLS regression line and the reference values (0 for the intercept and 1 for the slope) are significant, a joint test for the intercept and the slope must be applied that takes into account the uncertainties associated with the results obtained by both methods.¹ Section 4.3 presents the expression that generates this joint confidence interval for the BLS regression method (eq. 4). This test was initially developed for the OLS method by Mandel and Linnig² and defines, for a level of significance α, an elliptical confidence interval centred on the point defined by the estimated regression coefficients (b0,b1). If the point defined by the reference values (0,1) falls within this confidence interval, it is considered that there are no significant differences between the estimated coefficients b0 and b1 and the reference values 0 and 1 respectively.
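For orientation, the classical joint confidence region for the intercept and slope in ordinary least squares can be written as below. This is the standard OLS textbook form and is given only as a sketch; the BLS expression (eq. 4 in section 4.3) is not reproduced here and differs from it. The symbols $\beta_0$, $\beta_1$ (candidate true values), $x_i$ (abscissa values) and $F_{1-\alpha}(2, n-2)$ (tabulated F value) are notation assumed for this illustration.

```latex
% OLS joint confidence region for (\beta_0, \beta_1), centred on (b_0, b_1)
\[
  n\,(b_0-\beta_0)^2
  + 2\Bigl(\sum_{i=1}^{n} x_i\Bigr)(b_0-\beta_0)(b_1-\beta_1)
  + \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)(b_1-\beta_1)^2
  \;\le\; 2\,s^2\,F_{1-\alpha}(2,\,n-2)
\]
% Method comparison: substitute (\beta_0, \beta_1) = (0, 1); if the inequality
% holds, the reference point lies inside the ellipse.
```

The cross term carries the factor $\sum x_i$, which is positive for concentration data, so the estimated intercept and slope are negatively correlated and the ellipse is tilted.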

However, to be more consistent with the definition of hypothesis tests³ in the sense of accepting or rejecting the null hypothesis (H0), the joint confidence interval should be centred on the reference point (0,1) defined by the theoretical coefficients from which H0 is postulated. It is considered that there are no significant differences from the theoretical regression coefficients (H0 is accepted) for any of the possible values of the regression coefficients that fall within this joint confidence interval for a given level of significance α. The size of this joint confidence interval is determined essentially by three parameters: the estimate of the experimental error s², the variances of the regression coefficients and the chosen level of significance α. As can be seen in figure 4.1, the elliptical confidence interval is tilted because of the negative correlation between the intercept and the slope of the regression line, which is typical of method comparison studies.



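[Figure 4.1: joint confidence ellipse in the (b0, b1) plane with the points (0,1) and (b0,b1) marked, together with the individual interval half-widths $t_{(1-\alpha,\,n-2)}\, s_{b_0}$ and $t_{(1-\alpha,\,n-2)}\, s_{b_1}$.]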

Figure 4.1. Individual and joint confidence intervals for a level of significance α.

As figure 4.1 shows, using individual confidence intervals (eqs. 3.1 and 3.2 in chapter 3) to determine whether the differences between the BLS regression coefficients and the reference values are significant is not correct, since these individual intervals do not take into account the correlation between the intercept and the slope of the regression line.⁴ It is important to note that the joint test for the intercept and the slope is based on the hypothesis of normality of the BLS regression coefficients. As was shown in the previous chapter, although the BLS regression coefficients do not strictly follow a normal distribution, the error made in assuming normality is not significantly large.



4.2.1 Simultaneous determination of several analytes

One of the objectives of this chapter has been to demonstrate that, when working with analytical methods that determine several analytes simultaneously, the joint test for detecting significant errors in the results of the candidate method must be applied to the BLS regression coefficients obtained by considering the results of all the analytes simultaneously. In this case, the number of points to be considered when constructing the joint confidence interval (variable n in equation 4 of section 4.3) is the number of concentration levels multiplied by the number of analytes determined simultaneously. Simulated data sets were used to demonstrate that, in these cases, the joint test applied to the BLS regression coefficients correctly detects significant errors in the results of the candidate method when they exist and does not detect them when they do not. These data sets were generated with the Monte Carlo method⁵,⁶ in a similar way to that explained in chapter 2 (figure 2.2). The initial data sets, in which all the points lie perfectly on a straight line and from which the new simulated sets are generated, can be divided into two groups. On the one hand, there are the initial sets that simulate the results obtained for the different analytes individually. On the other hand, when all the data pairs corresponding to each individual analyte are merged into a single set, the global data sets are obtained. The following figure shows the initial data sets considered according to the degree of heteroscedasticity: a) homoscedasticity, b) constant heteroscedasticity and c) random heteroscedasticity. In cases a and b the regression line has an intercept and a slope equal to 0 and 1 respectively, which simulates a case in which the results of the two methods are identical. In case c, by contrast, the results of the two methods being compared are simulated to be different, since the intercept is equal to 0 but the slope is 1.05.



[Figure 4.2: three rows of panels, a), b) and c). Each row shows the initial data sets for Analyte 1 to Analyte 7 and the Global Set. Horizontal axis: Alternative Method; vertical axis: Reference Method.]

Figure 4.2. Initial individual and global data sets from which simulated data sets are generated with the Monte Carlo method.
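The generation of simulated sets from a perfectly aligned initial set can be sketched as follows. Everything numerical here is hypothetical: the number of concentration levels, the x values, the standard deviations and the reduced number of simulated sets (1,000 instead of 100,000) are chosen only for illustration, and independent normal errors in both axes are assumed. The function name `simulate_data_sets` is also an assumption, and the BLS fit and the joint test themselves are not reproduced.

```python
import random


def simulate_data_sets(x_true, y_true, s_x, s_y, n_sets, seed=0):
    """Generate n_sets noisy copies of an initial, perfectly aligned data set.
    Each point is perturbed with independent normal errors in both axes."""
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_sets):
        xs = [rng.gauss(x, sx) for x, sx in zip(x_true, s_x)]
        ys = [rng.gauss(y, sy) for y, sy in zip(y_true, s_y)]
        simulated.append((xs, ys))
    return simulated


# Hypothetical initial global set: 7 analytes x 5 concentration levels lying
# on the line y = 0 + 1 * x, with a constant standard deviation in both axes
# (the homoscedastic case a). All numbers are illustrative.
n_analytes, n_levels = 7, 5
x0 = [float(1 + level + n_levels * analyte)
      for analyte in range(n_analytes) for level in range(n_levels)]
y0 = list(x0)
sd = [0.1] * len(x0)

global_sets = simulate_data_sets(x0, y0, sd, sd, n_sets=1000)
n_points = len(global_sets[0][0])  # n_levels * n_analytes
```

The value of `n_points` is the quantity that enters the joint confidence interval as n when all analytes are considered together.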

From each of the initial global sets, 100,000 simulated sets are generated. The joint test for the intercept and the slope is applied to the BLS regression coefficients obtained for each of these simulated sets. Table 1 shows the percentage of times that no significant differences were detected between the results of the two methods in each of the three global data sets, for different levels of significance α.

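| Uncertainty | 1-α (%) | % Monte Carlo |
|---|---|---|
| homoscedasticity | 90 | 88.78 |
| | 95 | 94.47 |
| | 99 | 98.95 |
| | 99.9 | 99.89 |
| constant heteroscedasticity | 90 | 88.95 |
| | 95 | 94.62 |
| | 99 | 98.96 |
| | 99.9 | 99.90 |
| random heteroscedasticity | 90 | 2.28 |
| | 95 | 5.01 |
| | 99 | 18.03 |
| | 99.9 | 47.75 |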

Table 1. Percentages of the 100,000 simulated global data sets for which no significant differences are detected between the results of the two methods.



As this table shows, in the cases where the results of the two methods being compared are identical in the initial set (a and b), the percentage of times that no significant differences are detected is close to the nominal confidence level 1-α set in each case. In the case where the results of the two methods in the initial set are different (c), the percentages of times that no significant differences are detected are far below the nominal 1-α levels. From these results it can be concluded that applying the joint test to the BLS regression coefficients estimated by considering the results of all the analytes simultaneously gives correct conclusions about the existence of significant differences between the results of the two methods being compared. This is because significant differences are detected in a high percentage of cases when the results are not comparable, and are not detected (or are detected only about α% of the time) when they are.
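The figures in Table 1 translate directly into empirical error rates, as the short calculation below shows. It uses only values copied from Table 1; the variable names are hypothetical. For the sets generated from identical methods, 100 minus the tabulated percentage is the observed α; for the set generated with a slope of 1.05, the tabulated percentage is itself the observed β.

```python
# 1 - α (%) -> % of simulated global sets with no significant difference
# (values copied from Table 1).
identical_methods = {90: 88.78, 95: 94.47, 99: 98.95, 99.9: 99.89}  # case a
biased_method = {90: 2.28, 95: 5.01, 99: 18.03, 99.9: 47.75}        # case c

# Empirical α: how often a difference is flagged although none exists.
empirical_alpha = {conf: round(100 - pct, 2)
                   for conf, pct in identical_methods.items()}

# Empirical β: how often the 5 % slope bias of case c goes undetected.
empirical_beta = dict(biased_method)
```

At the 95% confidence level, for example, the observed α is 5.53% against a nominal 5%, while the bias in case c goes undetected in about 5% of the simulated global sets.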

In addition to the conclusions about the existence of significant differences between the two methods obtained from the global data sets, it is interesting to know what conclusions can be drawn about the existence of significant differences between the results of the two methods when the joint test is applied to the BLS regression coefficients estimated from each of the 700,000 individual sets. These data sets contain the results of the analysis of each of the seven simulated analytes, and merging them gives each of the 100,000 global sets referred to in Table 1. It is considered that there are significant differences between the results of the two methods if, in more than half of the seven individual sets that make up a global set in each of the 100,000 iterations, the BLS regression coefficients are found to be significantly different from the theoretical values 0 and 1. The following figure shows this simulation process schematically.



[Figure 4.3: flow diagram. The 100,000 global data sets are split into global sets where no significant differences are detected (case a in Table 2) and global sets where significant differences are detected (case b in Table 2). Each branch is split again according to its individual data sets: 4 or more individual sets where NO significant differences are detected (cases a1 and b1 in Table 2), or 4 or more individual sets where significant differences ARE detected (cases a2 and b2 in Table 2). The reference point (0,1) is marked in each panel.]

Figure 4.3. Diagram of the simulation process for detecting significant differences between the results of two methods by means of the joint test on the BLS regression coefficients, estimated by considering the results of all the analytes simultaneously or each analyte individually.
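The classification used in this simulation scheme can be written as a small function. It is a sketch under the majority rule stated in the text (more than half of the seven individual sets, that is, four or more); the function name `classify` and the boolean inputs are hypothetical, and the joint test that produces those booleans is not reproduced.

```python
def classify(global_flag, individual_flags):
    """Return the Table 2 case label for one simulated global data set.

    global_flag: True if the joint test on the global set finds a difference.
    individual_flags: one boolean per analyte, True if the joint test on
    that analyte's individual set finds a difference.
    """
    n_detect = sum(individual_flags)
    # More than half of the individual sets (4 or more out of 7).
    majority_detects = n_detect > len(individual_flags) / 2
    prefix = "b" if global_flag else "a"
    return prefix + ("2" if majority_detects else "1")


example = classify(False, [False, False, True, False, False, False, False])
```

Counting the labels over all simulated global sets, separately for cases a and b, gives percentages of the kind reported in Table 2.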

Table 2 shows the results of applying the joint test at different levels of significance to the BLS regression coefficients obtained from the individual data sets generated from the three different types of initial data sets. There are four possible situations regarding the existence of significant differences in the individual sets:



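Global sets: Case a (individual-set columns a1 and a2) and Case b (individual-set columns b1 and b2).

| Data set | 1-α (%) | Case a1 | Case a2 | Case b1 | Case b2 |
|---|---|---|---|---|---|
| 1 | 90 | 99.65 | 0.34 | 98.88 | 1.12 |
| | 95 | 99.98 | 0.02 | 99.90 | 0.09 |
| | 99 | 100 | 0 | 100 | 0 |
| | 99.9 | 100 | 0 | 100 | 0 |
| 2 | 90 | 99.65 | 0.35 | 99.19 | 0.81 |
| | 95 | 99.98 | 0.02 | 99.86 | 0.14 |
| | 99 | 100 | 0 | 100 | 0 |
| | 99.9 | 100 | 0 | 100 | 0 |
| 3 | 90 | 99.56 | 0.44 | 94.91 | 5.09 |
| | 95 | 99.96 | 0.04 | 99.40 | 0.6 |
| | 99 | 100 | 0 | 100 | 0 |
| | 99.9 | 100 | 0 | 100 | 0 |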

Table 2. Percentages of the individual data sets that make up the global sets referred to in Table 1, according to whether the existence or the absence of significant differences between the results of the two methods can be concluded.

a) In those global data sets in which no significant differences have been detected between the results obtained by the two methods (see the percentages in Table 1), there are two possibilities when the joint test is applied to the coefficients of the BLS regression line estimated for each of the individual sets:

a1) No significant differences are detected in the results contained in four or more individual sets.

a2) Significant differences are detected in the results contained in four or more individual sets.

b) In those global data sets in which significant differences are detected between the results of the two methods (see the percentages in Table 1), the same two possibilities described in a) can arise when the joint test is applied to the coefficients of the BLS regression line estimated for each of the individual sets:

b1) No significant differences are detected in the results contained in four or more individual sets.

b2) Significant differences are detected in the results contained in four or more individual sets.

The results obtained for case a1 in Table 2 show that the joint test on the BLS regression coefficients correctly finds no significant differences between the results of the two methods being compared, whether the coefficients are estimated from the results of all the analytes together or from those of each analyte individually. On the other hand, the high percentage obtained in case b1 for the three types of initial sets shows that detecting significant differences is very difficult when the joint test is applied to the BLS regression coefficients estimated from each of the individual data sets. This happens because, in data sets with a small number of points, the experimental error s² (eq. 1.37) is more likely to be overestimated.⁷ Since the size of the joint confidence interval for the BLS regression method is directly proportional to the magnitude of the experimental error, an overestimated s² makes the joint confidence interval too large.

Thus, when working with individual data sets, the probability of not detecting significant differences between the results of two methods when they really exist, that is, of accepting H0 when the correct hypothesis is H1 (probability of β error), is high. For this reason, in order to avoid a large probability of β error when comparing the results of two analytical methods capable of determining several analytes at once, all the experimental values must be considered when estimating the BLS regression coefficients to which the joint test is applied. Section 4.3 presents an example of the comparison of analytical methodologies for determining several analytes simultaneously, for the case of reversed-phase high-performance liquid chromatography techniques for the determination of biogenic amines in wines.

4.3 Analytica Chimica Acta, 406 (2000) 257-278


4.3 Validation of bias in multianalyte determination methods.

Application to RP-HPLC derivatizing methodologies.

(Analytica Chimica Acta 406 (2000) 257-278)

Àngel Martínez1*, Jordi Riu1, Olga Busto2, Josep Guasch2 and F. Xavier Rius1

1. Departament de Química Analítica i Química Orgànica. Institut

d’estudis Avançats. Facultat de Química. Universitat Rovira i Virgili.

2. Departament de Química Analítica i Química Orgànica. Unitat

d'Enologia del CeRTA. Universitat Rovira i Virgili.

Keywords

HPLC, biogenic amines, method validation, linear regression, joint

confidence interval.

ABSTRACT

This paper reports a new approach for validating bias in analytical

methods that provide simultaneous results on multiple analytes. The

validation process is based on a linear regression technique taking into

account errors in both axes. The validation approach is used to individually

compare two different chromatographic methods with a reference one.

Each of the two methods to be tested is applied on a different set of data

composed of two real data sets each. In addition, three different kinds of

simulated data sets were used. All three methods are based on RP-HPLC

and are used to quantify eight biogenic amines in wine. The two methods to be tested use different derivatizing procedures: precolumn 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) and oncolumn o-phthalaldehyde (OPA), respectively. The reference method, on the other hand, uses precolumn derivatization with OPA. Various analytes are

determined in a set of samples using each of the methods to be tested and

their results are independently regressed against the results of the reference

method. Bias is detected in the methods to be tested by applying the joint

confidence interval test to the slope and the intercept of the regression line

which takes into account uncertainties in the two methods being compared.

The conclusions about the trueness of the two methods being tested varied

according to whether the joint confidence interval test was applied to data

obtained from various biogenic amines considered simultaneously or

individually.

INTRODUCTION

Biogenic amines need to be determined in fermented beverages

because they are potentially toxic when consumed in large amounts [1].

Many methods for quantifying the biogenic amine content in food have been described (e.g. gas chromatography [2,3] and HPLC techniques [4-6]).

However, procedures based on RP-HPLC have commonly been used as the

amines can be automatically injected [7], even without previous treatment

of the samples. Most of the RP-HPLC analytical methods used to determine

biogenic amines are based on derivatization reactions which improve the

selectivity and sensitivity of the different procedures.

The Office International de la Vigne et du Vin (OIV) has yet to

propose an official method of analysing biogenic amines in wines.

However, the most used method [7,8] and the only one that has been

validated for this kind of analysis [9], is the method that uses precolumn

derivatization with OPA. For this reason, in this study, this method has

been taken as the reference method. Nevertheless, alternative methods can

be used to analyse biogenic amines in wines, since they have some

advantages. For instance, the use of AQC as derivatizing reagent [10]

provides more stable compounds and greater selectivity than the method using OPA. On the other hand, oncolumn derivatization with OPA [11] does not require an automatic injector, which makes it more affordable without loss of sensitivity. As an essential step in the validation process [12] of the methods using AQC precolumn and OPA oncolumn as derivatizing reagents, bias must be evaluated in order to assess the trueness [13] of the methodologies. The evaluation of precision is a relevant and complementary matter that will be addressed in future work.

The trueness of the two methodologies to be tested (AQC

precolumn and OPA oncolumn derivatization) at different concentration

levels can be assessed by comparing each of them to a reference method

(OPA precolumn derivatization), using linear regression. The analytical

results obtained by applying each one of the methodologies to be tested to

a set of samples containing the biogenic amines at different concentration

levels are regressed on the results obtained by the reference method from

the same set of samples. In this way, a straight line is expected. If the slope and the intercept of this straight line are not significantly different from unity and zero respectively, the methodology being tested can be considered comparable to the OPA precolumn derivatization one throughout the concentration range studied. This comparison can be performed using the joint confidence interval test for the slope and the intercept [14].

Traditionally, in order to compare analytical methods using linear

regression, the ordinary least squares (OLS) technique has been used to

find the regression coefficients. However, this technique only assumes the

presence of constant random errors (homoscedasticity) in one of the

methods in comparison, usually in the method being tested represented on

the y-axis, considering the other method free of random errors. Since non-constant errors (heteroscedasticity) are normally present in both methods, bivariate least squares (BLS) regression techniques which consider errors in both axes should be used. Recently a joint confidence interval test for the

slope and intercept which considers both homoscedastic and

heteroscedastic errors in both methods has been developed [15].

However, so far, only a single analyte per sample has been

considered when comparing methodologies at multiple concentration

levels using linear regression. Because the RP-HPLC methodologies make it

possible to analyse multiple analytes, in this paper we extend the technique

for assessing bias by method comparison using linear regression taking

into account the errors in both axes to multiple analytical determinations.

In this way, the joint confidence interval test can be applied to the

regression coefficients of the BLS straight line, which is obtained

considering the information of all the different amines in the sample, so

bias in the method to be tested considering all the analytes can be detected.

Four real data sets containing chromatographic data of eight

biogenic amines in red, white and rosé wines from the province of

Tarragona (Spain) were used to check the existence of bias in the two RP-

HPLC derivatizing methods to be tested. It is shown that the conclusions

drawn from statistically analysing the data from individual amines cannot

be used to assess the validity of the AQC precolumn and OPA oncolumn

derivatization methodologies for analysing the whole set of biogenic

amines.


Notation

The true values of the bivariate least-squares regression coefficients are represented by $a$ (intercept) and $b$ (slope), while their respective estimates are denoted as $\hat{a}$ and $\hat{b}$. The number of experimental data pairs $(x_i, y_i)$ from the analysis of the different analytes at the different levels of concentration is denoted as $n$. The experimental error associated with the regression line (i.e. residual error), expressed in terms of variance for the experimental data pairs, is referred to as $s^2$, while its estimate will be $\hat{s}^2$. Likewise, $\hat{y}_i$ will represent the estimated value of $y_i$.

Bivariate least-squares regression (BLS)

Bivariate least-squares is the generic name given to a set of straight

line regression techniques applied to data containing errors in both axes. Of

all the approaches for calculating the regression coefficients, Lisý's method

is one of the most suitable [16]. The regression technique minimises the sum

of the weighted residuals, S, expressed in eq. 1:

$$S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{w_i} \qquad (1)$$

where the weighting factor $w_i$ takes into account the variances of each $(x_i, y_i)$ data pair, $s_{x_i}^2$ and $s_{y_i}^2$, according to eq. 2:

$$w_i = s_{y_i}^2 + \hat{b}^2 s_{x_i}^2 \qquad (2)$$

and the estimate of the experimental error is defined as:

$$\hat{s}^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / w_i}{n - 2} \qquad (3)$$


Therefore, the bivariate regression method assigns larger weighting factors $w_i$ (i.e. less importance, since each squared residual is divided by $w_i$) to those data pairs with larger $s_{x_i}^2$ and/or $s_{y_i}^2$ values, that is to say, the most imprecise data pairs.

Minimising the sum of the weighted residuals, two non-linear

equations are obtained, from which the regression coefficients a and b can

be obtained by means of a quick iterative process [17].
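The minimisation described above can be sketched in a few lines of Python. This is only an illustrative fixed-point iteration on the two equations obtained by setting the derivatives of S (eq. 1) with respect to the intercept and the slope to zero, with the weights of eq. 2; it is not claimed to reproduce the iterative scheme of ref. [17] step by step. The function name `bls_fit`, the tolerance and the starting values (ordinary least squares) are assumptions made for the example.

```python
def bls_fit(x, y, sx2, sy2, tol=1e-10, max_iter=200):
    """Sketch: estimate intercept a and slope b by minimising
    S = sum((y_i - a - b*x_i)**2 / w_i), with w_i = sy2_i + b**2 * sx2_i.
    Returns (a, b, s2), where s2 = S / (n - 2).
    Assumes every w_i is strictly positive."""
    n = len(x)
    # Starting values (assumption): ordinary least squares, errors in x ignored.
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    for _ in range(max_iter):
        w = [vy + b * b * vx for vx, vy in zip(sx2, sy2)]
        e = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Left-hand side of the two equations dS/da = 0 and dS/db = 0.
        r11 = sum(1.0 / wi for wi in w)
        r12 = sum(xi / wi for xi, wi in zip(x, w))
        r22 = sum(xi * xi / wi for xi, wi in zip(x, w))
        # Right-hand side; the extra term in g2 comes from d(w_i)/db.
        g1 = sum(yi / wi for yi, wi in zip(y, w))
        g2 = sum(xi * yi / wi + ei * ei * b * vx / (wi * wi)
                 for xi, yi, ei, vx, wi in zip(x, y, e, sx2, w))
        det = r11 * r22 - r12 * r12
        a_new = (g1 * r22 - g2 * r12) / det
        b_new = (r11 * g2 - r12 * g1) / det
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    w = [vy + b * b * vx for vx, vy in zip(sx2, sy2)]
    s_sum = sum((yi - a - b * xi) ** 2 / wi for xi, yi, wi in zip(x, y, w))
    return a, b, s_sum / (n - 2)


# Hypothetical data: five concentration levels with equal variances in both axes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
a_hat, b_hat, s2_hat = bls_fit(x, y, [0.01] * 5, [0.01] * 5)
print(a_hat, b_hat, s2_hat)
```

Because the weights depend on the slope, the two equations are non-linear and have to be solved repeatedly until the coefficients stop changing; the sketch simply recomputes the weights and residuals at each pass.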

Joint confidence interval test

In order to compare two multianalyte methodologies using

bivariate linear regression, the analytical results obtained from a set of

samples by the method to be tested are regressed on those obtained by the

already established methodology. Different analytes at different

concentration levels are considered to generate the straight line. If neither

of the straight line regression coefficients differs statistically from unity

slope and zero intercept, the results produced by the two methodologies

will not be considered to be statistically different at a given level of

significance α.

The joint confidence interval test for the slope and the intercept

considering errors in both axes [15] is used to test whether there are

significant differences between the regression coefficients and the

theoretical values of zero intercept and unity slope. This consists of

checking the presence of the theoretical point zero intercept and unity slope

within the limits of the elliptical-shaped joint confidence region defined by

eq. 4:

$$\sum_{i=1}^{n} \frac{1}{w_i} (a - \hat{a})^2 + 2 \sum_{i=1}^{n} \frac{x_i}{w_i} (a - \hat{a})(b - \hat{b}) + \sum_{i=1}^{n} \frac{x_i^2}{w_i} (b - \hat{b})^2 = 2 \hat{s}^2 F_{1-\alpha(2,n-2)} \qquad (4)$$

where $F_{1-\alpha(2,n-2)}$ is the tabulated F-value, at a level of significance α with 2 and n-2 degrees of freedom. When using eq. 4 on a data set generated from

the individual analysis of a certain type of amine (from now on, individual

data set), the value of n indicates the number of different concentration

samples (five in this work). If on the other hand, the data set contains data

from the simultaneous analysis of the different amines considered (from

now on, global data sets), variable n indicates the overall number of

samples analyzed (number of samples x number of analytes). Only if the

theoretical point falls inside the elliptical joint confidence region delimited

by eq. 4 can it be concluded that there are no significant differences

between the two methodologies.
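As an illustration of how the test of eq. 4 can be evaluated numerically, the following Python sketch checks whether the theoretical point (0,1) lies inside the joint confidence region. The function names are hypothetical, and the estimates of the intercept, the slope and the experimental error are assumed to have been obtained beforehand by BLS regression. The tabulated F-value is replaced here by a closed-form expression that holds only when the numerator has 2 degrees of freedom, as in eq. 4.

```python
def f_crit_2(alpha, dof2):
    """Critical value F_{1-alpha}(2, dof2).
    Closed form valid only for 2 numerator degrees of freedom, where
    P(F > f) = (1 + 2*f/dof2) ** (-dof2/2)."""
    return 0.5 * dof2 * (alpha ** (-2.0 / dof2) - 1.0)


def point_inside_joint_region(x, sx2, sy2, a_hat, b_hat, s2_hat,
                              alpha=0.05, a0=0.0, b0=1.0):
    """Return True when the point (a0, b0) falls inside the elliptical
    joint confidence region of eq. 4, i.e. no bias is detected."""
    n = len(x)
    # Weights of eq. 2, computed with the estimated slope.
    w = [vy + b_hat ** 2 * vx for vx, vy in zip(sx2, sy2)]
    da, db = a0 - a_hat, b0 - b_hat
    lhs = (da * da * sum(1.0 / wi for wi in w)
           + 2.0 * da * db * sum(xi / wi for xi, wi in zip(x, w))
           + db * db * sum(xi * xi / wi for xi, wi in zip(x, w)))
    return lhs <= 2.0 * s2_hat * f_crit_2(alpha, n - 2)


# Hypothetical individual data set: n = 5 concentration levels.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
variances = [0.01] * 5
print(point_inside_joint_region(x, variances, variances,
                                a_hat=0.05, b_hat=0.99, s2_hat=1.0))
```

For an individual data set n would be 5, and for a global data set n would be the number of samples multiplied by the number of analytes, as described above.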

The size of the joint confidence region for a given level of significance α depends directly on the estimate of the experimental error. When few experimental data are available, the values of $\hat{s}^2$ are usually overestimated [18]. This increase in uncertainty is due to the lack of

information inherent to a small number of data pairs, or in some cases to

the lack of fit of the experimental data to the BLS regression line [19]. In

these cases the joint confidence region is oversized. This may prevent a

possible bias from being detected in the method being tested, because there

is a higher probability that the theoretical point (0,1) falls within the joint

confidence interval. In other words, in these situations there is a higher

probability of committing a β error when applying the joint confidence

interval test [20].
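The effect of the number of data pairs on the size of the region can be illustrated numerically. The sketch below compares the factor $2F_{1-\alpha(2,n-2)}$ on the right-hand side of eq. 4 for an individual data set (n = 5) and for global data sets (n = 35 and n = 40). It also prints the relative standard deviation of a variance estimate with n-2 degrees of freedom, $\sqrt{2/(n-2)}$, a textbook result that assumes normally distributed errors and is not taken from the paper. Function names are hypothetical.

```python
import math


def f_crit_2(alpha, dof2):
    """Critical value F_{1-alpha}(2, dof2); closed form for 2 numerator dof."""
    return 0.5 * dof2 * (alpha ** (-2.0 / dof2) - 1.0)


def rel_sd_of_variance_estimate(dof):
    """Relative standard deviation of a variance estimate with `dof` degrees
    of freedom, assuming normally distributed errors."""
    return math.sqrt(2.0 / dof)


alpha = 0.05
for n in (5, 35, 40):
    dof = n - 2
    factor = 2.0 * f_crit_2(alpha, dof)  # multiplies the error estimate in eq. 4
    print(n, round(factor, 2), round(rel_sd_of_variance_estimate(dof), 2))
```

Under these assumptions the factor is roughly three times larger for n = 5 than for n = 35, and the error estimate itself is far less stable, so both effects enlarge the region for individual data sets.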

Evaluating bias

An earlier study assessed the correctness of the joint confidence

interval test for detecting bias in method comparison studies considering

uncertainties in both axes and one single analyte at different concentration

levels [15]. New studies based on simulated data generated using the



Monte Carlo method [21,22] and reproducing typical results from

multianalyte determination methods have been carried out to show the

correctness of the joint confidence interval test when detecting bias in

multianalyte determination methods. Moreover real data sets have also

been used to provide application examples.

In this way, for both real and simulated data sets, a bias was detected in the

method being tested when the theoretical point (0,1) falls outside the joint

confidence region at a given level of significance α. The conclusions drawn

about bias from the global data sets were compared with the conclusions

drawn from the joint confidence interval test applied to individual data sets

(i.e. those that only contain data corresponding to a single analyte). In this latter case, the results from the method being tested were considered biased when, for more than half of the analytes (four or more), the theoretical point (0,1) fell outside the joint confidence region at the same level of significance α.
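The decision rule used for the individual data sets can be written as a short sketch. The function name is hypothetical; the flags in the example correspond to the outcome reported later in this paper for Data Set 1 at a 5% level of significance, where only histamine and phenethylamine showed significant differences.

```python
def biased_by_individual_sets(outside_region):
    """outside_region maps each analyte to True when the theoretical point
    (0,1) falls outside its joint confidence region.
    The method is considered biased when this happens for more than half
    of the analytes (four or more out of seven)."""
    n_outside = sum(1 for flag in outside_region.values() if flag)
    return n_outside > len(outside_region) / 2


data_set_1 = {
    "histamine": True,
    "methylamine": False,
    "tyramine": False,
    "ethylamine": False,
    "phenethylamine": True,
    "3-methylbutylamine": False,
    "cadaverine": False,
}
print(biased_by_individual_sets(data_set_1))  # prints False: 2 of 7 analytes
```

The conclusion for the global data set comes from a single application of the joint confidence interval test to all data pairs, so it does not pass through this counting step.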


Eight biogenic amines were initially considered in this research

work: histamine, methylamine, tyramine, ethylamine, phenethylamine, 1,4-

diaminobutane (putrescine), 1,5-diaminopentane (cadaverine) and 3-

methylbutylamine. They were all perfectly resolved by both OPA-

derivatization methods for the different concentration samples. Figures 1a

and 1b show the chromatograms obtained with both methods for a 3 ppm

standard addition in a red wine sample. As can be seen, the eight peaks

corresponding to the eight amines appear perfectly resolved along with

other compounds which do not interfere in the analysis. Figure 1c shows

the chromatogram obtained when the same 3 ppm standard addition

sample was analysed using the AQC-derivatization method. As can be

seen, putrescine is partially overlapped with an interfering compound


when the analysis was performed with this method. This overlap did not allow putrescine to be quantified by the AQC-derivatization method. For this reason this analyte was not considered in the subsequent evaluation of bias between OPA precolumn and AQC derivatization methods. Moreover, the low peak resolution for tyramine, phenethylamine and 3-methylbutylamine shown by the AQC-derivatization method (Fig. 1c) is mostly due to the modification of the gradient program considering the relatively high number of analyses that had to be carried out.

[Figure 1, panels (a) and (b): chromatograms plotted against time (min), with peaks labelled 1 to 8.]

Figure 1. Chromatograms for the OPA precolumn (a) and OPA oncolumn (b) derivatization methods in the analysis of the biogenic amines in 3 ppm standard addition red wine samples.


[Figure 1, panel (c): chromatogram plotted against time (min), with peaks labelled 1 to 8.]

1.- Histamine. 2.- Methylamine. 3.- Tyramine. 4.- Ethylamine. 5.- Phenethylamine. 6.- Putrescine. 7.- 3-methylbutylamine. 8.- Cadaverine.

Figure 1 (cont). Chromatogram from the AQC (c) derivatization method in the analysis of the biogenic amines in 3 ppm standard addition red wine samples.

Chemicals and reagents

All the biogenic amines were supplied by Aldrich-Chemie (Beerse,

Belgium). An individual standard solution of 2000 mg·l⁻¹ of each amine was prepared in HPLC-grade acetonitrile (Scharlau, Barcelona, Spain) and stored in darkness at 4 °C. A working standard solution containing all the

amines was prepared with an aliquot of each solution and subsequently

diluted with synthetic wine (3.5 g of tartaric acid in a 12% hydroalcoholic

solution, and the pH of the solution adjusted to 3.5) in a volumetric flask.

More diluted solutions used in the different studies were prepared by

diluting this standard solution with the synthetic wine.

The Milli-Q quality water (Millipore, Bedford, USA) used in the

chromatographic experiments was filtered through a 0.45 µm nylon

membrane. The methanol, tetrahydrofuran and sodium acetate used to

prepare the mobile phases were of HPLC grade (Scharlau).


For the automatic derivatization methods, the AccQ·Fluor Reagent Kit (Waters, Milford, MA, USA) and o-phthalaldehyde/mercaptoethanol (Aldrich) were used as described in previous studies [10,23].

Equipment

Chromatographic experiments were performed using a Hewlett-

Packard (Waldbronn, Germany) 1050 liquid chromatograph with a Hewlett

Packard model 1046A fluorescence detector. In the precolumn

derivatization methods, the samples were derivatized and injected with a

Hewlett Packard Series 1050 automatic injector. Separation of the amine

derivatives was performed using an ODS Basic cartridge (250 x 4.6 mm i.d.,

particle size 5 µm) supplied by Hewlett Packard. In the OPA oncolumn

derivatization method, separation was performed using an Asahipack OP-

50 cartridge (250 x 4.6 mm i.d., particle size 5 µm) also supplied by Hewlett

Packard.

High-performance liquid chromatographic methods

For the oncolumn derivatization method, three solvent reservoirs

containing the following eluents: (A) ACN, (B) 5 mM borate solution (pH 9)

with 12 mM OPA-NAC and (C) 5 mM borate solution (pH 9) with 1% THF

were used to separate all the amines. The HPLC gradient elution is

described below:

OPA oncolumn gradient program

This is a slight modification of the program presented in a previous

study [11]. It began with an isocratic elution of 16% of solution A and 74%

of solution C for 10 minutes, followed by linear gradient elution from these

percentages to 21% and 79% in 30 seconds. This composition was maintained for 17 minutes before changing to 23.5% and 76.5% of solutions A and C in 30 seconds. Finally, another isocratic elution with this latter composition was applied until minute 40. Determination was performed at 40 °C with a flow-rate fixed at 0.8 ml·min⁻¹. The eluted

derivatives were detected by monitoring their fluorescence using 340 nm

and 450 nm as the excitation and emission wavelengths, respectively.

Under these conditions all eight amines were eluted in under 40 minutes.

Both precolumn derivatization methods used the same mobile

phases and consisted of two solvent reservoirs which contained (A) 0.05M

sodium acetate in 1% THF and (B) methanol. Nevertheless, the gradients

used were different and adjusted as described below:

OPA precolumn gradient program

This is a modification of the procedure reported in [8], made in order to adapt the analysis time to the number of samples analysed for the eight biogenic amines. The program started with 40% methanol in the mobile phase and finished 30 minutes later with 100% of this solvent. Finally, the column was cleaned with an isocratic elution at this percentage of methanol for 2 more minutes. Determination was performed at 60 °C with a flow-rate of 1 ml·min⁻¹ and the eluted OPA-derivatives were detected by monitoring their fluorescence at excitation

and emission wavelengths of 330 nm and 445 nm, respectively. Under these

conditions all seven amines were eluted in less than 25 minutes.

AQC gradient program

This program [10] consisted of a linear gradient elution from 35% to

100% of methanol in 13 minutes. Then, the column was cleaned up by

eluting 100% of methanol for 2 more minutes. The eluted AQC-derivatives

were detected by monitoring their fluorescence using excitation and


emission wavelengths of 250 nm and 395 nm respectively. The flow rate was set at 1 ml·min⁻¹. Under these conditions all seven amines were eluted

in less than 15 minutes.

Derivatization

Both precolumn derivatization methods were fully automated by

means of two injector programs. The derivatization reagents and the

samples were drawn sequentially into the injection needle. The reactants

were then mixed, injected into the column and separated using the gradient

elutions described above. On the other hand, the derivatization in the OPA

oncolumn method was performed by adding OPA-NAC to the mobile

phase, as explained above.

Samples

In order to ensure that the possible differences between the methods

were due to the experimental methodologies rather than to the samples,

identical groups of samples of red, rosé and white wines from different

zones of Tarragona were prepared for the three methods. The procedure

developed was as follows. Three wines were chosen: one red, one rosé and one white. A stock solution containing 100 ppm of the amines in synthetic wine was prepared. The samples to be injected were prepared in 25 ml

volumetric flasks by adding 0, 250, 750, 1,250, 1,750 and 2,500 µl of the

working standard solution and bringing to volume with the red, the rosé

and the white wine in each case. In this way, the final 18 solutions of 25 ml

each had a biogenic amine concentration between 1 and 10 ppm. Six

aliquots of 1 ml were taken from each one of the volumetric flasks and

frozen. Finally, they were injected on alternate days, and analysed

according to the methodologies described.
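The added concentrations implied by this spiking scheme can be checked with a short calculation. The sketch assumes that the solution added to each flask is the 100 ppm stock solution mentioned above and that the native amine content of each wine is not included; the variable names are illustrative.

```python
STOCK_PPM = 100.0    # concentration of the added solution (assumption: the 100 ppm stock)
FLASK_UL = 25_000.0  # 25 ml volumetric flask, expressed in microlitres

volumes_ul = [0, 250, 750, 1250, 1750, 2500]
added_ppm = [STOCK_PPM * v / FLASK_UL for v in volumes_ul]
print(added_ppm)  # [0.0, 1.0, 3.0, 5.0, 7.0, 10.0]

wines = ["red", "rosé", "white"]
print(len(volumes_ul) * len(wines))  # 18 solutions in total
```

The five non-zero levels give added concentrations between 1 and 10 ppm, and six volumes for each of three wines account for the 18 solutions.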



Calibration experiments

In order to verify the linearity of the response of the different

derivatives at the previously specified wavelengths for the working

concentrations (0.5 to 15 mg·l⁻¹), standard solutions of amines were

prepared in synthetic wine. Calibration curves for each amine were

constructed by plotting the amine peak-area against the amine

concentration. As in previous studies [8,10,11], linear least-squares

regression was used to calculate the calibration parameters.
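A minimal sketch of such a calibration is given below: an ordinary least-squares line of peak area against concentration, followed by the inverse prediction of a concentration from a measured area. The peak areas are invented for the example and the function names are hypothetical; none of the numbers come from the paper.

```python
def ols_line(conc, area):
    """Ordinary least-squares fit of area = intercept + slope * conc."""
    n = len(conc)
    mc, ma = sum(conc) / n, sum(area) / n
    slope = (sum((c - mc) * (p - ma) for c, p in zip(conc, area))
             / sum((c - mc) ** 2 for c in conc))
    return ma - slope * mc, slope


def predict_conc(area, intercept, slope):
    """Inverse prediction: concentration corresponding to a measured peak area."""
    return (area - intercept) / slope


# Hypothetical standards covering the 0.5 to 15 mg/l working range.
conc = [0.5, 1.0, 5.0, 10.0, 15.0]
area = [70.0, 120.0, 520.0, 1020.0, 1520.0]  # invented values: area = 20 + 100 * conc
intercept, slope = ols_line(conc, area)
print(intercept, slope, predict_conc(720.0, intercept, slope))
```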


Simulated data sets

Three simulated global data sets were used to prove the correctness

of the validation technique for multianalyte determination methods based

on the joint confidence interval test. Two of them simulated a situation in

which two multianalyte determination methods provide identical results.

In the other one results from the method being tested were chosen to be

biased in comparison to the results from the reference method. Moreover

the individual uncertainties associated to the data pairs in each simulated

global and individual data set were different; homoscedasticity,

proportional heteroscedasticity and random heteroscedasticity were

considered.
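The three uncertainty structures mentioned above can be illustrated with a small Monte Carlo generator. Everything numerical in this sketch is an assumption made for the example (concentration levels, standard deviations, bias, seed, function name); the settings actually used for the simulations are not given in this section.

```python
import random


def simulate_global_set(n_analytes=7, levels=(1.0, 3.0, 5.0, 7.0, 10.0),
                        slope=1.0, intercept=0.0,
                        error_model="homoscedastic", seed=0):
    """Return a list of (x, y, sx2, sy2) tuples for a simulated global data set.
    slope=1 and intercept=0 simulate two methods giving identical results;
    other values simulate a biased method under test."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_analytes):
        for true_x in levels:
            true_y = intercept + slope * true_x
            if error_model == "homoscedastic":
                sd_x = sd_y = 0.2                      # constant errors
            elif error_model == "proportional":
                sd_x = 0.05 * abs(true_x)              # errors grow with concentration
                sd_y = 0.05 * abs(true_y)
            elif error_model == "random":
                sd_x = rng.uniform(0.05, 0.5)          # errors unrelated to concentration
                sd_y = rng.uniform(0.05, 0.5)
            else:
                raise ValueError("unknown error model: " + error_model)
            rows.append((rng.gauss(true_x, sd_x), rng.gauss(true_y, sd_y),
                         sd_x ** 2, sd_y ** 2))
    return rows


unbiased = simulate_global_set(error_model="proportional", seed=1)
biased = simulate_global_set(slope=1.2, intercept=0.5, error_model="random", seed=1)
print(len(unbiased), len(biased))
```

Each global set can then be split by analyte (blocks of five consecutive rows here) to obtain the corresponding individual data sets.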

Real data sets

Four real data sets were used to show the different conclusions

reached about the correctness of a multianalyte determination method

when considering the experimental data for all the analytes at the different

concentration levels jointly (i.e. global data sets) or independently (i.e.

individual data sets).



Data Sets 1 and 2. Made up of the results obtained from analysing

seven biogenic amines (histamine, methylamine, tyramine, ethylamine,

phenethylamine, 3-methylbutylamine and cadaverine) in red (data set 1)

and white (data set 2) wines spiked at five different concentration levels.

The seven individual data sets were each composed of five data pairs (Figs.

2a-2g for red and 3a-3g for white wines respectively) and the global data

set was built up by joining the individual data sets (Figs. 2h for red and 3h

for white wines respectively). In this way, the global data set contained 35

data pairs distributed along a linear range between 0 and 14 ppm. Two

different RP-HPLC derivatization procedures were used as analytical

methods in this comparative study; OPA precolumn as the reference

method and AQC precolumn as the method to be tested. The uncertainties

present in both axes were a result of a six replicate analysis at each data

pair.
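One way of turning six replicate results into a data pair with its uncertainties is sketched below. The replicate values are invented, and the sketch assumes that the variance of the replicates is used directly; this section does not state whether the variance of the individual results or that of their mean was taken.

```python
from statistics import mean, variance


def summarise_replicates(replicates):
    """Return (mean, sample variance) of the replicate results for one sample."""
    return mean(replicates), variance(replicates)


# Hypothetical six replicate results (ppm) for one sample by each method.
reference_method = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0]
tested_method = [5.3, 5.0, 5.4, 5.1, 5.2, 5.2]

x_i, sx2_i = summarise_replicates(reference_method)
y_i, sy2_i = summarise_replicates(tested_method)
print(x_i, sx2_i, y_i, sy2_i)
```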

[Figure 2, panels a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Global data set; x-axis: OPA Precolumn, y-axis: AQC Precolumn.]

Figure 2. Data sets obtained from analysing the seven biogenic amines in red wines using the AQC and OPA precolumn derivatizing methods. Individual uncertainties from the six replicate analysis of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values.


[Figure 3, panels a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Global data set; x-axis: OPA Precolumn, y-axis: AQC Precolumn.]

Figure 3. Data sets obtained from analysing the seven biogenic amines in white wines using the AQC and OPA precolumn derivatizing methods. Individual uncertainties from the six replicate analysis of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values.

Data Sets 3 and 4. Because putrescine was perfectly resolved by both

OPA precolumn (reference method) and OPA oncolumn (method to be

tested) RP-HPLC derivatizing methods, all eight biogenic amines could be

analysed in this comparison study in rosé (data set 3) and red wines (data

set 4). For this reason, the global data set for data set 3 (Fig. 4i) and for data

set 4 (Fig. 5i) consisted of forty data pairs from the eight individual data

sets (Figs. 4a-4h for rosé and 5a-5h for red wines respectively) in which five

levels of concentration were considered. The linear range spans from 0 to

14 ppm. The uncertainties for each data pair in both axes were generated

from a six replicate analysis with each method.


[Figures 4 and 5, panels a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Putrescine, i) Global data set; x-axis: OPA Precolumn, y-axis: OPA On-column.]

Figures 4 and 5. Data sets obtained from analysing the eight biogenic amines in rosé and red wines respectively using the OPA oncolumn and OPA precolumn derivatizing methods. Individual uncertainties from the six replicate analysis of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values.

All the computational work was performed with home made






Simulated data sets

Our results have shown that the joint confidence interval test

provides correct results when different analytes at different concentration

levels (i.e. global data sets) are considered. In addition, when bias was detected in biased global data sets, it was often not detected in most of the corresponding individual data sets. In this way, when the method being tested is used to determine various analytes simultaneously, conclusions about the presence of bias may be wrong if the joint confidence interval test is only applied to data sets which contain single analyte data. This is because overestimated values of the experimental error are likely when few experimental data are considered. This makes the joint confidence region

too large and thus increases the probability of not detecting the existing

bias. Simulation results are available on request.

Real data sets

Data Set 1. The results of applying the joint confidence interval test

to the individual data sets show significant differences between both

methods only for the histamine (Fig. 6a) and the phenethylamine (Fig. 6e)

at a level of significance of 5%. That is, there are no significant differences

between the two methodologies tested (the theoretical point (0,1) falls

inside the joint confidence region) when determining five of the seven

biogenic amines tested in red wines. Because no bias was detected for most of the single analytes tested, it could be concluded from an individual testing approach that the RP-HPLC multianalyte determination method using AQC provides correct results when simultaneously analysing the seven biogenic amines in red wines. On the contrary, when the joint confidence interval test is applied to the global data set (Fig. 6h) at the same level of significance, bias between both methods is detected. The large distance between the theoretical point (0,1) and the boundary of the joint confidence region observed in Figure 6h indicates that the results from the RP-HPLC derivatization methodology using AQC are substantially biased. For this reason, the experimenter needs to review the AQC derivatization methodology in search of possible errors, so that bias in the analytical results can be reduced.

Results from the joint confidence test show that to check the

presence of bias in a multianalyte determination method all the

experimental data should be considered. This is because better estimates of the experimental error are obtained when more data are handled. This avoids oversizing the joint confidence region because of high experimental random errors, and thus reduces the probability of not detecting an existing bias (i.e. β error).
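The link between the number of data pairs and an overestimated experimental error can be illustrated with a small Monte Carlo sketch. It is not the simulation used in this work: as a simplifying assumption, the weighted residuals are drawn directly as independent normal values, so that the estimate of s2 has n-2 degrees of freedom without any regression being fitted. The sizes n=5 and n=50, the factor 1.5 and all function names are illustrative choices.

```python
import random


def simulated_s2(n_pairs, rng, sigma=1.0):
    """Draw one estimate of s^2 with n_pairs - 2 degrees of freedom.

    Simplifying assumption: the weighted residuals are drawn directly as
    independent N(0, sigma^2) values; no regression line is actually fitted.
    """
    dof = n_pairs - 2
    return sum(rng.gauss(0.0, sigma) ** 2 for _ in range(dof)) / dof


def prob_overestimated(n_pairs, factor=1.5, trials=20000, seed=1):
    """Fraction of simulated s^2 values larger than factor * sigma^2 (sigma = 1)."""
    rng = random.Random(seed)
    hits = sum(simulated_s2(n_pairs, rng) > factor for _ in range(trials))
    return hits / trials


p_few = prob_overestimated(5)    # size typical of a single-analyte data set
p_many = prob_overestimated(50)  # size typical of a global data set
print(f"P(s2 > 1.5*sigma2): n=5 -> {p_few:.3f}, n=50 -> {p_many:.3f}")
```

Under these assumptions the small data set overestimates the experimental error by more than 50% far more often than the large one, which is the mechanism that oversizes the joint confidence region.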

[Figure 6: eight panels plotting slope against intercept, each showing the joint confidence region, the estimated point (a, b) and the theoretical point (0,1). Panels: a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Global data set.]

Figure 6. Joint confidence regions for the BLS regression coefficients spanned for the data sets obtained from analysing red wines respectively using AQC and OPA precolumn derivatization methods. The level of significance was set at 5 % in all cases.


[Figure 7: eight panels plotting slope against intercept, each showing the joint confidence region, the estimated point (a, b) and the theoretical point (0,1). Panels: a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Global data set.]

Figure 7. Joint confidence regions for the BLS regression coefficients spanned for the data sets obtained from analysing white wines respectively using AQC and OPA precolumn derivatization methods. The level of significance was set at 5 % in all cases.

Data Set 2. The joint confidence interval test applied to individual analytes shows statistical differences between the two chromatographic methods at a 5% level of significance only for methylamine (Fig. 7b). As explained for data set 1, no bias would therefore be detected in the method being tested when single analyte data in white wines are considered. Likewise, the joint confidence region for the global data set (Fig. 7h) shows no bias (the theoretical point (0,1) falls inside the joint confidence region) when the two methods are compared using linear regression that takes into account all the analytes and the errors associated with their determination. In this example it could therefore be concluded that the RP-HPLC method using AQC provides correct results when simultaneously analysing the seven biogenic amines in white wines.



Data Set 3. Only in two of the eight biogenic amines analysed in

rosé wines (histamine and cadaverine), was bias detected for a level of

significance of 5% (Figs. 8a and 8g). As in data set 1, we may conclude from

these results that the RP-HPLC method being tested using oncolumn OPA

derivatization provides correct results when analysing the eight biogenic

amines in rosé wines for a level of significance of 5%, if the experimental

data is considered separately. On the other hand, as in data set 1, if the joint

confidence interval test is applied to all the experimental data available,

bias is detected in the OPA oncolumn derivatization method for the same

level of significance (Fig. 8i). This confirms that overestimated values of the experimental error $s^2$, often obtained in data sets with a low number of data pairs, can generate oversized joint confidence regions and prevent the bias in the method from being detected. This example, like the

one in data set 1, shows that if we had finally concluded that the method

being tested provided correct results in the simultaneous analysis of the

biogenic amines in rosé wines, we would have accepted a biased

multianalyte determination method (i.e. β error).

In this case, however, although significant differences between the results from both methods have been found when simultaneously analysing the eight amines in rosé wines, the experimenter may decide not to revise the OPA oncolumn derivatization methodology. This is because the distance between the point (0,1) and the boundary of the joint confidence region is small. The experimenter may therefore find it acceptable to set a level of significance lower than the initial 5%, for which no bias would be detected in the experimental results. In that case, the experimenter should be aware that lowering the level of significance increases the probability of committing a β error.


[Figures 8 and 9: two sets of nine panels plotting slope against intercept, each showing the joint confidence region, the estimated point (â, b̂) and the theoretical point (0,1). Panels in each figure: a) Histamine, b) Methylamine, c) Tyramine, d) Ethylamine, e) Phenethylamine, f) i-Amylamine, g) Cadaverine, h) Putrescine, i) Global data set.]

Figures 8 and 9. Joint confidence regions for the BLS regression coefficients spanned for the data sets obtained from analysing rosé and red wines respectively using OPA precolumn and OPA oncolumn derivatization methods. The level of significance was set at 5 % in all cases.

Data Set 4. As in data set 3, bias was detected in only two of the

eight biogenic amines (cadaverine and putrescine) for a level of significance

of 5% (Figs. 9g and 9h). We would therefore conclude that there is no bias

in the results from the OPA oncolumn RP-HPLC derivatizing method if the

results of analysing the eight biogenic amines in red wines are treated

separately. In this example, like in data set 2, this conclusion is confirmed



when the joint confidence interval test is applied to the results from the

analysis of the eight amines taken simultaneously (Fig. 9i), because the

theoretical point (0,1) falls inside the joint confidence region at the same

level of significance. We may therefore conclude that the RP-HPLC method

using oncolumn OPA derivatization produces correct results when the

eight biogenic amines in red wines are simultaneously analysed.

CONCLUSIONS

The detection of bias is an important step in the process of

validating an analytical method. Bias in methods that provide

simultaneous results on multiple analytes at different concentration levels

can be detected with regression analysis using bivariate least squares (BLS)

and the joint confidence interval test. Systematic errors should be evaluated considering all the data pairs produced by the two methods when determining all the analytes. If the joint confidence interval test is applied only to single analyte data, the multianalyte methodology is more likely to be erroneously interpreted as valid (i.e. β error).

The probability of committing a β error is higher when there are few data pairs in a data set, because in these cases the experimental error $s^2$ is more likely to be overestimated. This generates oversized joint confidence regions, which makes it more likely that the theoretical point of zero intercept and unity slope falls inside the joint confidence region and, therefore, that biased methods are falsely accepted. This clearly shows that the joint confidence interval test should be applied to data sets containing all the information from the different analytes determined by the analytical methodologies.



The application of the joint confidence interval test to the results of analysing the seven biogenic amines (histamine, methylamine, tyramine, ethylamine, phenethylamine, 1,5-diaminopentane (cadaverine) and 3-methylbutylamine) in wines with the AQC precolumn derivatization RP-HPLC method reveals bias when analysing red wines, but not when analysing white wines. In addition, 1,4-diaminobutane (putrescine) could be quantified using the OPA oncolumn derivatization method, so the joint confidence interval test could be applied to the results of analysing all eight biogenic amines. In this case bias was detected when analysing rosé wines, but not when analysing red wines.

Despite the suitability of the present approach which is based on the

regression technique considering errors in both methods, researchers

should be aware of two weaknesses. The first, inherent in any BLS

regression technique, is that the uncertainties of all the results of the

analysis need to be known [24]. The second is that the regression technique

is not very robust in the presence of outliers with low individual

uncertainty. This limitation of the BLS regression method will be addressed

in future works.

ACKNOWLEDGMENTS

We would like to thank the DGICyT (project no. BP96-1008) and the

CICyT (project no. ALI97-0765) for financial support, and the Rovira i

Virgili University for providing a doctoral fellowship to A. Martínez.

BIBLIOGRAPHY

[1]. J. Stratton, R. Hutkins and S. Taylor. J. Food Protec. 54 (1991) 460.

[2]. P.L. Rogers and W. Staruszkiewicz. J. AOAC Internat. 80 (1997) 591.

[3]. Bonilla, L.G. Enríquez and H.M. Nair. J. Chromatogr. Sci. 35 (1997) 53.

[4]. W.J. Hurst. J. Liq. Chromatogr. 13 (1990) 1.

[5]. O. Busto, J. Guasch and F. Borrull. J. Internat. Sci. Vigne Vin 30 (1996) 85.

[6]. P. Lehtonen. Am. J. Enol. Vitic. 47 (1996) 127.

[7]. P. Lehtonen, M. Saarinen, M. Vesanto and M.L. Riekkola. Z. Lebensm. Unters. Forsch. 194 (1992) 434.

[8]. O. Busto, M. Mestres, J. Guasch and F. Borrull. Chromatographia 40 (1995) 404.

[9]. M.J. Pereira and A. Bertrand. Bull. OIV 765-766 (1994) 918.

[10]. O. Busto, J. Guasch and F. Borrull. J. Chromatogr. 737 (1996) 205.

[11]. O. Busto, M. Miracle, J. Guasch and F. Borrull. J. Chromatogr. 757 (1997) 311.

[12]. Eurachem/Welac Guide 1. Accreditation of Chemical Laboratories. Laboratory of the Government Chemist, London, 1993.

[13]. ISO 5725-6: 1994(E). Geneva, 1994.

[14]. J. Mandel and F.J. Linnig. Anal. Chem. 29 (1957) 743.

[15]. J. Riu and F.X. Rius. Anal. Chem. 68 (1996) 1851.

[16]. J.M. Lisý, A. Cholvadova and J. Kutej. Comput. Chem. 14 (1990) 189.

[17]. J. Riu and F.X. Rius. J. Chemom. 9 (1995) 343.

[18]. G.J. Hahn and W.Q. Meeker. Statistical Intervals, a guide for practitioners; John Wiley & Sons: New York, 1991; p. 39.

[19]. A. Martínez, J. Riu and F.X. Rius. In preparation.

[20]. A. Martínez, J. Riu and F.X. Rius. In preparation.

[21]. P.C. Meier and R.E. Zund. Statistical Methods in Analytical Chemistry; John Wiley & Sons: New York, 1993; pp. 145-150.

[22]. O. Güell and J.A. Holcombe. Analytical applications of Monte Carlo techniques. Anal. Chem. 60 (1990) 529A-542A.

[23]. O. Busto, J. Guasch and F. Borrull. J. Chromatogr. 718 (1995) 309.

[24]. R.J. Carroll and D. Ruppert. Amer. Stat. 50 (1996) 1.



4.4 Probability of β error in the joint test

As shown earlier, committing a β error when comparing the results of two analytical methods may lead to the conclusion that the candidate analytical method is traceable to the reference method when the results of the candidate method are in fact biased. Depending on the type of analytical problem, the consequences of accepting an analytical method that gives biased results can be very serious. In such cases it is preferable to make more experimental measurements in order to obtain a better estimate of the experimental error s2 and thus run a lower risk of accepting a candidate method that may give biased results.

For this reason, a calculation method has been developed to estimate the probability of committing a β error when the joint confidence test is applied to the BLS regression coefficients. The theoretical conception of the probabilities of committing a β error in the joint test is analogous to the one presented for the individual tests in Chapter 3. It is important to bear in mind that the starting point here is not an individual confidence interval with a single dimension, since the joint confidence interval has two dimensions. It is therefore necessary to find the mathematical expression that characterises the tri-dimensional joint confidence distribution associated with the theoretical regression coefficients that define both H0 and H1 (Section 4.5). This joint confidence distribution is analogous to the Student t distribution used in the individual tests and relates, in a three-dimensional space, all the levels of significance from 0 to 1 to the values of the regression coefficients that satisfy equation 9 of Section 4.5. The probability of committing a β error is thus determined not by an area, as in the individual tests, but by a volume.

The theoretical basis needed to understand the probability of committing a β error in the joint test is detailed in the section Probabilities of α and β error in the joint confidence interval test of Section 4.5. That section also presents the expression of the triple integral with which it can be estimated (equation 12) and shows, by means of simulated data sets, that the estimates of the probabilities of β error are correct. Finally, the probability of committing a β error when applying the joint test to the BLS regression coefficients is also estimated for real data sets.



4.5 Evaluating bias in method comparison studies using

linear regression with errors in both axes (Journal of

Chemometrics, accepted for publication).

Àngel Martínez*, Jordi Riu and F.Xavier Rius




KEYWORDS

Method bias, probability of β error, method comparison, linear regression,

errors in both axes.

ABSTRACT

This paper presents a theoretical background for estimating the

probability of committing a β error when checking the presence of method

bias. Results obtained at different concentration levels from the analytical

method being tested are compared by linear regression with the results

from a reference method. Method bias can be detected by applying the joint

confidence interval test to the regression line coefficients from a bivariate

least squares (BLS) regression technique. This finds the regression line

considering the errors in the two methods. We have validated the

estimated probabilities of a β error by comparing them with the

experimental values from twenty-four simulated data sets. We also

compared the probabilities of β error estimated using the BLS regression

method on two real data sets with those estimated by using ordinary least

squares (OLS) and weighted least squares (WLS) regression techniques for

a given level of significance α. We found that there were important


differences in the values predicted with WLS and OLS compared to those

predicted with the BLS regression method.

INTRODUCTION

Assessing the accuracy of a new analytical method,1 is an essential

part of method validation studies. This can be done, for example, by

comparing the results of the method being tested with those from a

reference method. Results for a particular analyte, obtained at different

concentration levels, can be evaluated by linear regression. The results from

the method being tested (usually on the y axis) are normally regressed onto

those of the reference method (usually on the x axis). The methodology

being tested will be considered correct over the specified range only if the

slope and the intercept of the straight line are not statistically different from

their reference values of unity and zero respectively. This can be checked

using the joint confidence interval test for the slope and the intercept.2

So far, the most common regression techniques for finding the

regression line between the results of the two methods are ordinary least-

squares (OLS) and weighted least-squares (WLS). Both techniques consider

the predictor variable to be error-free. OLS assumes constant errors while

WLS considers nonconstant ones in the response variable. An alternative is

errors-in-variables regression,3-6 known as the constant variance ratio

(CVR) approach, which considers the errors in both axes. It does not take

into account the individual uncertainties of each experimental point but

considers the ratio of the variances of the response and predictor variables

to be constant for every experimental point ($\lambda = s_y^2/s_x^2$). A particular case of the CVR approach is orthogonal regression (OR),7 in which the errors in the response and predictor variables have the same variance (i.e. λ=1). However, the best option is to use bivariate least squares (BLS) regression techniques, since they consider the individual nonconstant errors in both axes. A joint confidence interval test for the slope and the intercept under the BLS regression conditions has recently been developed.8
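For comparison with BLS, the constant variance ratio approach has a closed-form solution. The sketch below uses the textbook formula usually known as Deming regression, which is not taken from this paper; it assumes that λ is the ratio of the error variance in y to the error variance in x, as defined above, and that the covariance between x and y is not zero. The function name `cvr_fit` is illustrative. Setting λ=1 gives orthogonal regression.

```python
import math


def cvr_fit(x, y, lam=1.0):
    """Constant variance ratio (Deming) line; lam = var(y errors) / var(x errors).

    Returns (intercept, slope). Requires a non-zero covariance between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    d = syy - lam * sxx
    slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Points lying exactly on y = 2 + 0.5 x are recovered for any lam.
a_or, b_or = cvr_fit([1.0, 2.0, 3.0, 4.0], [2.5, 3.0, 3.5, 4.0], lam=1.0)
print(a_or, b_or)
```

Unlike BLS, this estimator uses a single λ for the whole data set, so it cannot weight individual data pairs according to their own uncertainties.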

We know that two kinds of errors may arise from hypothesis tests.

Type I (or α) error is the one most often considered; in analytical chemistry it corresponds to wrongly accepting the presence of bias in the results of the analytical method being tested. Type II (or β) error, on the other hand, occurs when the presence of bias is wrongly denied. Although it is not yet widely used in routine chemical analysis, information about the probability of committing a β error is in many cases as important as, if not more important than, that provided by the level of significance α. Introducing bias in future analytical results by erroneously accepting a biased method may even affect a laboratory’s reputation. Despite its

importance, to our knowledge there is no statistically based foundation for

estimating the probability of β error in method comparison studies. In this

paper we present the theoretical background for estimating these

probabilities when comparing the results of two analytical methods at

multiple levels of concentration and taking into account the individual

errors in both methods (i.e. applying the joint confidence interval test with

the BLS technique). We also validate the results from our theoretical

expressions and include two practical examples using real data. This

technique is not only applicable in method comparison studies but also

when the results from two analysts, laboratories or techniques are to be

compared.

To validate the expressions we used simulated and real data sets.

The simulated data sets, generated by the Monte Carlo method,9,10 were

chosen to reproduce usual data set structures found in real analysis data.

Applying the technique to two real data sets demonstrated that the

probability of β error using OLS and WLS regression techniques can be



very different from that from the BLS regression method when errors in

both axes are considered.


Notation

In general, the true values of the different variables used in this

work are represented with Greek characters, while their estimates are

denoted with Latin letters. In this way, the true values of the regression

coefficients are represented by β0 (intercept) and β1 (slope), while their

respective estimates are denoted as a and b. The estimates of the standard deviations of the regression coefficients are $s_a$ and $s_b$. The experimental error, expressed in terms of variance for the n experimental data pairs (xi,yi), is $\sigma^2$, while its estimate is $s^2$. Analogously, $\hat{y}_i$ is the prediction of the experimental yi.

For the joint confidence interval tests, $a_{H_0}$, $b_{H_0}$, $a_{H_1}$ and $b_{H_1}$ are the values of the theoretical regression coefficients that define the points ($a_{H_0}$, $b_{H_0}$) and ($a_{H_1}$, $b_{H_1}$), from which the null and alternative hypotheses (H0 and H1) are postulated. Their respective standard deviation estimates are $s_{a_{H_0}}$, $s_{b_{H_0}}$ and $s_{a_{H_1}}$, $s_{b_{H_1}}$.

Bivariate Least Squares Regression (BLS)

From all the existing least squares approaches for calculating the

regression coefficients when errors in both axes are present, Lisý’s

method11 (referred to as BLS) was found to be the most suitable.12 This

technique assumes the true linear model to be:


$$\eta_i = \beta_0 + \beta_1 \xi_i \qquad (1)$$

The true variables ξi and ηi are unobservable; one can only observe the experimental variables:

$$x_i = \xi_i + \delta_i \qquad (2)$$

$$y_i = \eta_i + \gamma_i \qquad (3)$$

The random errors committed in the measurement of variables xi and yi are represented by variables δi and γi, where $\delta_i \sim N(0, \sigma_{x_i}^2)$ and $\gamma_i \sim N(0, \sigma_{y_i}^2)$. Introducing eqs. 2 and 3 in eq. 1 and isolating the variable yi gives the following expression:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \qquad (4)$$

The term εi is the ith true residual error, with $\varepsilon_i \sim N(0, \sigma_{\varepsilon_i}^2)$,13 and can be expressed as a function of δi, γi and β1:

$$\varepsilon_i = \gamma_i - \beta_1 \delta_i \qquad (5)$$
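The model in eqs. 1 to 5 can be checked numerically with a short simulation. This is only an illustrative sketch: the true concentrations, standard deviations and regression coefficients below are invented values, and the function name `simulate_pairs` is hypothetical. For every simulated pair the residual obtained from eq. 4 is compared with the expression in eq. 5.

```python
import random


def simulate_pairs(xi_true, beta0, beta1, sd_x, sd_y, seed=0):
    """Simulate observed (x, y) pairs under the model of eqs. 1 to 3.

    Returns a list of tuples (x, y, delta, gamma, eps), where eps is the
    true residual computed from eq. 4.
    """
    rng = random.Random(seed)
    rows = []
    for xi, sx, sy in zip(xi_true, sd_x, sd_y):
        delta = rng.gauss(0.0, sx)       # measurement error in x
        gamma = rng.gauss(0.0, sy)       # measurement error in y
        eta = beta0 + beta1 * xi         # eq. 1: true linear model
        x = xi + delta                   # eq. 2
        y = eta + gamma                  # eq. 3
        eps = y - beta0 - beta1 * x      # eq. 4 solved for the residual
        rows.append((x, y, delta, gamma, eps))
    return rows


BETA0, BETA1 = 0.2, 1.1  # illustrative true coefficients
rows = simulate_pairs([1.0, 2.0, 5.0, 10.0], BETA0, BETA1,
                      sd_x=[0.05, 0.1, 0.2, 0.4], sd_y=[0.1, 0.1, 0.3, 0.5])
for x, y, delta, gamma, eps in rows:
    # eq. 5: the true residual combines both measurement errors
    print(round(eps, 6), round(gamma - BETA1 * delta, 6))
```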

Many authors3,14-16 have developed procedures to estimate the

regression line coefficients based on a maximum likelihood approach

whenever errors in both variables are present. In most cases, these methods

need the true predictor variable to be carefully modelled.16 This is not

usually possible in chemical analysis, where the true predictor variables ξi are not often randomly distributed (i.e. functional models are assumed). Moreover, there are cases in which the experimental data are heteroscedastic and estimates of the measurement errors are only available through replicate measurements (i.e. the ratio between $\sigma_{x_i}$ and $\sigma_{y_i}$ is not constant or is unknown). These conditions, common in chemical data, make it very difficult to rigorously apply the principle of maximum likelihood to the estimation of the regression line coefficients. There is a method to estimate the regression coefficients using a maximum likelihood approach even when a functional model is assumed,13 but it is not rigorously applicable when individual heteroscedastic measurement errors are considered. It has been shown that when $\sigma_{x_i}$ and $\sigma_{y_i}$ are assumed to be related by a constant factor λ for every i, least squares methods provide the same estimates of the regression coefficients as a maximum likelihood estimation approach.17

For these reasons, we have chosen an iterative least squares method (i.e. the

BLS method) that can be applied on any group of ordered pairs of

observations with no assumptions about the probability distributions.17

This allows the application of this method in real chemical data when

individual heteroscedastic errors in both axes are considered. In this way,

the BLS regression method relates the observed variables xi and yi as

follows:18

$$y_i = a + b x_i + e_i \qquad (6)$$

The term ei is the observed ith residual error. The variance of ei is $s_{e_i}^2$ and will be referred to as the weighting factor. This parameter takes into account the variances of each individual point in both axes ($s_{x_i}^2$ and $s_{y_i}^2$) obtained from replicate analysis. The covariance between the variables for each (xi,yi) data pair, which is normally assumed to be zero, is also taken into account:

$$s_{e_i}^2 = \mathrm{var}(y_i - a - b x_i) = s_{y_i}^2 + b^2 s_{x_i}^2 - 2b\,\mathrm{cov}(x_i, y_i) \qquad (7)$$
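As a small worked example of eq. 7, the function below evaluates the weighting factor for one data pair. The function name and the numerical values are illustrative assumptions; the covariance defaults to zero, as is normally assumed.

```python
def weighting_factor(b, var_x, var_y, cov_xy=0.0):
    """Weighting factor of eq. 7: variance of the residual for one data pair."""
    return var_y + b * b * var_x - 2.0 * b * cov_xy


# With unit slope and uncorrelated errors the two variances simply add up.
w_unit = weighting_factor(1.0, var_x=0.04, var_y=0.09)
# A steeper slope amplifies the contribution of the error in x.
w_steep = weighting_factor(2.0, var_x=0.04, var_y=0.09)
print(w_unit, w_steep)
```

Because the weighting factor depends on the slope b, it has to be recalculated every time the slope estimate changes.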


The BLS regression method estimates the regression coefficients by minimising the sum of the weighted residuals, S, expressed in eq. 8:

$$S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = (n-2)\,s^2 \qquad (8)$$

The experimental error s2 is an important variable since it provides a

measure of the dispersion of the data pairs around the regression line and

can give a rough idea of the lack of fit of the experimental points to the

regression line. In this way, the BLS regression technique assigns less importance to those data pairs with larger $s_{x_i}^2$ and $s_{y_i}^2$ values, that is to say, the most imprecise data pairs. By minimising the sum of the weighted residuals, the regression coefficients are obtained by means of an iterative process.8
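One possible iterative scheme for minimising S in eq. 8 is sketched below. It is a fixed-point iteration obtained by setting the partial derivatives of S with respect to a and b to zero, assuming zero covariance between x and y; it is offered as an illustration and is not claimed to reproduce the exact algorithm of the cited references. The function name, the OLS starting point and the stopping rule are assumptions, and convergence is not guaranteed for every data set.

```python
def bls_fit(x, y, var_x, var_y, tol=1e-12, max_iter=200):
    """Iterative least squares with errors in both axes (zero covariance assumed).

    Returns (a, b, s2): intercept, slope and experimental error s^2 = S / (n - 2).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Start from the ordinary least squares line.
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    for _ in range(max_iter):
        # Inverse weighting factors of eq. 7 at the current slope.
        w = [1.0 / (vy + b * b * vx) for vx, vy in zip(var_x, var_y)]
        e = [yi - a - b * xi for xi, yi in zip(x, y)]
        r11 = sum(w)
        r12 = sum(wi * xi for wi, xi in zip(w, x))
        r22 = sum(wi * xi * xi for wi, xi in zip(w, x))
        g1 = sum(wi * yi for wi, yi in zip(w, y))
        # The second term comes from the dependence of the weights on b.
        g2 = sum(wi * xi * yi + wi * wi * ei * ei * b * vx
                 for wi, xi, yi, ei, vx in zip(w, x, y, e, var_x))
        det = r11 * r22 - r12 * r12
        a_new = (g1 * r22 - g2 * r12) / det
        b_new = (r11 * g2 - r12 * g1) / det
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    w = [1.0 / (vy + b * b * vx) for vx, vy in zip(var_x, var_y)]
    s_sum = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return a, b, s_sum / (n - 2)


# Points lying exactly on y = 2 + 0.5 x give back the line and s2 close to 0.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 0.5 * v for v in xs]
a_fit, b_fit, s2_fit = bls_fit(xs, ys, [0.01] * 5, [0.04] * 5)
print(a_fit, b_fit, s2_fit)
```

When all the variances in x are zero and the variances in y are equal, the weights no longer depend on the slope and the scheme reduces to ordinary least squares after one pass.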

Special attention should be paid to the estimation of the variances of

the errors made in the measurement of the different concentration samples

($s_{x_i}^2$ and $s_{y_i}^2$) by the methods in comparison. To obtain the best possible

estimates of the regression coefficients when no estimates of the error

variances are available from previous experiments, a sufficient number of

replicates should be made. However, measurement error variances

estimated by means of replicate measurements may include sources of

variation that are not related to the random errors made when analysing

the samples.19 This is the case of replicate measurements with different

means, lack of homogeneity (e.g. geological samples) or different kinds of

interferences that affect the analysis of the analyte(s) of interest by both

methods. In such cases, estimates of the regression coefficients from the

BLS method, as with those from other regression methods that consider

measurement errors, may be biased. It has been reported that this bias



occurs because these regression methods consider that all the variability in

the data is due to the random errors made when measuring the different

samples.19 They ignore an important component of variability in the data

when the true variables ξi and ηi do not follow a linear relationship (i.e. an

error in the equation is not considered) and consequently, the pairs (ξi ,ηi)

do not fall exactly on a straight line. However, linear models with an error

in the equation are not usual in chemical analysis, since instrumental

responses are justified by theoretical laws (Lambert-Beer’s law, Nernst’s

law, ...). Also in method comparison studies, since the two methods being

compared measure the same samples, linear models with an error in the

equation are not theoretically justified.20 Despite the infrequent use in

analytical chemistry of linear models with an error in the equation,

researchers should be aware that by using a regression method that needs

to estimate the variances of measurement errors, the error components that

affect experimental measures should be carefully considered.

Joint Confidence Interval Test

To compare two analytical methodologies by bivariate linear

regression, the results from a set of samples by the method to be tested are

regressed onto those from the already established methodology, with their

respective uncertainties. Different concentration levels are considered in

order to generate the straight line. If neither of the straight line regression

coefficients statistically differs from unity slope and zero intercept, the

results of the two methods will not be considered statistically different at a

given level of significance α.

The joint confidence interval test for the slope and the intercept

considering the errors present in both axes, is used to check the presence of

bias in the method to be tested by applying it to the BLS regression

coefficients.8 In other words, this can be regarded as checking the suitability



of the null hypothesis H0, which can be defined depending on the risk of

error (α or β) to be controlled.21 In this paper H0 assumes that both

estimated BLS regression coefficients belong to a joint confidence distribution centred at the reference point $(a_{H_0}, b_{H_0})$ (where $a_{H_0} = 0$ and $b_{H_0} = 1$). Traditionally, no method bias was detected (i.e. H0 was accepted)

when the theoretical point (0,1) for a given level of significance α, fell

within the limits of the joint confidence region centred at the point (a,b).2

However, in order to be more consistent with the definition of hypothesis

testing in terms of accepting or rejecting H0,22 the joint confidence interval

should be centred at the reference point (0,1), so that if the experimental

point (a,b) falls within this joint confidence interval no bias is detected

between the two methods in comparison.

In this way, as the standard deviations associated to any pair of BLS regression coefficients ($s_a$ and $s_b$) depend not only on the experimental error $s^2$ but also on the slope value,23 the size of the joint confidence region centred at the reference point (0,1) will now be set according to the unity value of $b_{H_0}$, and not according to the estimated slope b. Considering the stated modifications, the expression developed in a previous work8 that spans the joint confidence interval for a given level of significance α can be re-defined so that it is centred at the reference point $(a_{H_0}, b_{H_0})$ as follows:

$$(a_{H_0}-a)^2 \sum_{i=1}^{n} \frac{1}{s_{e_i,H_0}^2} + 2\,(a_{H_0}-a)(b_{H_0}-b) \sum_{i=1}^{n} \frac{x_i}{s_{e_i,H_0}^2} + (b_{H_0}-b)^2 \sum_{i=1}^{n} \frac{x_i^2}{s_{e_i,H_0}^2} = 2\,s^2 F_{1-\alpha(2,n-2)} \qquad (9)$$

where F1-α(2,n-2) is the tabulated F value for a level of significance α with 2 and n-2 degrees of freedom. Values of variables a and b that satisfy eq. 9 define the boundary of the joint confidence interval for a given level of significance α. The term $s_{e_i,H_0}^2$ is the weighting factor from eq. 7 recalculated with the unity value of $b_{H_0}$. The value of the experimental error s2 must be the one initially estimated for the BLS regression line (with regression coefficients a and b) because there can be no experimental error associated to the reference values $a_{H_0}$ and $b_{H_0}$.
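The test based on eq. 9 can be sketched as follows. The function names are illustrative, the covariance between x and y is assumed to be zero, and the estimates a, b and s2 are taken as given (for instance from a previous BLS fit). Instead of a tabulated value, the critical F value is computed with a closed form that holds only when the numerator has 2 degrees of freedom.

```python
def f_crit_2(alpha, dof):
    """Upper critical value of F with 2 and dof degrees of freedom.

    Closed form valid only for 2 numerator degrees of freedom.
    """
    return 0.5 * dof * (alpha ** (-2.0 / dof) - 1.0)


def joint_test(a, b, s2, x, var_x, var_y, alpha=0.05, a0=0.0, b0=1.0):
    """Check whether (a, b) lies inside the region of eq. 9 centred at (a0, b0).

    Returns (inside, q, limit); inside is True when no bias is detected.
    """
    n = len(x)
    # Weighting factors recalculated with the reference slope b0.
    w = [1.0 / (vy + b0 * b0 * vx) for vx, vy in zip(var_x, var_y)]
    da, db = a0 - a, b0 - b
    q = (da * da * sum(w)
         + 2.0 * da * db * sum(wi * xi for wi, xi in zip(w, x))
         + db * db * sum(wi * xi * xi for wi, xi in zip(w, x)))
    limit = 2.0 * s2 * f_crit_2(alpha, n - 2)
    return q <= limit, q, limit


# Illustrative data: six concentration levels with constant variances.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
vx_demo, vy_demo = [0.01] * 6, [0.04] * 6
print(joint_test(0.05, 0.99, 1.2, x_demo, vx_demo, vy_demo))
print(joint_test(5.0, 1.0, 1.2, x_demo, vx_demo, vy_demo))
```

A larger s2 increases the right-hand side of eq. 9, which is how an overestimated experimental error enlarges the region and makes it easier to accept a biased method.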

Probabilities of α and β error in the joint confidence interval test

The risk of committing a β error is defined as the probability of

accepting H0 when the correct hypothesis is H1.24 This latter hypothesis

considers that the estimated regression coefficients a and b belong to a joint

confidence distribution centred at the point $(a_{H_1}, b_{H_1})$. In a more analytical

context it can be defined as the probability of accepting that the method

being tested provides correct results when in fact these results are biased.

To estimate the probability of β error, it is essential to find the

expression that spans the joint confidence distribution for the slope and the

intercept. The expression has been developed so that the same joint

confidence intervals from eq. 9 can be easily generated for any level of

significance α. For this reason, we have derived an expression that relates

all the possible values of a and b with the level of significance α (i.e. risk of

α error) that satisfies eq. 9 (see Appendix A). This produces a tri-

dimensional joint confidence distribution when the three variables are

represented on the x, y and z axes respectively (Figure 1):

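$$
\alpha = \left[ 1 + \frac{(a_{H_0}-a)^2 \sum_{i=1}^{n} \frac{1}{s^2_{e_i,H_0}} + 2\,(a_{H_0}-a)(b_{H_0}-b) \sum_{i=1}^{n} \frac{x_i}{s^2_{e_i,H_0}} + (b_{H_0}-b)^2 \sum_{i=1}^{n} \frac{x_i^2}{s^2_{e_i,H_0}}}{s^2\,(n-2)} \right]^{-\frac{n-2}{2}} \tag{10}
$$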


Figure 1. Joint confidence distribution for the slope and the intercept as a function of the level of significance α. A tri-dimensional distribution is produced. [Axes: intercept, slope and risk of α error.]

The width of this tri-dimensional joint confidence distribution depends on the standard deviations $s_{a_{H_0}}$ and $s_{b_{H_0}}$ (directly related to the first and third summation terms in eq. 10 respectively). These standard deviations are estimated considering the reference regression values $a_{H_0} = 0$ and $b_{H_0} = 1$ and depend on the experimental error $s^2$. The tilt of the joint confidence region depends on the covariance between a and b.

In the tri-dimensional joint confidence distribution, the probability

of an α error,24 can be illustrated as the volume of the distribution

associated to the reference point (0,1) that falls outside the projection of the

bi-dimensional joint confidence region for a given level of significance α. In

other words, all those pairs of values (a,b) for which method bias would be



detected, although they belong to the joint confidence distribution centred

at the reference point (0,1) (Figure 2).

Figure 2. Volume of the joint confidence distribution centred at the reference point (0,1) excluded from the projection of the joint confidence interval for a level of significance α, i.e. risk of α error (dotted section). Volume of the joint confidence distribution centred at the alternative point $(a_{H_1}, b_{H_1})$ overlapped within the projection of the joint confidence interval for a level of significance α, i.e. risk of β error (shaded section). [Axes: intercept, slope and risk of α error.]

On the other hand, the probability of β error can be illustrated as the volume of the tri-dimensional joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$ (to be set by the experimenter) that falls inside the projection of the bi-dimensional joint confidence region centred at the reference point (0,1) for a given level of significance α (Figure 2). That is, all the possible pairs of values for which method bias would not be detected even though they belong to the joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$.

Therefore, the probability of β error depends not only on the size of the joint confidence region (defined by $s_{a_{H_0}}$ and $s_{b_{H_0}}$) centred at the reference point, but also on the size of the joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$, defined by the values of $s_{a_{H_1}}$ and $s_{b_{H_1}}$. Just as $s_{a_{H_0}}$ and $s_{b_{H_0}}$ depend on $b_{H_0}$ and $s^2$, the values of $s_{a_{H_1}}$ and $s_{b_{H_1}}$ depend on the biased slope value $b_{H_1}$ and $s^2$. This means that to obtain the correct probability of β error, an accurate estimation of $s^2$ is necessary. For this reason, a statistical test which detects lack of fit considering errors in both axes is needed to prevent overestimated $s^2$ values.25

To obtain an initial non-normalised value of the probability of β

error (βprev), a triple integral to calculate the intersected volume can be

defined as:26

$$
\beta_{prev} = \int_{a_2}^{a_1} \int_{b_2(a)}^{b_1(a)} \int_{0}^{\alpha(a,b)} \mathrm{d}\alpha \, \mathrm{d}b \, \mathrm{d}a \tag{11}
$$

where α(a,b) is the tri-dimensional joint confidence distribution (eq. 10) centred at the point $(a_{H_1}, b_{H_1})$ (i.e. considering the $a_{H_1}$ and $b_{H_1}$ coefficients instead of $a_{H_0}$ and $b_{H_0}$ in eq. 10). The terms b1(a) and b2(a) are functions of a that, for intercept values between a1 and a2, define the upper and lower halves of the elliptical joint confidence region centred at the reference point (0,1) for a given level of significance α. Also, a1 and a2 are the intercept axis values where the two halves meet (mathematical expressions are presented in Appendix B). This βprev value must be normalised by dividing it by the total volume inside the tri-dimensional joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$ to obtain the usual probability of β error ranging from zero to one:

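$$
\beta = \frac{\displaystyle \int_{a_2}^{a_1} \int_{b_2(a)}^{b_1(a)} \int_{0}^{\alpha(a,b)} \mathrm{d}\alpha \, \mathrm{d}b \, \mathrm{d}a}{\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \alpha(a,b) \, \mathrm{d}b \, \mathrm{d}a} \tag{12}
$$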
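The ratio in eq. 12 can be approximated numerically. The sketch below is only an illustration and rests on several assumptions: the weights `w0` and `w1` (the weighting factors evaluated at the reference and at the biased slope) are supplied by the caller, the numerator is evaluated with a simple midpoint rule over the ellipse described in Appendix B, and the denominator uses a closed form for the volume under eq. 10 that is our own working (valid for n > 4) rather than an expression taken from the source. All function names are hypothetical.

```python
import math


def sums(x, w):
    """A, B and C: the three weighted sums of eq. 10 for weights w[i]."""
    A = sum(1.0 / wi for wi in w)
    B = sum(xi / wi for xi, wi in zip(x, w))
    C = sum(xi * xi / wi for xi, wi in zip(x, w))
    return A, B, C


def alpha_surface(a, b, centre, abc, s2, n):
    """Eq. 10 evaluated at (a, b) for a distribution centred at `centre`."""
    A, B, C = abc
    da, db = centre[0] - a, centre[1] - b
    q = A * da * da + 2.0 * B * da * db + C * db * db
    return (1.0 + q / (s2 * (n - 2))) ** (-(n - 2) / 2.0)


def beta_error(x, w0, w1, s2, alpha, h1, h0=(0.0, 1.0), steps=200):
    """Midpoint-rule estimate of eq. 12 (normalised probability of beta error)."""
    n = len(x)
    if n <= 4:
        raise ValueError("the closed-form normalisation used here needs n > 4")
    A0, B0, C0 = sums(x, w0)
    abc1 = sums(x, w1)
    K = s2 * (n - 2) * (alpha ** (-2.0 / (n - 2)) - 1.0)
    half = math.sqrt(C0 * K / (A0 * C0 - B0 * B0))   # half-width in a (eqs. B.3, B.4)
    a_lo = h0[0] - half
    step_a = 2.0 * half / steps
    volume = 0.0
    for i in range(steps):
        a = a_lo + (i + 0.5) * step_a
        u = h0[0] - a
        root = math.sqrt(max(u * u * (B0 * B0 - A0 * C0) + C0 * K, 0.0))
        b_lo = h0[1] + (B0 * u - root) / C0          # lower half, eq. B.2
        step_b = 2.0 * root / C0 / steps
        for j in range(steps):
            b = b_lo + (j + 0.5) * step_b
            volume += alpha_surface(a, b, h1, abc1, s2, n) * step_a * step_b
    A1, B1, C1 = abc1
    # Volume under eq. 10 over the whole (a, b) plane (assumed closed form).
    total = 2.0 * math.pi * s2 * (n - 2) / ((n - 4) * math.sqrt(A1 * C1 - B1 * B1))
    return volume / total
```

A call such as `beta_error(x, w0, w1, s2, 0.05, h1=(4.0, 0.63))` returns a value between zero and one; a finer grid (`steps`) trades speed for accuracy.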



Validation Process

As can be seen in Appendix A, the expression of the joint confidence

distribution (eq. 10) for the slope and the intercept, used to calculate the

probabilities of β error in eq. 12, is derived from the equation defining the

BLS joint confidence interval (eq. 9). As stated previously, the BLS

technique is an iterative least-squares method that, unlike those based on a maximum likelihood approach, lacks a rigorous mathematical background. Consequently, the derived BLS joint confidence interval and thus

the expression for estimating the probabilities of β error should be

validated. For this reason, a validation process has been designed to assess

how correct the estimates of the probability of β error from eq. 12 are. We

used simulated data sets showing uncertainties in both axes, plus two real

data sets to estimate and compare the probabilities of β error under BLS

regression conditions with the probabilities of β error under WLS and OLS

conditions, for a level of significance of 5%.

The simulated data sets were generated by the Monte Carlo method. Each initial data set consisted of data pairs that fitted perfectly along a straight line whose slope and intercept were significantly different from unity and zero, respectively (i.e. results from a two-methods comparison study whose differences are statistically significant). For each initial data set, 100,000 new data sets were randomly generated by adding random errors, based on the individual uncertainties of each point, to each of the data pairs. The proportion of simulated data sets in which the experimental point (a,b) fell within the joint confidence interval (eq. 9) for a given level of significance α, although generated from an initial data set with obvious bias in both slope and intercept coefficients, was taken as the reference estimate of the probability of β error. This estimate was then compared to the analogous value from eq. 9, where the experimental error $s^2$ has been calculated as the mean value of the 100,000 individual experimental errors generated in the simulation process.
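The simulation logic can be sketched as follows. This is a deliberately simplified illustration: because the BLS fitting step is not reproduced in this section, ordinary least squares (OLS) is used as a stand-in for the regression, the noise is Gaussian with a single standard deviation per axis, and the default number of trials is far smaller than the 100,000 used in the study. Function names and parameters are hypothetical.

```python
import math
import random


def ols_fit(x, y):
    """Ordinary least-squares line; returns intercept, slope and residual variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    s2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return a, b, s2


def accepted(x, a, b, s2, alpha):
    """Joint test of eq. 9 with unit weights (OLS case), centred at (0, 1)."""
    n = len(x)
    da, db = 0.0 - a, 1.0 - b
    q = n * da * da + 2.0 * sum(x) * da * db + sum(xi * xi for xi in x) * db * db
    f_crit = (n - 2) / 2.0 * (alpha ** (-2.0 / (n - 2)) - 1.0)
    return q <= 2.0 * s2 * f_crit


def beta_sim(x_true, a1, b1, sx, sy, alpha=0.05, trials=2000, seed=0):
    """Fraction of noisy data sets, built on the line y = a1 + b1*x, with no bias detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [xi + rng.gauss(0.0, sx) for xi in x_true]
        y = [a1 + b1 * xi + rng.gauss(0.0, sy) for xi in x_true]
        a, b, s2 = ols_fit(x, y)
        if accepted(x, a, b, s2, alpha):
            hits += 1
    return hits / trials
```

With an unbiased line (`a1=0`, `b1=1`) and errors only in y, the returned fraction should be close to 1 - α, which is a convenient sanity check before introducing bias and errors in x.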


Data Sets and Software

In the validation step we used twenty-four simulated data sets to

reproduce some usual structures of routine analytical measurements. Two

linear ranges were combined with four different numbers of data pairs and

three different kinds of uncertainty patterns, as described below:

Linear Ranges: The short range spans from 0 to 10 units and the

large range spans from 0 to 100 units.

Number of data pairs: Data sets composed of five, fifteen, thirty and a hundred data pairs were taken. In all cases the data pairs were randomly distributed throughout the two different linear ranges.

Uncertainties: Both homoscedastic and heteroscedastic data sets

were used. In the homoscedastic data sets standard deviations on both xi

and yi values were constant. In one heteroscedastic data set standard

deviations increased and in the other, standard deviations were random. In

neither case however, was the standard deviation higher than 15% of each

individual xi and yi value.
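The combination of two linear ranges, four numbers of data pairs and three uncertainty patterns gives the twenty-four simulated designs. The sketch below enumerates them; the specific standard-deviation values (the constant used for the homoscedastic case and the 10% proportionality for the increasing case) are hypothetical choices made only for illustration, while the 15% ceiling follows the description above.

```python
import random
from itertools import product

RANGES = {"short": (0.0, 10.0), "large": (0.0, 100.0)}
SIZES = (5, 15, 30, 100)
PATTERNS = ("homoscedastic", "increasing", "random")


def make_design(n, limits, pattern, rng, rel=0.10, max_rel=0.15):
    """Random positions in the range plus one standard deviation per point."""
    lo, hi = limits
    values = sorted(rng.uniform(lo, hi) for _ in range(n))
    if pattern == "homoscedastic":
        sds = [rel * (hi - lo) / 10.0] * n            # hypothetical constant value
    elif pattern == "increasing":
        sds = [rel * v for v in values]               # grows with the value
    else:
        sds = [rng.uniform(0.0, max_rel) * v for v in values]  # random, at most 15%
    return values, sds


def all_designs(seed=0):
    """Dictionary keyed by (n, range name, pattern) with the 24 simulated designs."""
    rng = random.Random(seed)
    return {
        (n, name, pattern): make_design(n, RANGES[name], pattern, rng)
        for n, name, pattern in product(SIZES, RANGES, PATTERNS)
    }
```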

Two real data sets were used to show the differences in the

estimates of the probability of β error considering BLS, WLS and OLS

regression methods.

Data Set 1.27 Eight data pairs were generated from the

determination of eight polycyclic aromatic hydrocarbons in various



environmental matrices at different concentration levels through a stepwise

interlaboratory study. Uncertainties for each concentration sample were

generated from three replicate analyses. The two analytical methods

compared are high performance liquid chromatography (HPLC) (results on

the x axis) and gas chromatography (GC) (results on the y axis). Eleven

laboratories took part. The linear range spans from eight to twenty six

µg/g. The regression lines generated for this data set with the three

regression techniques are shown in Figure 3a.

Data set 2.28 Comparative study on the determination of mercury in

biological tissue using gas chromatography with sodium tetraethylborate

derivatization, coupled to a cold vapour atomic fluorescence spectrometer.

One (x axis) and two (y axis) gold amalgamations were used to reduce

mercury before the transfer to the atomic fluorescence spectrometer. Five

data pairs were obtained with their respective uncertainties generated from

six replicates performed at each concentration level. Units are expressed in

pg of recovered mercury. The data set and the regression lines generated

by the three regression techniques are shown in Figure 3b.

Figure 3. OLS (dashed), WLS (dotted) and BLS (solid) regression lines for data set 1 (a) and data set 2 (b). [Axes: (a) HPLC against GC; (b) one amalgamation step (pg) against two amalgamation steps (pg).]

All the computational work involving simulation processes and

integrations for the theoretical estimation of the probability of β error was

done with home-made Matlab subroutines (Matlab for Microsoft Windows

ver. 4.2, The Mathworks, Inc., Natick, MA).


Simulated data sets

Table 1 shows the probabilities of β error from the Monte Carlo simulation process (column ‘βsim’) and the estimated probabilities using eq. 12 (column ‘β’). Columns ‘$b_{H_1}$’ and ‘$a_{H_1}$’ show the values of the regression coefficients defining the point $(a_{H_1}, b_{H_1})$, chosen to provide probabilities of β error similar to the levels of significance α. The linear ranges and kinds of uncertainty for each data set are summarized in the ‘Range’ and ‘Uncertainty’ columns.

n     Range     Uncertainty   α (2 tails)   $b_{H_1}$   $a_{H_1}$   βsim (%)   β (%)
5     [0-10]    homo.         0.1           1.1         -2.0        11.01      17.21
5     [0-10]    homo.         0.05          1.1         -2.7        5.98       10.09
5     [0-10]    homo.         0.01          1.1         -5.5        0.75       2.34
5     [0-10]    homo.         0.001         1.1         -13.3       0.08       0.56
5     [0-10]    hetero.       0.1           1           -0.5        11.47      16.67
5     [0-10]    hetero.       0.05          1.75        -0.5        7.17       14.84
5     [0-10]    hetero.       0.01          2.4         -0.5        2.94       5.81
5     [0-10]    hetero.       0.001         4.3         -0.5        0.88       2.25
5     [0-10]    heter. rnd.   0.1           1           -0.25       10.94      16.18
5     [0-10]    heter. rnd.   0.05          1.1         -0.45       5.07       8.67
5     [0-10]    heter. rnd.   0.01          1.2         -0.9        1.69       3.07
5     [0-10]    heter. rnd.   0.001         1.5         -2.3        0.23       0.76
5     [0-100]   homo.         0.1           1.5         -25         16.02      26.57
5     [0-100]   homo.         0.05          1.5         -40         6.72       13.51
5     [0-100]   homo.         0.01          1.5         -67         1.72       4.08
5     [0-100]   homo.         0.001         1.5         -140        0.29       0.95
5     [0-100]   hetero.       0.1           1.1         -6          9.13       16.32
5     [0-100]   hetero.       0.05          1.8         -6          5.66       13.42
5     [0-100]   hetero.       0.01          2.4         -6          3.85       6.57
5     [0-100]   hetero.       0.001         3.8         -6          2.61       3.90
5     [0-100]   heter. rnd.   0.1           1.1         -3.3        10.23      15.86
5     [0-100]   heter. rnd.   0.05          1.5         -6.3        5.94       11.14
5     [0-100]   heter. rnd.   0.01          2.2         -9          2.63       4.83
5     [0-100]   heter. rnd.   0.001         3.8         -17         0.758      1.20
15    [0-10]    homo.         0.1           0.9         1           10.09      10.02
15    [0-10]    homo.         0.05          0.9         1.15        6.82       6.34
15    [0-10]    homo.         0.01          0.9         1.5         1.99       1.47
15    [0-10]    homo.         0.001         0.9         2           0.314      0.14
15    [0-10]    hetero.       0.1           0.99        0.02        13.84      14.70
15    [0-10]    hetero.       0.05          0.973       0.02        5.51       5.80
15    [0-10]    hetero.       0.01          0.96        0.02        1.12       1.02
15    [0-10]    hetero.       0.001         0.945       0.02        0.17       0.107
15    [0-10]    heter. rnd.   0.1           0.99        0.018       10.17      10.98
15    [0-10]    heter. rnd.   0.05          0.97        0.023       5.04       5.35
15    [0-10]    heter. rnd.   0.01          0.96        0.033       1.09       0.97
15    [0-10]    heter. rnd.   0.001         0.95        0.045       0.009      0.17
15    [0-100]   homo.         0.1           0.975       0.03        7.76       8.59
15    [0-100]   homo.         0.05          0.975       2.5         5.1        5.45
15    [0-100]   homo.         0.01          0.975       3.3         0.99       0.94
15    [0-100]   homo.         0.001         0.975       4.2         0.27       0.16
15    [0-100]   hetero.       0.1           0.98        0.05        12.40      13.04
15    [0-100]   hetero.       0.05          0.975       0.05        5.82       6.07
15    [0-100]   hetero.       0.01          0.965       0.05        1.21       1.16
15    [0-100]   hetero.       0.001         0.95        0.05        0.14       0.07
15    [0-100]   heter. rnd.   0.1           0.97        0.18        10.56      11.16
15    [0-100]   heter. rnd.   0.05          0.96        0.22        6.59       6.70
15    [0-100]   heter. rnd.   0.01          0.95        0.3         2.89       2.36
15    [0-100]   heter. rnd.   0.001         0.94        0.43        0.53       0.27
30    [0-10]    homo.         0.1           1.1         -0.8        10.05      11.74
30    [0-10]    homo.         0.05          1.1         -0.95       3.93       4.90
30    [0-10]    homo.         0.01          1.1         -1.2        0.32       0.74
30    [0-10]    homo.         0.001         1.1         -1.4        0.08       0.25
30    [0-10]    hetero.       0.1           1.05        -0.015      11.40      12.77
30    [0-10]    hetero.       0.05          1.1         -0.015      5.52       6.84
30    [0-10]    hetero.       0.01          1.16        -0.015      0.37       0.74
30    [0-10]    hetero.       0.001         1.2         -0.015      0.10       0.23
30    [0-10]    heter. rnd.   0.1           1.05        -0.03       10.56      11.91
30    [0-10]    heter. rnd.   0.05          1.07        -0.035      4.00       5.16
30    [0-10]    heter. rnd.   0.01          1.1         -0.04       1.07       1.81
30    [0-10]    heter. rnd.   0.001         1.15        -0.045      0.01       0.11
30    [0-100]   homo.         0.1           1.02        -0.45       15.04      16.03
30    [0-100]   homo.         0.05          1.02        -1.85       5.20       5.79
30    [0-100]   homo.         0.01          1.02        -2.2        2.53       2.62
30    [0-100]   homo.         0.001         1.02        -2.8        0.12       0.18
30    [0-100]   hetero.       0.1           1.05        -0.15       11.57      12.77
30    [0-100]   hetero.       0.05          1.1         -0.15       5.35       6.84
30    [0-100]   hetero.       0.01          1.15        -0.15       0.82       1.53
30    [0-100]   hetero.       0.001         1.2         -0.15       0.06       0.23
30    [0-100]   heter. rnd.   0.1           1.05        -0.3        10.57      11.91
30    [0-100]   heter. rnd.   0.05          1.07        -0.35       3.99       5.16
30    [0-100]   heter. rnd.   0.01          1.1         -0.4        1.01       1.81
30    [0-100]   heter. rnd.   0.001         1.15        -0.45       0.02       0.11
100   [0-10]    homo.         0.1           0.95        0.43        10.91      11.13
100   [0-10]    homo.         0.05          0.95        0.5         4.48       4.67
100   [0-10]    homo.         0.01          0.95        0.59        1.55       1.59
100   [0-10]    homo.         0.001         0.95        0.7         0.18       0.25
100   [0-10]    hetero.       0.1           0.99        0.04        9.09       9.25
100   [0-10]    hetero.       0.05          0.95        0.04        6.69       6.82
100   [0-10]    hetero.       0.01          0.92        0.04        1.28       1.33
100   [0-10]    hetero.       0.001         0.89        0.04        0.07       0.07
100   [0-10]    heter. rnd.   0.1           0.99        0.017       11.38      11.65
100   [0-10]    heter. rnd.   0.05          0.97        0.022       4.30       4.55
100   [0-10]    heter. rnd.   0.01          0.96        0.028       1.54       1.64
100   [0-10]    heter. rnd.   0.001         0.95        0.035       0.35       0.43
100   [0-100]   homo.         0.1           0.99        0.15        9.27       9.63
100   [0-100]   homo.         0.05          0.99        1           4.79       4.96
100   [0-100]   homo.         0.01          0.99        1.2         1.13       1.28
100   [0-100]   homo.         0.001         0.99        1.4         0.24       0.29
100   [0-100]   hetero.       0.1           0.99        0.28        9.88       10.07
100   [0-100]   hetero.       0.05          0.94        0.28        5.55       5.70
100   [0-100]   hetero.       0.01          0.91        0.28        1.39       1.41
100   [0-100]   hetero.       0.001         0.88        0.28        0.15       0.15
100   [0-100]   heter. rnd.   0.1           0.99        0.35        12.87      13.14
100   [0-100]   heter. rnd.   0.05          0.97        0.45        5.47       5.65
100   [0-100]   heter. rnd.   0.01          0.95        0.57        1.80       1.95
100   [0-100]   heter. rnd.   0.001         0.93        0.7         0.51       0.51

Table 1. Predicted and experimentally obtained probabilities of β error for the simulated data sets.

These results show that the probabilities of β error calculated with

eq. 12 are usually overestimated. This is because when few data pairs are

considered, the experimental error $s^2$ (and therefore the values of $s_{a_{H_0}}$, $s_{b_{H_0}}$ and $s_{a_{H_1}}$, $s_{b_{H_1}}$) is likely to be overestimated.29 This overestimation is clear in

the s2 values from the simulation process used to estimate the probabilities

of β error for the different levels of significance α, since they are the mean

of 100,000 observations. For this reason the overestimation of the

probabilities of β error is clearer in those data sets with five data pairs,

where the amount of information is lower (and hence there is greater

uncertainty). As we can see in Figure 4, the more data pairs there are, the

better the agreement between the values in columns ‘βsim‘ and ‘β‘ (more

information is added). Moreover, the uncertainty pattern in the data and

the linear range do not seem to affect the error in the estimates.

Figure 4. Variation of the error of prediction $\Delta\beta = \frac{\beta - \beta_{sim}}{\beta_{sim}} \times 100$ (in %) with the number of data pairs (5, 15, 30 and 100) for a level of significance of 5%, considering homoscedasticity (I), constant (II) and random (III) heteroscedasticity, for the low (a) and high (b) linear ranges.
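As a worked example of the error of prediction plotted in Figure 4, the snippet below evaluates Δβ for the three data sets with five data pairs in the [0-10] range at α = 0.05, using the βsim and β values listed in Table 1. It assumes that Δβ is the difference between β and βsim relative to βsim, expressed as a percentage; the function name is hypothetical.

```python
def delta_beta(beta, beta_sim):
    """Relative error of prediction (in %) of beta with respect to beta_sim."""
    return (beta - beta_sim) / beta_sim * 100.0


# (beta_sim, beta) pairs from Table 1: n = 5, range [0-10], alpha = 0.05.
rows = {
    "homoscedastic (I)": (5.98, 10.09),
    "constant heteroscedasticity (II)": (7.17, 14.84),
    "random heteroscedasticity (III)": (5.07, 8.67),
}

errors = {label: delta_beta(beta, sim) for label, (sim, beta) in rows.items()}
for label, value in errors.items():
    print(f"{label}: {value:.1f} %")
```

The three errors come out at roughly 69%, 107% and 71%, in line with the 70% to 145% interval quoted for the data sets with five data pairs.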

Results for data sets with fifteen data pairs ($b_{H_1}$ values of less than one) are more accurate than those for data sets with thirty ($b_{H_1}$ values greater than one), and similar to those for data sets with a hundred data pairs (Table 1). This is because $b_{H_1}$ values of less than one were chosen, which provide underestimated standard deviations $s_{b_{H_1}}$ and $s_{a_{H_1}}$. This partially offsets the overestimation of the probabilities of β error caused by high $s^2$ values because of the uncertainty in the experimental data, which is more obvious in data sets with fewer data pairs. In this way, the intersected volume of the distribution centred at the point $(a_{H_1}, b_{H_1})$ is lower (the size of the joint confidence distribution directly depends on $s_{b_{H_1}}$ and $s_{a_{H_1}}$), producing more accurate results, although underestimated in some cases. Finally, as expected, predictions for data sets with a hundred data pairs were the best thanks to the large amount of information provided by the large number of data pairs and to the $b_{H_1}$ values of less than one.

To check whether there were significant differences between the results in column ‘β’ and the results from the Monte Carlo simulation process, we carried out a paired t-test1 on the β error values predicted at the

different levels of significance α for each type of data set studied. Although

there were no significant differences between the values in columns ‘βsim‘

and ‘β’ for a level of significance of 5% in any data set, we found the errors

made in estimating the probabilities of β error in data sets with five data

pairs (Figure 4) to be excessive (between 70 % and 145 %). We have

therefore considered that a number of five data pairs is generally too low to

estimate the experimental error s2 correctly, and so produce accurate

estimates of the probabilities of β error. These results show that, if an

accurate estimate of s2 is available, the probabilities of β error from eq. 12

are correct.
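The paired comparison can be illustrated with a short sketch. It assumes that the pairs are the βsim and β values of one data set at the four levels of significance, which is one reading of the grouping described above; the example uses the homoscedastic data set with a hundred data pairs in the [0-10] range from Table 1, and the function name is hypothetical.

```python
import math


def paired_t(first, second):
    """Paired t statistic for the differences second[i] - first[i]."""
    d = [b - a for a, b in zip(first, second)]
    n = len(d)
    mean = sum(d) / n
    var = sum((di - mean) ** 2 for di in d) / (n - 1)
    return mean / math.sqrt(var / n)


# Table 1, n = 100, range [0-10], homoscedastic; alpha = 0.1, 0.05, 0.01, 0.001.
beta_sim = [10.91, 4.48, 1.55, 0.18]
beta_eq12 = [11.13, 4.67, 1.59, 0.25]

t_value = paired_t(beta_sim, beta_eq12)
```

For these four pairs the statistic is about 2.94, below the two-tailed 5% critical value of 3.18 for three degrees of freedom, so no significant difference would be declared under this grouping.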

Real data sets

Table 2 shows the results of the joint confidence interval test on both

real data sets. Columns ‘a‘ and ‘b’ show the values of the straight line

coefficients for each one of the regression methods (‘Reg. Meth.’ column).

Values of the biased BLS regression coefficients (in columns ‘$a_{H_1}$’ and ‘$b_{H_1}$’)

have been set according to the bias we have considered unacceptable to

remain undetected in the regression coefficients. It is important to bear in

mind that there are no theoretical rules for setting these biased coefficients.

The experimenter therefore, needs some experience in order to define the

alternative hypothesis according to the bias that cannot be accepted to

remain undetected in the regression coefficients. The bias depends on the

objective of the comparison study, the kind of analytical method being

tested or other factors. Another important issue to consider when setting the biased regression coefficients that define H1 is the covariance between the regression line coefficients. Because of this covariance (responsible for the tilt of the joint confidence distribution), not all the possible pairs of regression coefficients (a,b) at the same Euclidean distance $\sqrt{(a_{H_1}-a_{H_0})^2 + (b_{H_1}-b_{H_0})^2}$ from the theoretical point (0,1) have the same probability of being experimentally observed. We therefore recommend first setting one of the biased regression coefficients and then calculating the value of the other according to the direction of the major axis of the elliptical joint confidence region. In this way, the bias to be detected in the regression coefficients (defined by the point $(a_{H_1}, b_{H_1})$) is the most likely to happen. Therefore the calculated probabilities of committing a β error can be considered to be the most suitable ones for the experimental data available.
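One way of putting this recommendation into practice is sketched below. It assumes that the elliptical region is described by the quadratic form of eq. 9 with summation terms A, B and C (as named in Appendix B), takes the major axis as the eigenvector associated with the smallest eigenvalue of the matrix [[A, B], [B, C]], and requires B to be non-zero. The function names are hypothetical.

```python
import math


def major_axis_slope(A, B, C):
    """db/da along the major axis of A*u^2 + 2*B*u*v + C*v^2 = const (needs B != 0)."""
    # Smallest eigenvalue of [[A, B], [B, C]]; its eigenvector is the major axis.
    lam_min = 0.5 * ((A + C) - math.hypot(A - C, 2.0 * B))
    return (lam_min - A) / B


def biased_slope(a_h1, A, B, C, a_h0=0.0, b_h0=1.0):
    """Slope b_H1 paired with a chosen intercept a_H1 along the major axis."""
    return b_h0 + major_axis_slope(A, B, C) * (a_h1 - a_h0)
```

With positive x values the sum B is positive, the major axis has a negative slope, and a positive bias set in the intercept is therefore paired with a slope below one, as in the values chosen for data set 1.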

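Data Set   Reg. Meth.   a      b      Δ      $a_{H_1}$   $b_{H_1}$   $s^2$   β (%)
1          BLS          0.19   0.96   4.02   4           0.63        0.21    17.3
1          WLS          2.65   0.67   4.02   4.00        0.66        2.24    74.7
1          OLS          1.00   0.87   4.02   4.00        0.71        3.54    75.2
2          BLS          2.38   0.99   5.00   5           0.90        0.72    27.0
2          WLS          2.79   0.97   5.00   5.00        0.96        1.88    71.4
2          OLS          0.22   0.99   5.00   5.00        0.99        80.8    93.3

Table 2. Predicted probabilities of β error for the two real data sets. Δ is the Euclidean distance between the points $(a_{H_1}, b_{H_1})$ and (0,1), reported once per data set in the original table.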

It should also be noted that for each of the real data sets in Table 2, the biased regression coefficients differ according to the regression technique. This is done to maintain the biased point $(a_{H_1}, b_{H_1})$ at the same Euclidean distance from the reference point (0,1), in the direction that makes the set bias most likely to happen. This direction depends on the covariance of the straight line coefficients, which changes according to the regression technique used. In this way, the calculated probabilities of β error can be compared under the same conditions for each of the three regression methods.

Table 2 also shows the experimental errors for the different

regression methods and data sets in column ‘s2’. The predicted probabilities

of β error are in column ‘β’. Only when the point defined by the estimated

regression coefficients (a,b) falls within the joint confidence region for a

level of significance, set at 5% in this case (i.e. no bias is detected, Figure 5),

can the probabilities of β error be calculated.

Figure 5. OLS (dashed line), WLS (dotted line) and BLS (solid line) joint confidence regions obtained for data set 1 (a) and data set 2 (b) (α=5%). [Axes: intercept against slope; the reference point (0,1) is marked.]

Data set 1. There are no significant differences between the results of the two chromatographic methods, as the experimental point (a,b) falls inside the joint confidence region centred at the theoretical point (0,1) for the three regression techniques (Figure 5a). As an example, we decided that in this case a bias of 4 in the intercept for the BLS method was too big to remain undetected. For this intercept the most likely biased value of the BLS slope is 0.63, which sets a distance of 4.017 between the points $(a_{H_1}, b_{H_1})$ and (0,1) (Figure 6a). For the WLS and OLS regression methods (Figures 6b and 6c), the biased regression coefficients change slightly (due to the different sizes of the joint confidence intervals and the different covariance between the intercept and slope) so that the bias in the regression coefficients is most likely to happen and the distance to the point (0,1) remains at 4.017. Under these circumstances, the probabilities of a β error considering BLS, WLS and OLS regression methods are estimated at 17.3%, 74.7% and 75.2% respectively (Table 2). These values are consistent with the volume of the joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$ that overlaps the projection of the joint confidence interval, which is centred at the reference point (0,1) and generated under BLS, WLS and OLS regression conditions (Figures 6a, 6b and 6c). In each case, the level of significance was 5%. Since the estimated probabilities of β error are a direct function of $s^2$, for a constant distance between the points $(a_{H_1}, b_{H_1})$ and (0,1) and a given level of significance α, high experimental errors (eq. 8) produce high probabilities of committing a β error.
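As a check on the distance quoted for the BLS case, substituting the biased point (4, 0.63) and the reference point (0,1) into the Euclidean distance gives:

$$
\Delta = \sqrt{(a_{H_1}-a_{H_0})^2 + (b_{H_1}-b_{H_0})^2} = \sqrt{(4-0)^2 + (0.63-1)^2} = \sqrt{16.1369} \approx 4.017
$$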

It is clear, therefore, that by neglecting the errors in the results of

both analytical methods (OLS conditions) or partially considering them

(WLS conditions), one would wrongly risk β errors of 75.2% and 74.7%,

respectively. This means that with WLS and OLS regression methods it

would be very difficult to detect an existing bias of the set magnitude

(Table 2) in both regression coefficients jointly, or in other words, in the

experimental results from the chromatographic method being tested. On

the other hand, with the BLS regression method, the probability of β error was

estimated at 17.3%. This indicates that if method bias of the set magnitude

exists in the experimental results, it will be more likely to be detected when

errors in both axes are considered.


Figure 6. Representation of the probability of β error (as seen in Figure 2, but from the z axis) for data set 1, under BLS (a), WLS (b) and OLS (c) regression conditions (α=5%). [Axes: intercept against slope; each panel marks the reference point (0,1), the estimated point $(\hat{a}, \hat{b})$ and the alternative point $(a_{H_1}, b_{H_1})$.]

Data set 2. The experimental point (a,b) falls within the joint confidence region centred at the theoretical point (0,1) for the three regression techniques (Figure 5b), and so there are no significant differences between the results from the two amalgamation procedures being compared with the BLS, WLS and OLS regression methods. In this case, we decided to estimate the probability of not detecting a bias of 0.1 in the BLS slope. As in the previous data set, the intercept for which this bias in the regression coefficients is most likely to happen for the BLS method is 5.00. For the WLS and OLS regression methods, the bias that is most likely to happen in the intercept, for a bias of 0.1 in the slope, is presented in Table 2. In this way, the probabilities of committing a β error (not detecting the set bias) according to the different $s^2$ values are estimated at 27.0%, 71.4% and 93.3% for the BLS, WLS and OLS techniques respectively, for a level of significance of 5%.

Figures 7a, 7b and 7c show that these values agree with the volume of the joint confidence distribution centred at the point $(a_{H_1}, b_{H_1})$ intersected in each case inside the projection of the respective joint confidence intervals.

Figure 7. Representation of the probability of β error (as seen in Figure 2, but from the z axis) for data set 2, under BLS (a), WLS (b) and OLS (c) regression conditions (α=5%). [Axes: intercept against slope; each panel marks the reference point (0,1), the estimated point $(\hat{a}, \hat{b})$ and the alternative point $(a_{H_1}, b_{H_1})$.]

As in the previous example, it is clear that by neglecting the errors

in the results of both analytical methods (OLS conditions) or partially

considering them (WLS conditions), the risk of committing a β error would

be wrongly higher than if the presence of errors in both axes is considered.

In this example it is especially important to note the differences for the OLS

regression method, as the risk of β error would be 93.3%. So, unlike with

the BLS regression method (risk of β error estimated to be 27.0%), it would

be very difficult to detect an existing bias of the set magnitude in the

mercury recovery results with two gold amalgamation steps if errors in

both axes are not considered.

CONCLUSIONS

We have developed a theoretical background and mathematical

expressions to interpret and estimate the probability of a β error, using the

joint confidence interval test for the slope and the intercept of a regression

line found by considering the errors in both axes. The immediate use in

measurement science is in the comparison of the results from two analytical

methods, but this can be extended for example, to comparisons between

analysts, laboratories, or between the chemical composition of various

samples.

We have found that if OLS or WLS regression methods are applied

on data with errors in both axes, estimates of the probability of committing

a β error, unlike those obtained if the BLS technique is used, are not correct.

Therefore, correctly detecting bias in the results from the analytical method

being tested might be very difficult when neglecting or partially



considering the uncertainties in both axes using OLS or WLS regression

methods. This might have serious negative effects for the results of future

analytical studies, as in these cases there might be a strong probability of

wrongly accepting biased analytical methods. Moreover, an important

advantage in method comparison studies is that, for the BLS regression

technique, the probabilities of β error when the axes are switched for given

$a_{H_1}$ and $b_{H_1}$ values are the same.

We have also seen how important it is to obtain accurate estimates

of the experimental error s2. This regression parameter directly affects the

standard deviations of both slope and intercept, which is extremely

important for obtaining accurate estimates of the probability of β error. For

this reason, tests to detect lack of fit in the three regression methods are

strongly recommended. Moreover, in data sets with few data pairs there is a higher probability that the experimental error $s^2$, and therefore the standard deviations $s_{b_{H_1}}$ and $s_{a_{H_1}}$, are overestimated.29 In these cases it may be preferable to consider $b_{H_1}$ values of less than one when defining the alternative hypothesis in order to reduce the overestimation of the probability of β error. This is because the standard deviations $s_{b_{H_1}}$ and $s_{a_{H_1}}$ directly depend on $b_{H_1}$, and so $b_{H_1}$ values of less than one can partially offset the overestimation of $s_{b_{H_1}}$ and $s_{a_{H_1}}$ due to a high experimental error $s^2$. On the other hand, when $b_{H_1}$ values below one are considered, the probability of β error may be underestimated depending on the magnitude of the bias set.

Defining the values of the biased regression coefficients $a_{H_1}$ and $b_{H_1}$ is one of the most difficult steps in estimating the probability of β error. For this reason, we suggest first setting one of the biased regression coefficients and then calculating the other one so that the set bias (defined by the point $(a_{H_1}, b_{H_1})$) is the most likely to happen. Also, estimates of the probability of β error were inaccurate in simulated data sets with a low number of data pairs (five in this case). This might be regarded as a limitation of the estimation process, but these results are in fact not due to our theoretical expressions, but to poor estimates of the experimental error $s^2$ when few experimental data are available.

ACKNOWLEDGMENTS

The authors would like to thank the DGICyT (project no. BP96-1008)

for financial support, and the Rovira i Virgili University for providing a

doctoral fellowship to A. Martínez.

APPENDIX A: Characterisation of the joint confidence distribution for

the slope and the intercept.

The first step towards finding the expression for the tri-dimensional

joint confidence distribution, that relates the different levels of significance

α (in the z axis) to the slope and the intercept (in the x and y axes), so that

the elliptical joint confidence regions from eq. 9 can be easily generated, is

to find the relationship between the level of significance α and the

parameter F1-α(2,n-2) in eq. 9:30

$$
1-\alpha = \mathrm{Prob}\left(F < F_{1-\alpha(2,n-2)}\right) = \int_{0}^{F_{1-\alpha(2,n-2)}} h_F(F)\, \mathrm{d}F \tag{A.1}
$$

where hF (F) is the statistical function that generates the F distribution,

which can be expressed as:31

$$
h_F(F) = \frac{\left(\frac{n-2}{2}\right)!}{\left(\frac{n-4}{2}\right)!} \cdot \frac{2}{n-2} \cdot \left(1 + \frac{2F}{n-2}\right)^{-\frac{n}{2}} \tag{A.2}
$$

This expression is obtained by considering 2 and n-2 degrees of freedom.

Introducing eq. A.2 to eq. A.1 and performing the stated integral:

$$
1-\alpha = \mathrm{Prob}\left(F < F_{1-\alpha(2,n-2)}\right) = 1 - \left(1 + \frac{2\,F_{1-\alpha(2,n-2)}}{n-2}\right)^{-\frac{n-2}{2}} \tag{A.3}
$$

and therefore:

$$
\alpha = \mathrm{Prob}\left(F > F_{1-\alpha(2,n-2)}\right) = \left(1 + \frac{2\,F_{1-\alpha(2,n-2)}}{n-2}\right)^{-\frac{n-2}{2}} \tag{A.4}
$$

This expression gives the probability of a given F value of being higher

than a set threshold value F1-α(2,n-2), and therefore of not belonging to the F

distribution with 2 and n-2 degrees of freedom, for a level of significance α.
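The step from eq. A.1 to eq. A.4 can be checked numerically. The sketch below assumes the simplified form of the density for 2 and n-2 degrees of freedom, $h_F(F) = (1 + 2F/(n-2))^{-n/2}$, which follows from eq. A.2 because the ratio of factorials equals (n-2)/2 and cancels the factor 2/(n-2). It inverts eq. A.4 to obtain the critical F value in closed form and integrates the density up to that value with a midpoint rule. Function names are hypothetical.

```python
def h_f(F, n):
    """Density of the F distribution with 2 and n-2 degrees of freedom."""
    return (1.0 + 2.0 * F / (n - 2)) ** (-n / 2.0)


def f_crit_2(n, alpha):
    """Critical value F_{1-alpha}(2, n-2) obtained by inverting eq. A.4."""
    return (n - 2) / 2.0 * (alpha ** (-2.0 / (n - 2)) - 1.0)


def cdf_numeric(f_star, n, steps=20000):
    """Midpoint-rule approximation of the integral in eq. A.1."""
    h = f_star / steps
    return sum(h_f((i + 0.5) * h, n) for i in range(steps)) * h


n, alpha = 12, 0.05
f_star = f_crit_2(n, alpha)      # about 4.10, the tabulated F(0.95; 2, 10)
coverage = cdf_numeric(f_star, n)  # should be close to 1 - alpha
```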

Isolating the term F1-α(2,n-2) from eq. 9 and substituting it into eq. A.4, we

find the expression for the joint confidence distribution for any level of

significance α:

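$$
\alpha = \left[ 1 + \frac{(a_{H_0}-a)^2 \sum_{i=1}^{n} \frac{1}{s^2_{e_i,H_0}} + 2\,(a_{H_0}-a)(b_{H_0}-b) \sum_{i=1}^{n} \frac{x_i}{s^2_{e_i,H_0}} + (b_{H_0}-b)^2 \sum_{i=1}^{n} \frac{x_i^2}{s^2_{e_i,H_0}}}{s^2\,(n-2)} \right]^{-\frac{n-2}{2}} \tag{A.5}
$$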

The α value (in the z axis) for which the bi-dimensional joint confidence

region is generated agrees with the volume of the distribution that falls

outside the projection of the elliptical region (i.e. risk of α error in the tri-

dimensional joint confidence distribution, Figure 2).



APPENDIX B: Defining the joint confidence interval for a given level of

significance α.

From eq. A.5 we can generate the equations that span the joint confidence region for a level of significance α, referred to as $b_1(a)$ and $b_2(a)$ in eqs. 11 and 12. To do so, one of the two variables must be isolated (b was chosen in this case). To provide clearer expressions, the three summation terms in eq. A.5 have been renamed as:

$$A_{H_0} = \sum_{i=1}^{n}\frac{1}{s_{e_i,H_0}^{2}}, \qquad B_{H_0} = \sum_{i=1}^{n}\frac{x_i}{s_{e_i,H_0}^{2}} \qquad \text{and} \qquad C_{H_0} = \sum_{i=1}^{n}\frac{x_i^{2}}{s_{e_i,H_0}^{2}}$$

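$$b_1(a) = b_{H_0} + \frac{1}{C_{H_0}}\left[B_{H_0}(a_{H_0}-a) + \sqrt{(a-a_{H_0})^2\left(B_{H_0}^{2}-A_{H_0}C_{H_0}\right) + C_{H_0}\,s_{H_0}^{2}\,(n-2)\left(\alpha^{-\frac{2}{n-2}}-1\right)}\,\right] \tag{B.1}$$

$$b_2(a) = b_{H_0} + \frac{1}{C_{H_0}}\left[B_{H_0}(a_{H_0}-a) - \sqrt{(a-a_{H_0})^2\left(B_{H_0}^{2}-A_{H_0}C_{H_0}\right) + C_{H_0}\,s_{H_0}^{2}\,(n-2)\left(\alpha^{-\frac{2}{n-2}}-1\right)}\,\right] \tag{B.2}$$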

These two functions span the upper and lower halves of the ellipse that defines the joint confidence region for a given level of significance α. They do not take real values for every intercept value a, but only for intercept values between $a_1$ and $a_2$ (eqs. 11 and 12). These are the only two a values for which $b_1(a)=b_2(a)$, and they therefore define the two points on the intercept axis where the two halves of the ellipse defining the joint confidence region meet. The expressions for their values are:


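$$a_1 = a_{H_0} + \sqrt{\frac{C_{H_0}\,s_{H_0}^{2}\,(n-2)\left(\alpha^{-\frac{2}{n-2}}-1\right)}{A_{H_0}C_{H_0}-B_{H_0}^{2}}} \tag{B.3}$$

$$a_2 = a_{H_0} - \sqrt{\frac{C_{H_0}\,s_{H_0}^{2}\,(n-2)\left(\alpha^{-\frac{2}{n-2}}-1\right)}{A_{H_0}C_{H_0}-B_{H_0}^{2}}} \tag{B.4}$$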
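The boundary of the joint confidence region can be traced directly from eqs. B.1 to B.4. The sketch below is only illustrative: the data values, the weights `s_e2`, the experimental error `s2` and the function names are hypothetical, and the right-hand side term K is the quantity $s^2(n-2)(\alpha^{-2/(n-2)}-1)$ obtained by solving eq. A.5 for the quadratic form.

```python
import math

def ellipse_constants(x, s_e2):
    """A, B and C: sums of 1/s_e^2, x/s_e^2 and x^2/s_e^2 over the data points."""
    A = sum(1.0 / w for w in s_e2)
    B = sum(xi / w for xi, w in zip(x, s_e2))
    C = sum(xi * xi / w for xi, w in zip(x, s_e2))
    return A, B, C

def k_term(alpha, n, s2):
    """s^2 (n-2) (alpha^(-2/(n-2)) - 1), from eq. A.5 solved for the quadratic form."""
    return s2 * (n - 2) * (alpha ** (-2.0 / (n - 2)) - 1.0)

def intercept_limits(a0, A, B, C, K):
    """a1 and a2 (eqs. B.3 and B.4): the ends of the ellipse on the intercept axis."""
    half_width = math.sqrt(C * K / (A * C - B * B))
    return a0 + half_width, a0 - half_width

def slope_limits(a, a0, b0, A, B, C, K):
    """b1(a) and b2(a) (eqs. B.1 and B.2); real only for a2 <= a <= a1."""
    disc = (a - a0) ** 2 * (B * B - A * C) + C * K
    root = math.sqrt(max(disc, 0.0))   # clip tiny negative round-off at the ends
    centre = b0 + B * (a0 - a) / C
    return centre + root / C, centre - root / C

# Hypothetical example: five points, null hypothesis a = 0 and b = 1.
x = [1.0, 2.0, 4.0, 7.0, 10.0]
s_e2 = [0.5, 0.6, 0.8, 1.1, 1.5]
A, B, C = ellipse_constants(x, s_e2)
K = k_term(0.05, len(x), s2=1.2)
a1, a2 = intercept_limits(0.0, A, B, C, K)
a_mid = 0.5 * (a1 + a2) + 0.15 * (a1 - a2)   # an intercept inside the region
b1, b2 = slope_limits(a_mid, 0.0, 1.0, A, B, C, K)
print(a1, a2, b1, b2)
```

Evaluating `slope_limits` on a grid of intercepts between `a2` and `a1` gives the upper and lower halves of the ellipse for the chosen α.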

REFERENCES

1.- D. L. Massart, B.M.G. Vandeginste, L.M.C. Buydens, S. de Jong, P. J.

Lewi and J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics:

Part A, Elsevier, Amsterdam (1997).

2.- J. Mandel and F. J. Linnig Anal. Chem. 29, 743-749 (1957).

3.- W.A. Fuller, Measurement Error Models, John Wiley & Sons, New York

(1987).

4.- M.A. Creasy, J. R. Statist. Soc. B, 18, 65-69 (1956).

5.- J. Mandel, J. Qual. Tech., 16, 1-14 (1984).

6.- R.L. Anderson, Practical Statistics for Analytical Chemists, Van Nostrand

Reinhold, New York (1987).

7.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx and D.L. Massart, Anal.

Chim. Acta, 338, 19-40 (1997).

8.- J. Riu and F. X. Rius, Anal. Chem., 68, 1851-1856 (1996).


John Wiley & Sons, New York, 145-150 (1993).

10.- O. Güell and J. A. Holcombe, Anal. Chem., 60, 529A-542A (1990).



11.- J. M. Lisý, A. Cholvadova and J. Kutej, Comput. Chem., 14, 189-192

(1990).

12.- J. Riu and F. X. Rius, J. Chemometrics, 9, 343-362 (1995).

13.- P. Sprent, Models in Regression and related topics, Methuen & Co. Ltd.,

London (1969).

14.- K. C. Lai and T. K. Mak, J. R. Statist. Soc. B, 41, 263-268 (1979).

15.- C. L. Cheng and J. W. Van Ness, J. R. Statist. Soc. B, 56, 167-183 (1994).

16.- D. W. Schafer and K. G. Puddy, Biometrika, 83, 813-824 (1996).

17.- D. V. Lindley, Suppl. J. R. Statist. Soc. B, 9, 218-244 (1947).

18.- G. A. F. Seber, Linear regression analysis, John Wiley & Sons, New York,

160-211, (1977).

19.- R. J. Carroll and D. Ruppert, Amer. Stat., 50, 1 (1996).

20.- C. L. Cheng, Institute of Statistical Science, Academia Sinica,

Taipei,Taiwan, Republic of China, personal communication.

21.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, Y. Vander Heyden, P. Vankeerberghen and D. L. Massart, Anal. Chem., 67, 4491-4499 (1995).

22.- M.R. Spiegel, Theory and Problems of Statistics, McGraw-Hill, New York

(1988).

23.- A. Martínez, J. del Río, J. Riu and F. X. Rius, Chemometrics Intell. Lab.

Sys., 49, 179-193 (1999).

24.- G. W. Snedecor and W. G. Cochran, Statistical Methods, Iowa State University Press, Ames, Iowa (1989).

25.- A. Martínez, J. Riu and F. X. Rius, (1999), in preparation.

26.- G. B. Thomas Jr., R. L. Finney, Calculus and Analytic Geometry, Addison-Wesley, Wilmington, Delaware, 943-950 (1987).

27.- P. De Vogt, J. Hinschberger, E. A. Maier, B. Griepink, H. Muntau and J.

Jacob, Fresenius J. Anal. Chem., 356, 41-48 (1996).

28.- I. Saouter, B. Blattmann, Anal. Chem., 66, 2031-2037 (1994).

29.- G.J. Hahn and W. Q. Meeker, Statistical Intervals, a guide for practitioners,

John Wiley & Sons, New York (1991).



30.- Cetama, Statistique appliquée à l’exploitation des mesures, Masson, Paris,

31 (Appendix) (1986).

31.- A. M. Mood, F. A. Graybill, Introduction to the Theory of Statistics, McGraw-Hill, New York (1963).

4.6 Conclusions


The conclusions drawn from the paper presented in section 4.3 are confirmed by the results of the Monte Carlo simulation procedure presented in section 4.2. By applying the joint test to the BLS regression coefficients estimated for each of the 100,000 global data sets (those containing the experimental results of all the analytes together), it has been shown that significant differences between the results of the two analytical methods being compared are detected correctly when the results for all the analytes are considered together. The Monte Carlo simulation has also shown that, when the joint test is applied to the BLS regression coefficients estimated from the individual data sets (which contain the experimental results of each analyte separately), significant differences between the results of the two methods are very difficult to detect. The reason is that estimating the BLS regression coefficients from a small number of experimental values overestimates the experimental error s², which, for a given level of significance α, produces oversized confidence intervals. In this case there is a high probability of committing a β error, that is, of accepting the null hypothesis (H0) when the alternative hypothesis (H1) is the correct one.

Furthermore, section 4.5 has shown that it is possible to estimate the probability of committing a β error when the joint test is applied to the BLS regression coefficients. It has also been verified that the mathematical expressions developed give correct estimates of the probability of committing a β error provided that a good estimate of the experimental error s² is available. Under these conditions, the OLS and WLS regression methods yield erroneous estimates of the probability of committing a β error compared with the estimate obtained with the BLS regression method, which takes into account the errors made when the samples are measured by the two methods being compared.

4.7 References

1.- Riu J., Rius F.X., Analytical Chemistry, 68 (1996) 1851-1857.

2.- Mandel J. and Linnig F.J., Journal of Quality and Technology, 16 (1984) 1-14.

3.- Spiegel M.R., Theory and Problems of Statistics, McGraw-Hill: New York, 1988.

5.- Meier P.C., Zund R.E., Statistical Methods in Analytical Chemistry, John Wiley & Sons: New York, 1993.

6.- Güell O., Holcombe J.A., Analytical Chemistry, 60 (1990) 529A-542A.



CHAPTER 5

Comparison of multiple methods by maximum likelihood principal component analysis considering the errors in all axes


Having dealt with different aspects of the comparison of two analytical methods by linear regression considering the errors in both axes, the fifth chapter presents the logical extension to the multivariate field: the comparison of the results obtained by multiple analytical methods, taking into account the uncertainties due to the errors made when measuring samples of various concentrations. This comparison can be applied, among others, to what are known in analytical chemistry as interlaboratory studies, which are used for different purposes and of which three types can be distinguished:¹

a) Material-certification studies. Specialised laboratories of recognised prestige and competence analyse a material with different methods in order to determine the concentration of one or more analytes with the lowest possible uncertainty. The purpose of these studies is therefore to provide reference materials.

b) Collaborative or method-performance studies. These are used to establish the characteristics of a specific method of analysis, which are usually related to precision.²,³ The International Organization for Standardization (ISO) has published a protocol for evaluating the precision and bias of an analytical method, but it considers only one concentration level.⁴

c) Proficiency studies (laboratory-performance studies). Several participating laboratories that wish to raise their level of quality analyse one or more analytes in a material. The participating laboratories use different analytical methods depending on availability. The results obtained by each laboratory are compared with one another so that the proficiency of each laboratory in analysing the analytes considered can be improved.⁵,⁶ In this case, therefore, the object of evaluation is the laboratory.

Although comparing the results of different analytical methods is useful in any of the three cases described above, it is most frequently used in proficiency studies, on which we will focus. As discussed in more detail in section 5.4, interlaboratory studies use statistical tests to establish different quality parameters, such as repeatability and reproducibility in collaborative studies or the existence of bias in proficiency studies of a set of laboratories. These statistical tests, however, have a number of drawbacks. The most notable are that they consider the results of samples with different concentration levels separately rather than jointly, and that they do not take into account the uncertainties due to the errors made by each of the laboratories being compared when measuring the samples.

The aim of this chapter is therefore to develop a technique for comparing the results of several analytical methods that overcomes the drawbacks of the conventional statistical tests and takes into account the uncertainties due to the measurement errors made by the different methods (or laboratories) being compared. This technique for comparing multiple methods is based on maximum likelihood principal component analysis (MLPCA), whose operation is detailed in section 5.2, and on the joint test for the intercept and the slope estimated by the BLS regression method. Section 5.3 presents the results of the validation of this technique. Its operation is detailed in section 5.4 as part of the paper Multiple analytical method comparison by using MLPCA and linear regression with errors in both axes, submitted for publication to the journal Analytica Chimica Acta. Finally, the conclusions of the chapter are given in section 5.5.

5.2 Maximum likelihood principal component analysis

MLPCA is a multivariate modelling method analogous to principal component analysis (PCA),⁷,⁸ but it takes into account the errors made in measuring the experimental values contained in the m×n matrix R in order to estimate the coefficients of the best possible p-dimensional model from a maximum likelihood point of view. The decomposition of the matrix R by MLPCA can be written as:

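$$\mathbf{R} = \mathbf{U}\mathbf{S}\mathbf{V}^{\mathrm{T}} + \mathbf{E} = \mathbf{T}\mathbf{V}^{\mathrm{T}} + \mathbf{E} \tag{5.1}$$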

where E is an m×n matrix of residuals. For a p-dimensional model, the matrices T and V have dimensions m×p and n×p, and $\mathbf{V}^{\mathrm{T}}$ is the transpose of V. These matrices contain the maximum likelihood estimates of the scores and of the loadings (eigenvectors), respectively. Note that the n columns of R correspond to the methods or laboratories being compared, whereas the m rows contain the results for the samples of different concentrations analysed by each method. The p-dimensional model estimated with MLPCA is the one most likely to have given rise to the observed experimental measurements.⁹

MLPCA uses an iterative least-squares method to estimate the parameters of the multivariate model (rank p, scores and loadings) by minimising the sum of squared weighted residuals. In this chapter we have assumed that the measurement errors of the different samples for each method are uncorrelated and normally distributed. Under this condition, the objective function minimised by MLPCA, which is the sum of squared weighted residuals, is defined as:⁹

$$S^2 = \sum_{i=1}^{m}\sum_{j=1}^{n}\frac{(r_{ij}-\hat{r}_{ij})^2}{\sigma_{ij}^{2}} \tag{5.2}$$

In this equation the variables $r_{ij}$ (elements of the matrix R) are the measurements of the sample with concentration i by method j. The variable $\hat{r}_{ij}$ is the maximum likelihood estimate of $r_{ij}$ given by the model, and $\sigma_{ij}$ is the true standard deviation of its measurement error. In practice the true values of $\sigma_{ij}$ cannot be obtained and their estimates $s_{ij}$ must be used. Like the BLS regression method, MLPCA implicitly assumes that the true values $\sigma_{ij}$ are known. Working with the estimates $s_{ij}$ obtained from the replicates made when the m samples are analysed by the n methods affects the quality of the estimates of the model coefficients.⁹ Nevertheless, a premise of this doctoral thesis has been that modelling methods such as MLPCA and regression methods such as BLS, which take into account the standard deviations of the errors in the experimental measurements, even if these are approximate, are better than those that do not.
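As a minimal illustration of the objective function in eq. 5.2, the sketch below evaluates the sum of squared weighted residuals with NumPy. The function name and the small matrices are hypothetical and serve only to show how each residual is scaled by its own standard deviation estimate $s_{ij}$.

```python
import numpy as np

def weighted_objective(R, R_hat, s):
    """Sum of squared residuals, each divided by its own variance s_ij^2 (eq. 5.2)."""
    R, R_hat, s = (np.asarray(a, dtype=float) for a in (R, R_hat, s))
    return float(np.sum(((R - R_hat) / s) ** 2))

# Hypothetical example: 3 samples (rows) measured by 2 methods (columns).
R = [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]
R_hat = [[10.5, 10.5], [19.5, 19.5], [30.5, 30.5]]
s = [[0.5, 0.5], [1.0, 1.0], [0.5, 1.0]]
S2 = weighted_objective(R, R_hat, s)
print(S2)  # 3.75
```

Residuals of the same size contribute less to the objective function when the measurement is less precise, which is what distinguishes this criterion from the unweighted one minimised by PCA.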

To make it easier to understand how the MLPCA algorithm estimates the multivariate model, a brief description of how it works is needed. First, however, it is useful to visualise the different matrices involved in the MLPCA process. Figure 1 shows the matrix of experimentally measured concentrations (R) and the matrix of the corresponding variances (Q):


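$$\mathbf{R} = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{pmatrix} \qquad \mathbf{Q} = \begin{pmatrix} \sigma_{11}^{2} & \sigma_{12}^{2} & \cdots & \sigma_{1n}^{2} \\ \sigma_{21}^{2} & \sigma_{22}^{2} & \cdots & \sigma_{2n}^{2} \\ \vdots & \vdots & & \vdots \\ \sigma_{m1}^{2} & \sigma_{m2}^{2} & \cdots & \sigma_{mn}^{2} \end{pmatrix}$$

Figure 1. Concentration matrix R and matrix of true variances Q.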

During modelling, the MLPCA algorithm uses the covariance matrices both in the space defined by the rows ($\boldsymbol{\Sigma}_i$) and in the space defined by the columns ($\boldsymbol{\Psi}_j$). Since the errors in the experimental measurements have been assumed to be independent, the covariance matrices of each row and each column are diagonal. Figure 2 shows these matrices for the first row and the first column of the matrix Q:

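$$\boldsymbol{\Sigma}_1 = \begin{pmatrix} \sigma_{11}^{2} & & & 0 \\ & \sigma_{12}^{2} & & \\ & & \ddots & \\ 0 & & & \sigma_{1n}^{2} \end{pmatrix} \qquad \boldsymbol{\Psi}_1 = \begin{pmatrix} \sigma_{11}^{2} & & & 0 \\ & \sigma_{21}^{2} & & \\ & & \ddots & \\ 0 & & & \sigma_{m1}^{2} \end{pmatrix}$$

Figure 2. Covariance matrices for the first row and the first column of the matrix Q.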

In this case the model estimated by maximum likelihood must be equivalent in the two spaces, since the objective function to be minimised (eq. 5.2) is the same in both cases:⁹

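$$S^2 = \sum_{i=1}^{m}\Delta\mathbf{r}_i\,\boldsymbol{\Sigma}_i^{-1}\,\Delta\mathbf{r}_i^{\mathrm{T}} = \sum_{j=1}^{n}\Delta\mathbf{r}_j^{\mathrm{T}}\,\boldsymbol{\Psi}_j^{-1}\,\Delta\mathbf{r}_j \tag{5.3}$$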


In this equation $\Delta\mathbf{r}_j$ and $\Delta\mathbf{r}_i$ are column and row vectors of the matrix $\Delta\mathbf{R}$, which is the difference between the matrix of measured concentrations R and the matrix of maximum likelihood estimates $\hat{\mathbf{R}}$. The matrix $\hat{\mathbf{R}}$ will thus be the same in the space defined by the rows and in the space defined by the columns. For this reason, an iterative calculation procedure has been developed that transposes the matrix $\hat{\mathbf{R}}$, so that the maximum likelihood estimates obtained in the space generated by the rows are used to recalculate them in the space defined by the columns, and vice versa.⁹ The algorithm starts by decomposing the initial m×n concentration matrix R by singular value decomposition (SVD). Unlike PCA, MLPCA requires the rank of the model to be specified at the outset, because in this case the rank-p model cannot be obtained from a model of higher rank (the models are not nested):

1.- Decomposition of the initial matrix R.

$$[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\mathbf{R}, p) \tag{5.4}$$

The matrices U, S and V have dimensions m×p, p×p and n×p, respectively. The second step consists of transposing the matrix R and calculating its maximum likelihood estimates in the space defined by the rows. To do this, MLPCA uses projections that are not orthogonal to the subspace defined by the scores, but are weighted by the uncertainties in the measured concentrations:

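2.- Estimation of the matrix R in the space defined by the rows.

$$\mathbf{R} \Rightarrow \mathbf{R}^{\mathrm{T}}$$

$$\hat{\mathbf{r}}_i = \mathbf{V}\left(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_i^{-1}\mathbf{V}\right)^{-1}\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_i^{-1}\mathbf{r}_i \tag{5.5}$$

In this equation $\mathbf{r}_i$ is an n×1 column vector of the new matrix $\mathbf{R}^{\mathrm{T}}$ (a row vector of the original matrix R). This gives an estimate of the transposed initial matrix R, with dimensions n×m, that is, $\hat{\mathbf{R}}^{\mathrm{T}}$, and the objective function can therefore be calculated according to equation 5.3:

$$S_1^2 = \sum_{i=1}^{m}(\mathbf{r}_i-\hat{\mathbf{r}}_i)^{\mathrm{T}}\,\boldsymbol{\Sigma}_i^{-1}\,(\mathbf{r}_i-\hat{\mathbf{r}}_i) \tag{5.6}$$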

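3.- Estimation of the matrix R in the space defined by the columns.

In the third step, the matrix $\hat{\mathbf{R}}^{\mathrm{T}}$ estimated in the second step is decomposed by singular value decomposition:

$$[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\hat{\mathbf{R}}^{\mathrm{T}}, p) \tag{5.7}$$

In this case the matrices U, S and V have dimensions n×p, p×p and m×p, respectively. The second step is repeated, but now the m×n matrix $\hat{\mathbf{R}}$ is estimated in the space defined by the columns:

$$\mathbf{R}^{\mathrm{T}} \Rightarrow \mathbf{R}$$

$$\hat{\mathbf{r}}_j = \mathbf{V}\left(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Psi}_j^{-1}\mathbf{V}\right)^{-1}\mathbf{V}^{\mathrm{T}}\boldsymbol{\Psi}_j^{-1}\mathbf{r}_j \tag{5.8}$$

In equation 5.8, $\mathbf{r}_j$ is an m×1 column vector of the matrix R (that is, a row vector of the transposed matrix $\mathbf{R}^{\mathrm{T}}$ used in the second step). The value of the objective function is now calculated from the expression:

$$S_2^2 = \sum_{j=1}^{n}(\mathbf{r}_j-\hat{\mathbf{r}}_j)^{\mathrm{T}}\,\boldsymbol{\Psi}_j^{-1}\,(\mathbf{r}_j-\hat{\mathbf{r}}_j) \tag{5.9}$$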

4.- Calculation of the convergence parameter.

In the fourth step, the matrix $\hat{\mathbf{R}}$ estimated in the previous step is again decomposed by singular value decomposition:

$$[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\hat{\mathbf{R}}, p) \tag{5.10}$$

Since a maximum likelihood estimate of the matrix R requires that $S_1^2 = S_2^2$ (eq. 5.3), the difference between the last two successive values is checked:

$$\varphi = \frac{S_1^2 - S_2^2}{S_2^2} \tag{5.11}$$

If the value of φ is below the convergence limit set (in this case $10^{-10}$), the procedure ends. Otherwise it returns to the second step.
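The four steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under the assumptions stated in this section (uncorrelated errors, so that every row and column covariance matrix is diagonal); the function names, the simulated data and the use of the absolute value of φ in the convergence check are assumptions made here and are not taken from the original algorithm listing.

```python
import numpy as np

def ml_project(X, V, var):
    """Weighted projection of each row of X onto span(V) (eqs. 5.5 and 5.8).

    var[i] holds the error variances of row i, i.e. the diagonal of its
    covariance matrix. Returns the estimates and the objective function.
    """
    X_hat = np.empty_like(X, dtype=float)
    S2 = 0.0
    for i in range(X.shape[0]):
        w = 1.0 / var[i]                 # diagonal of the inverse covariance
        VtW = V.T * w                    # V^T Sigma^-1 without building Sigma
        X_hat[i] = V @ np.linalg.solve(VtW @ V, VtW @ X[i])
        S2 += float(np.sum(w * (X[i] - X_hat[i]) ** 2))
    return X_hat, S2

def loadings(X, p):
    """First p right singular vectors of X, i.e. the V of svd(X, p)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:p].T

def mlpca(R, Q, p, tol=1e-10, max_iter=1000):
    """Rank-p maximum likelihood estimate of R given the variance matrix Q."""
    R = np.asarray(R, dtype=float)
    Q = np.asarray(Q, dtype=float)
    V = loadings(R, p)                               # step 1 (eq. 5.4)
    for _ in range(max_iter):
        R_hat, S1 = ml_project(R, V, Q)              # step 2 (eqs. 5.5, 5.6)
        V = loadings(R_hat.T, p)                     # step 3 (eq. 5.7)
        Rt_hat, S2 = ml_project(R.T, V, Q.T)         # eqs. 5.8 and 5.9
        R_hat = Rt_hat.T
        V = loadings(R_hat, p)                       # step 4 (eq. 5.10)
        if S2 == 0.0 or abs(S1 - S2) / S2 < tol:     # eq. 5.11
            break
    return R_hat, S2

# Hypothetical data: 15 samples, 4 methods, 5% relative standard deviation.
rng = np.random.default_rng(0)
true = np.outer(np.linspace(5.0, 100.0, 15), np.ones(4))
Q = (0.05 * true) ** 2
R = true + rng.normal(size=true.shape) * np.sqrt(Q)
R_hat, S2 = mlpca(R, Q, p=1)
print(S2)
```

Because each weighted projection can only lower the objective function, the value returned should never exceed the weighted residual sum of the ordinary rank-p PCA reconstruction of the same data.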

Note that, when analytical methods are compared, the original matrix R is used without centring. This is because in this case the concentration range is the same for all the methods being compared, so the differences between the results of the different methods are due exclusively to (random or systematic) errors in the experimental measurements of the samples. With spectroscopic data, centring is frequently used because it removes the variations in absorbance that are not due to the different composition of the samples but to the variation in the analyte's capacity to absorb radiation at different wavelengths.

5.3 Validation of the procedure for comparing multiple methods

As described in section 5.4, the procedure developed for comparing multiple analytical methods is based on applying the joint confidence test to the coefficients of the BLS regression line. This regression line relates the m concentrations obtained when the samples are analysed by each analytical method to the m concentrations from the remaining analytical methods. The latter m concentrations are generated by applying MLPCA to the concentrations obtained when the samples are analysed by all the remaining analytical methods (see Figure 1 in section 5.4). If, when the concentrations of method j are compared with those of the remaining methods, the point defined by the coefficients of the BLS regression line falls within the joint confidence region generated for a level of significance α, it can be concluded that there are no significant differences between the concentrations measured by analytical method j and the concentrations obtained by the remaining methods.

The BLS regression method (see section 1.4.1.2 of the introduction) is based on an iterative least-squares procedure which, unlike methods based on the maximum likelihood principle, does not have a rigorous mathematical foundation. Therefore, to demonstrate that the new procedure correctly detects significant differences between the results of the methods being compared, the whole procedure for comparing the results of multiple methods has to be validated. This validation was carried out with four initial data sets. These data sets simulate the results of ten analytical methods or laboratories that analyse three analytes at five concentration levels. The fifteen resulting points for each laboratory are randomly distributed over a concentration range from 0 to 100 units. Since the number of methods being compared makes it impossible to plot the initial data sets, they are described in detail below.

Each of the four initial data sets simulates a possible situation regarding the presence of bias in the results of some of the methods. In one of the initial data sets the results of the ten methods are identical, which simulates a case in which all the methods are comparable. In the other three initial data sets, the number of methods with biased results was set at one, three and five, respectively. In these cases the results of the methods considered to be biased were 10% higher than those of the unbiased methods. In addition, three types of uncertainty were considered for each of the four initial data sets. Homoscedastic sets are made up of data with constant standard deviations, whatever the value of the concentration. Heteroscedastic data sets are divided into two classes: those in which the standard deviations are 10% of the individual concentration values, and those in which the standard deviations vary randomly between 6% and 12% of each individual concentration value.
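The construction of the initial data sets can be sketched as follows. This is only an illustration of the design described above: the function names, the random seed and the constant standard deviation of 1 unit used for the homoscedastic case are assumptions, since the text does not give that value.

```python
import numpy as np

def initial_data_set(biased_methods, n_methods=10, n_points=15, bias=1.10, seed=0):
    """Error-free results of each method; biased methods read 10% high."""
    rng = np.random.default_rng(seed)
    conc = rng.uniform(0.0, 100.0, size=n_points)   # 3 analytes x 5 levels
    R = np.tile(conc[:, None], (1, n_methods))      # identical columns
    R[:, list(biased_methods)] *= bias
    return R

def standard_deviations(R, kind, rng, constant_sd=1.0):
    """Standard deviations for the three types of uncertainty considered."""
    if kind == "homoscedastic":
        return np.full_like(R, constant_sd)         # assumed value, not from the text
    if kind == "constant_heteroscedastic":
        return 0.10 * R
    if kind == "random_heteroscedastic":
        return rng.uniform(0.06, 0.12, size=R.shape) * R
    raise ValueError(f"unknown uncertainty type: {kind}")

# Hypothetical use: methods A, E and J (columns 0, 4 and 9) are biased.
rng = np.random.default_rng(1)
R0 = initial_data_set([0, 4, 9])
sd = standard_deviations(R0, "random_heteroscedastic", rng)
print(R0.shape, sd.shape)
```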

From each of these four initial data sets, 100,000 simulated data sets were generated by the Monte Carlo method.¹⁰,¹¹ As explained in previous chapters, this simulation method generates new data sets by adding a random error to each of the individual values of an initial set. The magnitude of this random error depends directly on the uncertainty associated with each initial value. Thus, when the procedure for comparing multiple methods is applied at a level of significance α to data sets that simulate unbiased methods, significant differences should be detected in only α% of the simulated sets. If, however, the method being compared gives biased results with respect to the other methods, significant differences should be detected in the results of this method in a higher percentage of cases than that observed for the unbiased methods.
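The Monte Carlo step and the resulting detection rate can be sketched as follows. The functions are hypothetical, and `toy_test` is a deliberately simple stand-in (a z test with known variances against the error-free values), not the MLPCA and BLS joint test used in this chapter; it is included only to show that an unbiased method should be flagged in roughly α% of the simulated sets.

```python
import numpy as np

def simulate_data_sets(R0, sd, n_sets, seed=0):
    """Yield data sets R0 + e, with e ~ N(0, sd^2) drawn independently per value."""
    rng = np.random.default_rng(seed)
    for _ in range(n_sets):
        yield R0 + rng.standard_normal(R0.shape) * sd

def detection_rate(R0, sd, is_significant, n_sets, seed=0):
    """Percentage of simulated sets in which the supplied test flags a difference."""
    hits = sum(bool(is_significant(R, sd))
               for R in simulate_data_sets(R0, sd, n_sets, seed))
    return 100.0 * hits / n_sets

# Hypothetical unbiased initial set: 15 samples, 10 methods, 10% relative sd.
conc = np.linspace(10.0, 100.0, 15)
R0 = np.tile(conc[:, None], (1, 10))
sd = 0.10 * R0

def toy_test(R, sd, column=0):
    """Stand-in test at alpha = 5%: standardised mean deviation of one method."""
    z = np.sum((R[:, column] - R0[:, column]) / sd[:, column]) / np.sqrt(R.shape[0])
    return abs(z) > 1.959964

rate = detection_rate(R0, sd, toy_test, n_sets=2000)
print(rate)  # expected to be close to 5
```

Replacing `toy_test` with the actual comparison procedure, and `R0` with a set containing biased methods, would give the percentages of the kind plotted in Figure 3.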

Figure 3 shows the results of applying the procedure for comparing the results of multiple analytical methods to the 100,000 data sets simulated by the Monte Carlo method from each of the four initial data sets, for different levels of significance α.

[Figure 3: four panels, a) to d). x axis: analytical method (A to J); y axis: % of simulated results found to be biased.]

Figure 3. Results of the validation of the procedure for comparing the results of multiple analytical methods using simulated data sets.


Figure 3a shows the results when no method is biased. Figures 3b, 3c and 3d show the results obtained when 1 (method A), 3 (methods A, E and J) and 5 (methods A, C, E, G and I) biased methods were considered, respectively. The three types of line represent the results obtained for the different types of uncertainty: homoscedasticity (solid line), constant heteroscedasticity (dashed line) and random heteroscedasticity (dotted line). For each of the four initial data sets and each of the three types of uncertainty, levels of significance α of 1%, 5% and 10% were considered.

As Figure 3a shows, when all the methods being compared are simulated to give comparable results, the percentage of cases in which significant differences are detected for each of the ten methods with respect to the other nine is approximately α% in each case, regardless of the type of uncertainty considered. In other words, the percentage of cases in which significant differences are wrongly detected in the results of each method with respect to the others is similar to the level of significance α, which sets the probability of wrongly detecting a biased method. It can therefore be concluded that the procedure for comparing multiple analytical methods does not detect bias in the results of the different methods when there is none, beyond the rate allowed by the chosen level of significance.

When one of the methods is simulated to be biased with respect to the rest, the validation results in Figure 3b show that the procedure for comparing multiple methods correctly detects the method (method A in this case) that gives biased results with respect to the other methods being compared. It should be noted that the percentage of cases in which significant differences are detected in the results of the unbiased methods (B to J) is slightly higher than the level of significance set in each case. This is because, when the results of one of the unbiased methods are compared with the rest, the nine remaining methods include method A, which gives biased results. This increases the differences between the results of the unbiased method and those of the other nine.

Similarly, when three methods are simulated to give biased results (methods A, E and J), Figure 3c shows that these three methods can be identified with the procedure for comparing the results of multiple analytical methods. When the results of the unbiased methods (methods B, C, D, F, G, H and I) are compared with the rest, there is a higher percentage of cases in which significant differences are detected. As in the previous case, this is because the nine methods with which each unbiased method is compared include three methods that give results different from the rest, which increases the differences between the unbiased methods and the other nine.

Finally, Figure 3d shows the validation results when five methods are simulated to give biased results and the other five give correct results. As this figure shows, the procedure for comparing multiple methods detects significant differences between the results of each method and the other nine in a very similar percentage of cases. This is because in this case the differences between the results of the biased and unbiased analytical methods are very similar, so no method is especially different from the other nine: the procedure cannot reliably detect which methods are biased when approximately 50% of the methods give results that differ from the rest.



5.4 Multiple analytical method comparison by using MLPCA

and linear regression with errors in both axes (Analytica

Chimica Acta, sent for publication)





ABSTRACT

This paper discusses a new stepwise approach for comparing the results from several analytical methods which analyse a set of analytes at different concentration levels, taking into account all the individual uncertainties produced by measurement errors. The approach starts by detecting the methods that provide outlying concentration results. The concentration results from each of the remaining analytical methods are then compared, by linear regression, with those from the other methods taken together. To do this, the concentration results from the methods considered together, and their individual uncertainties, are decomposed at each step to obtain a vector of concentrations. This is achieved by maximum likelihood principal component analysis (MLPCA), which takes into account the measurement errors in the concentration results. The bivariate least squares (BLS) regression method is then used to regress the concentration results from the method being tested at a given step on the scores generated by the MLPCA decomposition (which carry the information from the other remaining methods), considering the uncertainties in both axes. To detect significant differences between the results from the method being tested at a given step and the results from the other methods (MLPCA scores), the joint confidence interval test is applied to the BLS regression line coefficients for a given level of significance α. We have used four real data sets to provide application examples that show the suitability of the approach.

INTRODUCTION

Interlaboratory studies are used in analytical chemistry for a variety

of purposes. These may be proficiency tests (for comparing the

performance of several laboratories), collaborative studies (for validating a

standard method) or certification trials (to establish the true analyte

concentration in a reference material). Data from these interlaboratory

studies is usually statistically analysed to characterise different figures of

merit such as repeatability and reproducibility in collaborative studies, or

the existence of systematic bias (i.e. laboratory performance) in proficiency

tests. To date, systematic bias has been tested with different kinds of well-

known statistical tests such as ranking [1] and z-score methods [2]. Both of

these methods use scores to evaluate laboratory performance, but are

calculated in different ways. The ranking method uses a ranking of the

results from the different laboratories to calculate the scores. In this way,

the laboratory score is the sum of the ranks of the different samples. The z-

score method, on the other hand, uses the expression sxxz /)ˆ( −= , where x

is the result or mean of results obtained for a given concentration sample

by a laboratory, x is the best possible estimate of the true concentration,

and s the standard deviation of all the laboratories after outliers have been

eliminated. With good x and s estimates, the z-scores can be assumed to

follow a normal distribution. In both cases bias is detected for a given

laboratory if its score is beyond the lower or upper limits defined according

to the corresponding scores distribution in each case, for a given level of

significance α.
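As a minimal sketch of the z-score calculation described above, the following Python fragment evaluates $z = (x - \hat{x})/s$ for a set of laboratory results. The numerical values, the function name `z_scores` and the limit of 2 used for flagging are hypothetical and only serve as an illustration.

```python
import numpy as np

def z_scores(results, x_hat, s):
    """z = (x - x_hat) / s for each laboratory result x."""
    results = np.asarray(results, dtype=float)
    return (results - x_hat) / s

# Hypothetical results reported by five laboratories for one sample.
lab_results = [10.2, 9.8, 10.0, 11.5, 9.9]
z = z_scores(lab_results, x_hat=10.0, s=0.5)

# Laboratories whose |z| exceeds a chosen limit (2 is only an illustrative choice).
flagged = [i + 1 for i, zi in enumerate(z) if abs(zi) > 2.0]
```

Here only the fourth laboratory is flagged, since its result lies three standard deviations above the assumed best estimate.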



With these two methods we can detect the presence of significant

bias in the results from more than two laboratories at each concentration

level independently. In other words, they do not consider samples

containing different concentration levels simultaneously when checking the

presence of significant bias. However, it might be more suitable to consider

all the concentration samples simultaneously, since the main objective of

interlaboratory studies is to establish an overall performance index for each

laboratory. Moreover some statistical aspects discourage the use of the z-

score method; for instance when more than one test material is analysed. In

these cases the z-score method uses different kinds of combination scores,

such as RSZ or SSZ [3]. These combination scores are not generally

recommended for evaluating the performance of the laboratories when

determining one or more analytes in samples with different chemical

matrices [4]. This is because in these cases a statistically heterogeneous

population of z-scores might be obtained, which makes the assumption of

normality, on which the z-score method is based, no longer true.

For these reasons, this paper discusses a new approach which, unlike the existing ones, identifies methods providing outlying concentration results and then detects significant bias in the concentrations from at least one of the remaining methods under test. Moreover, this approach

can cope with heteroscedastic uncertainties from measurement errors (i.e. it

considers the different degrees of precision from the different methods in

comparison) generated by the replicate analysis of one or more analytes in

several concentration samples with similar or different chemical matrices.

In this way, the comparison is made by taking into account not only the

concentration values provided for each method, but also their uncertainties.

Once the outlying methods have been removed, the concentrations from

every method are compared to the scores from applying MLPCA on the

rest of methods, using linear regression and the joint confidence interval for

the intercept and the slope.



The approach presented here is general in nature, that is, it can be

applied to any experimental problem in which the concentration results

from analysing one or more analytes in different concentration samples by

several laboratories, analytical methodologies, techniques, analysts or

instruments are to be compared. To show the suitability of the new

approach, we used four real data sets and drew conclusions about the

validity of the laboratories in comparison according to the concentration

results from the different laboratories in each data set.


Notation

In this paper we have used bold uppercase characters to denote

matrices, bold lowercase characters to denote vectors and italic lowercase

characters to denote scalars. The true values of the different variables are

represented with Greek characters, while their estimates are denoted with

Latin ones. The variables used during the comparison process outlined in

Figure 1 are described as follows:

Detection of outlying methods. The concentration results from the

replicate analysis of m concentration samples by the n laboratories to be

compared are in a m×n matrix R. From the application of MLPCA on

matrix R the loadings of the first principal component are obtained in an

n×1 vector p. After using the Grubbs’ test on the elements of vector p, the

number of laboratories with outlying results is represented by variable l.


[Figure 1: flowchart in three stages. Detection of outliers: MLPCA (first PC) on the m×n matrices R and var(R) of the initial methods gives the n×1 loading vector p; the single/paired Grubbs' test is applied, results from suspicious methods are removed from R and var(R), and l is increased until no outlier is detected. Estimation of scores with MLPCA: for the k = n - l remaining methods, the jth column is extracted from the m×k matrices R and var(R) to obtain vectors rj and var(rj), and MLPCA (first PC) on the remaining m×(k-1) matrices gives the m×1 vectors t and var(t). BLS regression and joint confidence interval test: rj is regressed on t and the point (b0, b1) is tested against (0, 1) at α = 5%; j is increased until j = k.]

Figure 1. Scheme of the overall process for comparing the concentration results from multiple laboratories. See the Notation section for a description of the variables.

Estimation of scores with MLPCA. After the $l$ laboratories with outlying results have been eliminated, $k$ laboratories are left to be compared ($k = n - l$). $k$ steps are therefore necessary to compare the concentration results from each one of the laboratories with the others. In the $j$th step ($1 \le j \le k$) the results from the $j$th laboratory (column vector $\mathbf{r}_j$, extracted from the m×k matrix R) are compared to those from the other k-1 laboratories, which form the new m×(k-1) matrix R. The application of MLPCA on matrix R produces a (k-1)×1 vector of loadings p and an m×1 vector of scores t for the first principal component. The individual variances of the concentration results from the replicate analysis of the $i$th ($1 \le i \le m$) concentration sample by the k-1 methods are in the diagonal (k-1)×(k-1) matrix $\boldsymbol{\Sigma}_i$ (uncorrelated measurement errors are considered). Projecting each one of these m diagonal covariance matrices onto the scores subspace yields the scalar $s_{t_i}^2$. This is the estimate of the true variance ($\sigma_{t_i}^2$) of the $i$th score in t. Therefore the m×1 vector of variances var(t) comprises the m $s_{t_i}^2$ values.

BLS regression method and joint confidence interval test. In the $j$th step the BLS technique is used to regress the m×1 vector of concentrations $\mathbf{r}_j$ on the m×1 vector of scores t, considering the uncertainties associated with the elements of both vectors. Estimates of the variance for the elements of $\mathbf{r}_j$ from the replicate analysis of the $i$th concentration sample are denoted as $s_{r_{ij}}^2$, while their true values are $\sigma_{r_{ij}}^2$. The true values of the BLS regression coefficients are β0 (intercept) and β1 (slope), while their respective estimates are b0 and b1. The estimates of the standard deviation of the intercept and the slope for the BLS regression line are $s_{b_0}$ and $s_{b_1}$ respectively. The true experimental error (residual mean square error), expressed in terms of variance for the m data pairs ($t_i$, $r_{ij}$), is $\sigma^2$ and its estimate is $s^2$. The predicted values of the results from the $j$th method being tested from the BLS regression line are $\hat{r}_{ij}$.



Maximum likelihood principal component analysis (MLPCA)

Since the decomposition of matrix R using MLPCA is essential in

this comparison approach, we believe that it may be useful to note some important points. In this multiple comparison approach, and without any loss of generality, methods will be treated as the sensors in spectroscopic data. MLPCA estimates the multivariate model taking into account the uncertainties of each concentration result due to measurement errors, so that non-orthogonal projections of the original data into the scores subspace [5] are obtained. In this case MLPCA projects the

concentrations and standard deviations onto a one-dimensional space

defined by the first principal component (PC) taking into account the

uncertainties in all the individual concentrations. As there is a linear

relationship between the true concentrations from all the methods, most of

the variance in the scores is explained by the first PC, even when

concentrations are affected by measurement errors. We have seen that the

minimum percentage of variance explained by the first PC in any of the

data sets studied (not only those in the experimental section) was never less

than 96%.

Data pairs with lower individual uncertainties (supposedly those

with lower measurement errors) are those from which MLPCA extracts a

greater amount of information to estimate the multivariate model

parameters (i.e. scores and loadings). In this way, even when data from any

of the methods in test is missing, the loadings of the first PC from MLPCA

can still retain most of the original chemical information [6]. To do so, high

standard deviations are associated to the estimates of the missing

concentration values. Therefore these are not taken into account by MLPCA

to estimate the loadings of the first PC.
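The down-weighting of imprecise or missing results can be illustrated with the maximum likelihood projection of a single sample onto one loading vector, $t_i = (\mathbf{p}^{\mathrm{T}} \boldsymbol{\Sigma}_i^{-1} \mathbf{x}_i)/(\mathbf{p}^{\mathrm{T}} \boldsymbol{\Sigma}_i^{-1} \mathbf{p})$. The sketch below assumes a diagonal $\boldsymbol{\Sigma}_i$ and a known loading vector; the data, the placeholder value for the missing result and the function name `ml_score` are hypothetical, and the fragment is not the full MLPCA algorithm.

```python
import numpy as np

def ml_score(x, var_x, p):
    """Maximum likelihood score of one sample on a single loading vector p,
    assuming uncorrelated errors with variances var_x."""
    x, var_x, p = (np.asarray(a, dtype=float) for a in (x, var_x, p))
    w = p / var_x                      # p^T Sigma^-1 for a diagonal Sigma
    return float(w @ x / (w @ p))

p = np.ones(4) / 2.0                   # ideal loadings sqrt(n)/n for n = 4
x = np.array([10.0, 10.4, 9.6, 0.0])   # last value is a placeholder for a missing result
var_ok = np.array([0.04, 0.04, 0.04, 0.04])
var_missing = np.array([0.04, 0.04, 0.04, 1e6])

t_naive = ml_score(x, var_ok, p)              # placeholder treated as a real result
t_downweighted = ml_score(x, var_missing, p)  # placeholder has almost no weight
```

With the large variance attached to the placeholder, the score is determined almost entirely by the three results that were actually measured.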



Detection of outlying methods

To identify those analytical methods that provide outlying

concentration results, so that they do not produce wrong conclusions about

the existence of significant bias in the concentration results from the other

methods in test, it is first necessary to decompose the concentrations in the

initial m×n matrix R with MLPCA. This produces a n×1 loading vector p

with information about the performance of the methods in test (Figure 1).

These loadings are distributed around a theoretical value of $\sqrt{n}/n$, since this would be the value of all the loadings if the true concentrations were considered. This is shown in Figure 2 where, in a three-dimensional case, the loading values are equal to $\sqrt{3}/3$ when all three methods provide concentration results that are identical to the true values. In this way, if systematic bias in the concentration results from a given method is large enough, the corresponding loading value will differ from the rest. We have checked with normal probability plots (results available on request from the authors) that the distribution followed by the loadings in p is normal. Because the single (or paired) Grubbs' test is based on the assumption of normality [7], it can be used to detect, for a level of significance α of 2.5% (2-tails) [8], the loadings that can be considered as outliers and, therefore, the methods that should be removed from the initial m×n matrix R.
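The behaviour of the loadings can be sketched numerically. The fragment below uses an ordinary (unweighted, uncentred) singular value decomposition as a stand-in for MLPCA, which is an assumption made only to keep the example short, and computes the single-outlier Grubbs statistic $G = \max|p_i - \bar{p}|/s_p$. The concentrations, the 50% bias and the function names are hypothetical; the statistic would still have to be compared with a tabulated critical value.

```python
import numpy as np

def first_pc_loadings(R):
    """Loadings of the first PC from an ordinary (unweighted, uncentred) SVD.
    This is a plain PCA stand-in, not MLPCA."""
    _, _, vt = np.linalg.svd(np.asarray(R, dtype=float), full_matrices=False)
    p = vt[0]
    return p if p.sum() >= 0 else -p   # fix the arbitrary sign

def grubbs_statistic(values):
    """Single-outlier Grubbs statistic G = max|v - mean| / sd and its index."""
    v = np.asarray(values, dtype=float)
    dev = np.abs(v - v.mean())
    idx = int(dev.argmax())
    return dev[idx] / v.std(ddof=1), idx

conc = np.array([1.0, 5.0, 20.0, 50.0, 100.0])
R_ideal = np.column_stack([conc] * 4)       # four methods, identical results
p_ideal = first_pc_loadings(R_ideal)        # every loading equals sqrt(4)/4 = 0.5

R_biased = R_ideal.copy()
R_biased[:, 0] *= 1.5                       # hypothetical method reading 50 % high
G, suspect = grubbs_statistic(first_pc_loadings(R_biased))
```

In the biased case the loading of the first method moves away from the other three, and it is this loading that the Grubbs statistic singles out.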


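[Figure 2: three-dimensional plot with axes Method 1, Method 2 and Method 3, in which PC 1 forms the angles α, β and γ with the three axes and $\cos(\alpha) = \cos(\beta) = \cos(\gamma) = 1/\sqrt{3} = \sqrt{3}/3$.]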

Figure 2. Loading values (cos(α), cos(β) and cos(γ)) in the hypothetical case that the three laboratories provide identical results to the true concentrations.

Estimation of scores with MLPCA

After the l outlying methods have been eliminated, the concentration results from the k remaining methods are compared (Figure 1). To do this, k steps are carried out so that in each one the concentrations from one method (in the m×1 vector $\mathbf{r}_j$) are compared using linear regression to the m×1 vector of scores t containing information about the concentrations of the k-1 remaining methods. These scores are generated from the MLPCA decomposition of the concentrations from the methods that comprise the m×(k-1) matrix R in each step, taking into account the uncertainties of the concentrations in matrix R. To estimate the uncertainties of vector t, the uncertainties of the concentrations in matrix R (usually obtained by the replicate analysis of the m samples) are projected on the scores subspace. By error propagation this can be expressed as [6]:

$s_{t_i}^2 = \left( \mathbf{p}^{\mathrm{T}} \boldsymbol{\Sigma}_i^{-1} \mathbf{p} \right)^{-1}$ (1)

Estimates of the variances from eq. 1 are approximate because this

expression does not consider the uncertainty inherent to the principal

components from MLPCA. However, it gives an indication of the precision

of the replicate measurements [6], which the BLS regression method takes

into account to find the regression coefficients of the regression line in each

of the k steps.
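A short numerical sketch of eq. 1 for a diagonal $\boldsymbol{\Sigma}_i$ is given below. The loadings, the variances and the function name `score_variance` are hypothetical; the second sample mimics a missing result by attaching a very large variance to it.

```python
import numpy as np

def score_variance(p, var_row):
    """s_t^2 = (p^T Sigma^-1 p)^-1 for a diagonal Sigma with entries var_row."""
    p = np.asarray(p, dtype=float)
    var_row = np.asarray(var_row, dtype=float)
    return 1.0 / np.sum(p**2 / var_row)

k_minus_1 = 3
p = np.full(k_minus_1, 1.0 / np.sqrt(k_minus_1))   # ideal loadings
var_R = np.array([[0.04, 0.04, 0.04],               # sample 1: equal precision
                  [0.04, 0.09, 1e6]])               # sample 2: third result missing
var_t = np.array([score_variance(p, row) for row in var_R])
```

The score of the second sample receives a larger variance because less precise information is available for it.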

Before comparing the results in $\mathbf{r}_j$ with the scores in t in the jth step, the scores must first be scaled because the units of the scores are different from those of the concentrations. In other words, the scores are the coordinates of the concentrations in a different coordinate system (one PC), whose direction is defined by the loadings in p. In the hypothetical case where all the results from the k-1 methods were identical to the true concentration values, the m×1 vector of scores t should perfectly fit a straight line corresponding to the first PC. As stated previously, the loadings in p would be equal to $\sqrt{k-1}/(k-1)$ (Figure 2). In such a case, the differences between vectors t and $\mathbf{r}_j$ are therefore due to the change of the coordinate system, following the relation $\mathbf{t} = \mathbf{r}_j \cdot \sqrt{k-1}$ (Figure 2) for any j ($1 \le j \le k$). This shows that the scores in t should be divided by $\sqrt{k-1}$ to offset the change of scale caused by the projection of the m×(k-1) matrix R onto the first PC.
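The change of scale can be checked with a few lines of Python. In this sketch the concentrations are hypothetical and the k-1 methods are assumed to report identical, error-free results, so the loadings take their ideal value.

```python
import numpy as np

k = 5
r = np.array([2.0, 10.0, 40.0, 75.0])            # hypothetical true concentrations
R = np.column_stack([r] * (k - 1))               # k-1 methods reporting identical results
p = np.full(k - 1, 1.0 / np.sqrt(k - 1))         # ideal loadings sqrt(k-1)/(k-1)

t = R @ p                                        # scores on the first PC
t_scaled = t / np.sqrt(k - 1)                    # back on the concentration scale
```

With k = 5 the raw scores are twice the concentrations, and dividing by $\sqrt{k-1} = 2$ recovers the original values.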




Once the scores in vector t have been obtained from applying MLPCA on the m×(k-1) matrix R in the jth step, the concentration values from the jth analytical method $\mathbf{r}_j$ have to be regressed on them, considering all the individual uncertainties. Of all the existing least squares approaches for estimating the regression coefficients when measurement errors in both axes are present, Lisý's method [9] (referred to as BLS) was found to be the most suitable [10]. This technique assumes that the true linear model between the error-free scores $\xi_i$ (from applying MLPCA on the error-free concentration results from the corresponding k-1 methods in the jth step) and the error-free results $\eta_{ij}$ from the corresponding jth method is:

$\eta_{ij} = \beta_0 + \beta_1 \xi_i$ (2)

The true variables $\xi_i$ and $\eta_{ij}$ are unobservable; instead, only the following variables can be measured:

$t_i = \xi_i + \gamma_i$ (3)

$r_{ij} = \eta_{ij} + \delta_{ij}$ (4)

The random errors made generating variables $t_i$ and $r_{ij}$ are represented by variables $\gamma_i$ and $\delta_{ij}$, where $\gamma_i \sim N(0, \sigma_{t_i}^2)$ and $\delta_{ij} \sim N(0, \sigma_{r_{ij}}^2)$. In this way, introducing eqs. 3 and 4 into eq. 2 and isolating the variable $r_{ij}$, the following expression is obtained:

$r_{ij} = \beta_0 + \beta_1 t_i + \varepsilon_i$ (5)

where $\varepsilon_i$ is the ith true residual error with $\varepsilon_i \sim N(0, \sigma_{\varepsilon_i}^2)$ [11] and can be expressed as a function of $\gamma_i$, $\beta_1$ and $\delta_{ij}$:

$\varepsilon_i = \delta_{ij} - \beta_1 \gamma_i$ (6)
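The substitution that leads from eqs. 2 to 4 to eqs. 5 and 6 can be written out step by step. The last line, the variance of the true residual error, is an added consequence that holds under the assumption that $\gamma_i$ and $\delta_{ij}$ are independent; it is given here only as an illustration.

```latex
\begin{align*}
r_{ij} - \delta_{ij} &= \beta_0 + \beta_1 (t_i - \gamma_i) \\
r_{ij} &= \beta_0 + \beta_1 t_i + (\delta_{ij} - \beta_1 \gamma_i) \\
\sigma_{\varepsilon_i}^2 &= \sigma_{r_{ij}}^2 + \beta_1^2 \sigma_{t_i}^2
\end{align*}
```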

Several authors have developed procedures to estimate the regression line coefficients based on a maximum likelihood approach whenever errors in both variables are present [12-15]. In most cases these methods need the true predictor variable to be carefully modelled [14]. This is not usually possible in chemical analysis, where the true predictor variables $\xi_i$ are not often randomly distributed (i.e. functional models are assumed). Moreover, there are cases in which the experimental data is heteroscedastic and estimates of measurement errors are only available through replicate measurements (i.e. the ratio of $\sigma_{t_i}$ to $\sigma_{r_{ij}}$ is non-constant or unknown). These conditions, common in chemical data, make it very difficult to rigorously apply the principle of maximum likelihood to the estimation of the regression line coefficients. On the other hand, Sprent [11] presented a method for estimating the regression coefficients using a maximum likelihood approach, even when a functional model is assumed. This method is not rigorously applicable when individual heteroscedastic measurement errors are considered. Moreover, it has been shown that when the ratio of $\sigma_{t_i}$ to $\sigma_{r_{ij}}$ is assumed to be a constant $\lambda$ for any i, least squares methods provide the same estimates of the regression coefficients as those from a maximum likelihood estimation approach [16]. For these reasons, we have chosen an iterative least squares method (i.e. the BLS method) which can be used on any group of ordered pairs of observations with no assumptions about the probability distributions [16]. This allows the application of this method to real chemical data when individual heteroscedastic errors in both axes are considered. In this way, the BLS regression method relates the measured variables $t_i$ and $r_{ij}$ as follows [17]:

$r_{ij} = b_0 + b_1 t_i + e_i$ (7)

where $e_i$ is the observed ith residual error. The variance of $e_i$ is $s_{e_i}^2$ and will be referred to as the weighting factor. This parameter takes into account the variances of each individual point in both axes ($s_{t_i}^2$ and $s_{r_{ij}}^2$). The covariance between the variables for each ($t_i$, $r_{ij}$) data pair, which is normally assumed to be zero, can also be taken into account:

$s_{e_i}^2 = \mathrm{var}(r_{ij} - b_0 - b_1 t_i) = s_{r_{ij}}^2 + b_1^2 s_{t_i}^2 - 2 b_1 \mathrm{cov}(t_i, r_{ij})$ (8)


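The BLS regression method finds the estimates of the regression coefficients by minimising the sum of the weighted residuals, S:

$S = \sum_{i=1}^{m} \frac{(r_{ij} - \hat{r}_{ij})^2}{s_{e_i}^2} = \sum_{i=1}^{m} \frac{(r_{ij} - b_0 - b_1 t_i)^2}{s_{e_i}^2} = (m-2)\, s^2$ (9)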

The experimental error $s^2$ is an important variable since it provides a measure of the dispersion of the data pairs around the regression line and can provide a rough idea of the lack of fit of the experimental points to the regression line. The BLS regression technique assigns lower importance to those data pairs with larger $s_{t_i}^2$ and $s_{r_{ij}}^2$ values, i.e. the most imprecise data pairs. In this way, estimates of the missing concentrations (to which we will associate high standard deviations) have a minimal influence on the BLS regression line coefficients. By minimising the sum of the weighted residuals, the regression coefficients are estimated through an iterative process [18].
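One way such an iterative estimation can be organised is sketched below. It sets the derivatives of S with respect to b0 and b1 to zero, treats the weighting factors as fixed within each iteration and solves the resulting 2×2 linear system repeatedly. This is a minimal sketch under the assumption of zero covariance between $t_i$ and $r_{ij}$; the function name `bls_fit`, the starting values from ordinary least squares and the stopping rule are choices made for this illustration and are not necessarily those of the algorithm in [9] or [18].

```python
import numpy as np

def bls_fit(t, r, var_t, var_r, max_iter=200, tol=1e-12):
    """Iterative errors-in-both-axes straight-line fit (zero covariance assumed).

    Returns b0, b1 and the residual mean square s2 = S / (m - 2).
    """
    t, r, var_t, var_r = (np.asarray(a, dtype=float) for a in (t, r, var_t, var_r))
    b1, b0 = np.polyfit(t, r, 1)                  # ordinary least squares start
    for _ in range(max_iter):
        w = var_r + b1**2 * var_t                 # weighting factors s_e^2 (eq. 8)
        e = r - b0 - b1 * t                       # residuals
        A = np.array([[np.sum(1 / w), np.sum(t / w)],
                      [np.sum(t / w), np.sum(t**2 / w)]])
        g = np.array([np.sum(r / w),
                      np.sum(t * r / w + b1 * var_t * e**2 / w**2)])
        b0_new, b1_new = np.linalg.solve(A, g)
        converged = abs(b0_new - b0) + abs(b1_new - b1) < tol
        b0, b1 = b0_new, b1_new
        if converged:
            break
    w = var_r + b1**2 * var_t
    s2 = np.sum((r - b0 - b1 * t) ** 2 / w) / (len(t) - 2)
    return b0, b1, s2

# Hypothetical error-free data lying exactly on r = 0.5 + 2 t.
t = np.array([1.0, 2.0, 4.0, 7.0, 10.0])
r = 0.5 + 2.0 * t
b0, b1, s2 = bls_fit(t, r, np.full(5, 0.01), np.full(5, 0.04))
```

For data lying exactly on a straight line the fit recovers the intercept and the slope and the residual mean square is zero; with noisy data the weights change from one iteration to the next until the coefficients stabilise.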



Joint confidence interval test

To check whether there is significant bias in the concentration

results from the jth method (vector rj) in comparison to the scores from the

corresponding k-1 remaining methods in the jth step (vector t), the joint

confidence interval test for the slope and the intercept [18] must be applied

to the BLS regression coefficients for a given level of significance α. As shown in an earlier paper, the BLS regression method can be used to compare the concentration results from analysing more than one analyte [19] in several concentration samples with different chemical matrices. If the

intercept and the slope of the regression line do not simultaneously show

significant differences from the reference values of 0 and 1 respectively, it

can be concluded that the results from the jth method are comparable to

those from the remaining k-1 ones for a given level of significance α. In this

case, the experimental point ($b_0$, $b_1$) falls within this joint confidence interval centred at the reference point (0,1) and the null hypothesis H0 can be accepted [20]. H0 assumes that both BLS regression coefficients belong to a joint confidence distribution centred at the reference point ($b_0^{H_0} = 0$, $b_1^{H_0} = 1$). The joint confidence region is defined for a given level of significance α by all those values ($b_0$, $b_1$) that satisfy the equation:

$\sum_{i=1}^{m} \frac{1}{s_{e_i,H_0}^2} (b_0 - b_0^{H_0})^2 + 2 \sum_{i=1}^{m} \frac{t_i}{s_{e_i,H_0}^2} (b_0 - b_0^{H_0})(b_1 - b_1^{H_0}) + \sum_{i=1}^{m} \frac{t_i^2}{s_{e_i,H_0}^2} (b_1 - b_1^{H_0})^2 = 2 s^2 F_{1-\alpha(2,m-2)}$ (10)

where $F_{1-\alpha(2,m-2)}$ is the tabulated F value for a level of significance α with 2 and m-2 degrees of freedom. The term $s_{e_i,H_0}^2$ is the weighting factor associated with the reference regression line coefficients $b_0^{H_0}$ and $b_1^{H_0}$ from which H0 is postulated and can be recalculated from eq. 8 considering $b_1 = b_1^{H_0}$ [20].
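The test of eq. 10 can be sketched as follows. The fragment evaluates the left-hand side at the estimated coefficients and compares it with $2 s^2 F$. Instead of a tabulated value it uses the closed form of the F quantile that is valid for 2 numerator degrees of freedom, $F_{1-\alpha(2,\nu)} = (\nu/2)(\alpha^{-2/\nu} - 1)$. Zero covariance is assumed, and the function names and the numerical values are hypothetical.

```python
import numpy as np

def f_critical_2(alpha, dof2):
    """Upper-alpha critical value of F(2, dof2); closed form valid only for
    2 numerator degrees of freedom."""
    return 0.5 * dof2 * (alpha ** (-2.0 / dof2) - 1.0)

def joint_test(t, var_t, var_r, b0, b1, s2, alpha=0.05, b0_ref=0.0, b1_ref=1.0):
    """Return (lhs, rhs, no_bias) for the joint confidence region of eq. 10."""
    t, var_t, var_r = (np.asarray(a, dtype=float) for a in (t, var_t, var_r))
    w = var_r + b1_ref**2 * var_t           # weighting factors under H0 (zero covariance)
    d0, d1 = b0 - b0_ref, b1 - b1_ref
    lhs = (d0**2 * np.sum(1 / w) + 2 * d0 * d1 * np.sum(t / w)
           + d1**2 * np.sum(t**2 / w))
    rhs = 2 * s2 * f_critical_2(alpha, len(t) - 2)
    return lhs, rhs, bool(lhs <= rhs)

# Hypothetical scores, variances and BLS estimates for six samples.
t = np.arange(1.0, 7.0)
var = np.full(6, 0.01)
lhs_a, rhs_a, ok_a = joint_test(t, var, var, b0=0.05, b1=0.99, s2=1.0)
lhs_b, rhs_b, ok_b = joint_test(t, var, var, b0=2.0, b1=1.3, s2=1.0)
```

In the first call the point (b0, b1) lies inside the region and no bias is detected; in the second it lies far outside.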

It is interesting to check lack of fit of the experimental points ($t_i$, $r_{ij}$) to the BLS regression line. When lack of fit is present, the residual mean square error $s^2$ (eq. 9) tends to be overestimated, and joint confidence regions may therefore be too large. In this case there would be a greater probability for a given method that bias would remain undetected, i.e. there would be a greater probability of committing a β error [20]. Unfortunately, a rigorous test for detecting lack of fit based on the analysis of the residual variance [21] cannot be applied because replicates for each data pair ($t_i$, $r_{ij}$) are not available. This is because the scores $t_i$ and their respective standard deviations $s_{t_i}$ are directly generated from the projection of the m×(k-1) matrices R and var(R). The only option would be to apply a $\chi^2$ test on the residual mean square error estimate $s^2$, which is a random variable that can be approximated by a $\chi^2$ distribution with m-2 degrees of freedom. However, this is a rough test for detecting lack of fit, because a chi-squared distribution is justified by the asymptotic theory only in large samples [22]. This condition is not usually met in linear regression, where the number of samples is limited. For this reason, we decided not to apply this test since conclusions about lack of fit could be misleading.

The suitability of the approach developed for multiple method

comparison was initially examined with different kinds of simulated data

sets (results are available on request to the authors). The simulated data

sets were generated using the Monte Carlo method [23,24]. Different

uncertainty patterns, which in some cases contained results from methods

simulating bias, were considered. When the multiple method comparison approach was applied, for a given level of significance α, to those data sets in which no biased methods were simulated, bias was detected in each method in approximately α% of the cases (i.e. the theoretical result one expects to find if the procedure is correct). On the other hand, when concentration results from one or more methods were simulated to be biased, the percentage of times in which bias was detected in these biased methods was significantly higher than for the unbiased ones.
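A data set of this kind could be simulated along the following lines. This is only a sketch of one possible Monte Carlo scheme: normally distributed errors with a standard deviation proportional to the concentration, a purely multiplicative bias, three replicates and the use of the variance of the replicate mean are all assumptions made for the illustration and are not taken from the simulations reported here.

```python
import numpy as np

def simulate_methods(true_conc, rel_sd, slope_bias, n_rep=3, seed=0):
    """Simulate replicate results for several methods.

    true_conc : (m,) true concentrations
    rel_sd    : (n,) relative standard deviation of each method
    slope_bias: (n,) multiplicative bias of each method (1.0 = unbiased)
    Returns the m x n matrices of replicate means R and of their variances var(R).
    """
    rng = np.random.default_rng(seed)
    true_conc = np.asarray(true_conc, dtype=float)[:, None, None]   # m x 1 x 1
    rel_sd = np.asarray(rel_sd, dtype=float)[None, :, None]         # 1 x n x 1
    slope_bias = np.asarray(slope_bias, dtype=float)[None, :, None]
    centre = true_conc * slope_bias
    noise = rng.standard_normal((true_conc.shape[0], rel_sd.shape[1], n_rep))
    reps = centre + noise * rel_sd * true_conc                       # m x n x n_rep
    R = reps.mean(axis=2)
    var_R = reps.var(axis=2, ddof=1) / n_rep     # variance of the mean (assumed choice)
    return R, var_R

# Four hypothetical methods, the last one reading 20 % high.
R, var_R = simulate_methods(true_conc=[1.0, 5.0, 20.0, 50.0, 100.0, 200.0],
                            rel_sd=[0.02, 0.05, 0.05, 0.10],
                            slope_bias=[1.0, 1.0, 1.0, 1.2])
```

Repeating the simulation with different seeds and applying the comparison procedure to each simulated R and var(R) would give the percentage of cases in which bias is detected for each method.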


Data sets

Data Set 1 [25]. The total Cr content (mg/kg) is determined in six

soil samples using four different separation/extraction methods: HNO3,

deionized water, KCl and acetate buffer solution. The total Cr content in the soil samples ranges between 1.1 and 1420 mg/kg. Heteroscedasticity is present in the data set in such a way that each standard deviation ranges between 2% and 14% of each individual value. Only one individual point exceeds this range, with a standard deviation of 68% of the individual mean value. Figure 3a shows the Cr content in the different samples and their

standard deviations from replicate analysis by the four

separation/extraction methods.

Data Set 2 [26]. Collaborative study conducted on a liquid

chromatographic method for determining taurine in infant formula and

milk powders. Twenty laboratories participated in the analysis of eight

blind duplicates ranging from approximately 3 to 65 mg/100 g of sample.

Heteroscedasticity ranges from 0.1% to 29 % of the individual mean values.

In six cases one or both concentration values were missing and thus high

standard deviations (about ten times higher than the others) were

associated to the substituting taurine concentration values (see Figure 4a).



Data Set 3 [27]. Five-method comparison study to determine polycyclic aromatic hydrocarbons (PAHs, ng/g) in twelve samples of the sewage sludge CRM 392: Soxhlet method with toluene, SFE with CO2, SFE with CO2 and 5% toluene, SFE with CO2 and TEA in toluene, and SFE with CO2 and TFA in toluene. PAH concentrations range from 46 to 1071 ng/g. Heteroscedasticity is high in the data set and the standard deviations range from 0.3% to 37% of the individual mean values. Concentrations of PAHs in the thirteen samples and their standard deviations are shown in Figure 5a.

Data Set 4 [28]. Collaborative study involving fourteen laboratories

to test a gas chromatographic method for determining putrescine in

seafood. This data is obtained from the analysis of putrescine in fourteen canned tuna and raw mahimahi samples (including blind duplicates and a spike).

The putrescine content ranges from 0.2 to 9.2 ppm. Heteroscedasticity is

present in the data set, so the standard deviations range from 0.03% to 52%

of the individual mean values. Figure 6a shows the putrescine

concentrations and their standard deviations from the duplicate analysis of

the fourteen samples by the fourteen laboratories.

Computational Aspects

The calculations performed in this study were carried out with a Pentium III-based personal computer with 64 MB of memory and a clock speed of 500 MHz. Although the MLPCA algorithm has been reported to be

time consuming with spectroscopic data [5], the time needed to compare

the concentration results from the analytical methods in test for all the real

data sets was never more than 3 minutes since the data sets used are

smaller than the spectroscopic ones. All the algorithms were written in

Matlab (Matlab for Microsoft Windows ver. 5.2, The Mathworks, Inc.,

Natick, MA).




Results from the multiple-method comparison approach are

presented in Figures 3 to 6 and Tables 1 to 3. In each figure, the first plot

shows the concentrations (solid lines) and their standard deviations

(dashed lines) from the replicate analysis of the samples by the different

laboratories, whereas the second plot shows the loading values used for

detecting outliers. With these second plots it is possible to visually identify

any suspicious loading values, on which the Grubbs’ test for detecting

outliers (single or paired) is applied for a level of significance of 2.5% (2-

tails). On the other hand, values in Tables 1 to 3 under column ‘α’ show the

maximum level of significance for which no bias can be detected using the

joint confidence interval test on the BLS regression coefficients from

regressing results from the jth method against the scores in t in the jth step.

In this way, if the level of significance is equal to or higher than 5% (set in this case for the BLS joint confidence interval test), the differences between the results from each method in comparison to the other ones will not be significant, and vice versa (in this latter case the levels of significance α are highlighted in bold).
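The maximum level of significance for which no bias is detected is the upper-tail probability of the F distribution at the observed value of the test statistic. Because the numerator has 2 degrees of freedom, this probability has the closed form $(1 + 2F/\nu)^{-\nu/2}$ with $\nu = m - 2$. The sketch below uses this expression; the function name `max_alpha` and its arguments (the left-hand side of the joint confidence region equation, the residual mean square and the number of samples) are hypothetical.

```python
def max_alpha(lhs, s2, m):
    """Largest significance level at which the joint test detects no bias,
    i.e. the upper-tail probability of F(2, m - 2) at F_obs = lhs / (2 s2)."""
    dof2 = m - 2
    f_obs = lhs / (2.0 * s2)
    return (1.0 + 2.0 * f_obs / dof2) ** (-dof2 / 2.0)

# A point lying exactly on the centre of the region gives the largest possible value.
alpha_centre = max_alpha(lhs=0.0, s2=1.0, m=6)
# A point on the border of the 5 % region for m = 6 gives 5 %.
alpha_border = max_alpha(lhs=2.0 * 2.0 * (20.0 ** 0.5 - 1.0), s2=1.0, m=6)
```

Multiplying the returned value by 100 gives a percentage that can be compared with the 5% threshold.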

Data Set 1. The concentration results from the four

separation/extraction methods in Figure 3a show that results from the

method using HNO3 are systematically higher than those from the other

methods. This is confirmed by the plot of the loadings of the first PC from

MLPCA (Figure 3b), since the loading value from the results of the method

using HNO3 is much higher than the loadings from the other methods. The

single Grubbs’ test showed that the suspicious loading value should be

considered as an outlier and therefore, the systematic differences between

the concentration results from the method using HNO3 and the others were

significant for a level of significance of 2.5% (2-tails). Once the results from

this method were eliminated no outlying loading values were detected



(Figure 3c) and the second stage of the multiple-method comparison

approach could therefore be applied on the results from the three

remaining methods.

The maximum levels of significance for which no bias can be detected using the joint confidence interval test on the BLS regression coefficients estimated from the results of the separation/extraction methods using deionized water, KCl and the acetate buffer solution are 69.5%, 70.9% and 61.1% respectively. Therefore, none of the results from the three methods for analyzing the total Cr content in the six soil samples is significantly different from the rest. This conclusion is logical since concentration results from methods 2, 3 and 4 are very similar (Figure 3a).

[Figure 3a: Cr content (mg/kg) against sample number (1 to 6); the method using HNO3 is labelled.]

Figure 3. Plot a): Cr contents (solid lines) and their standard deviations (dashed lines) in the six soil samples analysed by the four methods in data set 1. The bold solid line represents results from method using HNO3 (outlier).


[Figure 3b and 3c: loadings (axis from 0.4 to 0.7) of methods 1 (HNO3), 2 (H2O), 3 (KCl) and 4 (acetate) in plot b), and of methods 2 to 4 in plot c).]

Figure 3 (cont.). Plots b) and c): Loadings for detecting outliers with an outlier and after eliminating the results from the outlier laboratory respectively.

Data Set 2. Figure 4a shows that the concentration results from

laboratory 14 are quite different from those of the other laboratories for

some of the samples analysed. The loading plot for this data set (Figure 4b)

confirms this observation since the loading value from the results of

laboratory 14 is far from the others. The Grubbs’ test showed that the

loading value for the results from laboratory 14 should be considered as an

outlier. After eliminating the results from laboratory 14, the Grubbs’ test

was again applied on the loadings from the nineteen remaining

laboratories and no outlying loading values were detected (Figure 4c).


[Figure 4a: taurine content (mg/100 g) against sample number (1 to 8); laboratory 14 is labelled. Figure 4b and 4c: loadings (axis from 0.21 to 0.26); laboratories 9 and 14 are labelled in plot b) and laboratory 9 in plot c).]

Figure 4. Plot a): Concentration values (solid lines) of taurine and their standard deviations (dashed lines) in the eight duplicate samples when analysed by the twenty laboratories in data set 2. The bold solid line represents results from method 14 (outlier). Plots b) and c): Loadings for detecting outliers, with an outlier and after eliminating the results from the outlier laboratory respectively.


Method   α (%)     Method   α (%)
1        1.0       11       71.9
2        46.6      12       58.2
3        97.9      13       11.3
4        14.1      15       42.8
5        66.5      16       3.5
6        66.3      17       55.7
7        37.3      18       12.7
8        1.4       19       2.2
9        0.029     20       21.3
10       69.2

Table 1. Maximum levels of significance (α) for which no bias is detected between the results of the different methods in data set 2.

Table 1 shows the results from the application of the joint

confidence interval test on the BLS regression coefficients. Significant

differences were detected in the results from laboratories 1, 8, 9, 16 and 19.

Results from laboratory 9 appear to be especially different from the rest

since the α value for which bias would not be detected is extremely low.

This is confirmed by the high loading value for laboratory 9 in Figure 4c,

which indicates that concentration results from this laboratory are higher

than those from the others.
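The reading of Table 1 can be reproduced with a few lines of Python that compare each α value with the 5% threshold. The values are those listed in Table 1; the variable names are hypothetical.

```python
# Maximum levels of significance (%) from Table 1, keyed by laboratory number.
alpha_table1 = {1: 1.0, 2: 46.6, 3: 97.9, 4: 14.1, 5: 66.5, 6: 66.3, 7: 37.3,
                8: 1.4, 9: 0.029, 10: 69.2, 11: 71.9, 12: 58.2, 13: 11.3,
                15: 42.8, 16: 3.5, 17: 55.7, 18: 12.7, 19: 2.2, 20: 21.3}
threshold = 5.0

# Laboratories for which bias is detected at the 5 % level.
biased = sorted(lab for lab, a in alpha_table1.items() if a < threshold)
# Laboratory with the lowest alpha value.
most_different = min(alpha_table1, key=alpha_table1.get)
```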

Data Set 3. Figure 5a shows that the concentration results from the analysis of PAHs with the five extraction methods are quite different. The plot of the loadings in Figure 5b shows that the loadings increase from method 1 to method 5, which indicates that the extraction efficiency becomes progressively higher from one method to the next. In this example no outlying methods were detected with the single or paired Grubbs' test. Table 2 shows that all the α values are lower than the threshold value of 5%. This indicates that, because of the large differences in the concentration results from the five extraction methods in Figure 5a, the results from the five methods are significantly different from one another for a level of significance of 5%.

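Method              α (%)
Soxhlet             0.47
SFE/CO2             0.0027
SFE/CO2+Toluene     1.62
SFE/CO2+TEA         0.16
SFE/CO2+TFA         0.0071

Table 2. Maximum levels of significance (α) for which no bias is detected between the results of the different methods in data set 3.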


[Figure 5a: PAH content (ng/g) against sample number.]

Figure 5. Plot a): Concentration (solid lines) of PAHs and their standard deviations (dashed lines) in thirteen samples when analysed by the five methods in comparison from data set 3.


[Figure 5b: loadings (axis from 0.25 to 0.6) of methods 1 (Soxhlet), 2 (SFE/CO2), 3 (SFE/CO2+Toluene), 4 (SFE/CO2+TEA) and 5 (SFE/CO2+TFA).]

Figure 5 (cont.). Plot b): Loadings for detecting outliers among the five methods in comparison from data set 3.

Data Set 4. Figure 6a shows that the concentration results from laboratories 10 and 14 are the highest and the lowest respectively. Figure 6b shows that the loadings for these two laboratories are well above and well below the theoretical loading value of $\sqrt{14}/14$ respectively. This indicates that results of the determination of putrescine in seafood from laboratories 10 and 14 are systematically higher and lower, respectively, than those from the other laboratories. These two loadings were detected as outliers when the paired Grubbs' test for the stated level of significance of 2.5% (2-tails) was applied. Once both laboratories were eliminated, the loading plot in Figure 6c was obtained when MLPCA was applied on the results of the determination of putrescine from the remaining twelve laboratories.


[Figure 6a: putrescine content (ppm) against sample number; laboratories 10 and 14 are labelled. Figure 6b and 6c: loadings (axis from 0.1 to 0.4); laboratories 10 and 14 are labelled in plot b).]

Figure 6. Plot a): Putrescine concentration values (solid lines) and their standard deviations (dashed lines) in fourteen samples when analysed by the fourteen chromatographic methods in data set 4. Bold solid lines represent results from methods 10 and 14 (outliers). Plots b) and c): Loadings for detecting outliers with two outliers and after eliminating the results from the outlier laboratories respectively.



Table 3 shows that bias was detected in the results from laboratories

2, 4, 12 and 13. Since the lowest α values in Table 3 are from the

concentration results from laboratories 12 and 4, it can be concluded that

these laboratories provide the most different concentration results.

Method    α (%)      Method    α (%)
1         83.2       7         13.9
2         2.7        8         8.1
3         5.0        9         52.6
4         0.026      11        73.2
5         30.1       12        0.0013
6         53.6       13        0.99


On the other hand, although results from laboratory 3 were not

significantly different from the others, doubts may arise about the

performance of this laboratory when analysing putrescine in seafood with

the chromatographic method being tested, since the α value in Table 3 is

equal to the threshold value of 5%.
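The reading of Table 3 can be reproduced in a few lines. This is only a sketch of the decision rule described in the text: the α values are copied from Table 3, while the strict inequality (bias only when α is below 5%) is an assumption chosen so that laboratory 3 lands on the borderline, as the text describes.

```python
# alpha values (%) per laboratory, copied from Table 3
alpha = {1: 83.2, 2: 2.7, 3: 5.0, 4: 0.026, 5: 30.1, 6: 53.6,
         7: 13.9, 8: 8.1, 9: 52.6, 11: 73.2, 12: 0.0013, 13: 0.99}

threshold = 5.0  # significance level in %, as stated in the text

# Assumption: bias is declared when alpha falls strictly below the threshold.
biased = sorted(lab for lab, a in alpha.items() if a < threshold)
# Laboratories sitting exactly on the threshold deserve a closer look.
borderline = sorted(lab for lab, a in alpha.items() if a == threshold)
# Rank the biased laboratories from the lowest alpha upwards.
ranked = sorted(biased, key=alpha.get)

print("biased:", biased)
print("borderline:", borderline)
print("ranked by alpha:", ranked)
```

The ranking puts laboratories 12 and 4 first, which matches the observation that they provide the most different results.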

CONCLUSIONS

In this paper we have developed a stepwise approach for comparing multiple methods. It detects significant bias by comparing the results from several analytical methods or laboratories while considering their individual heteroscedastic uncertainties (i.e. their different levels of precision). This approach can be applied even when more than one analyte is analysed in several samples with different concentrations and chemical matrices. Moreover, unlike the existing approaches, this one simultaneously considers all the concentration results and their respective uncertainties to detect methods reporting outlying concentration results. In this way, it is possible to have a clear view of the overall performance of each analytical method.

Real data sets were used to check the suitability of the multiple

method comparison approach. Conclusions about laboratories providing

outlying concentration results or results with significant bias in comparison

to the rest, seemed to agree with the differences observed in the plots of the

concentration results from the different laboratories. Although this

comparison procedure has been used to compare results from methods or

laboratories, it can be applied to any experimental problem in which results

from the analysis of several analytes in various chemical matrices at

different concentration levels are obtained with their respective individual

uncertainties.

Despite the promising results, the researcher should be aware of

two important points. The first, which is inherent to both MLPCA and BLS

techniques, is that the uncertainties from the replicate analysis of the

different concentration samples need to be known, so a greater

experimental effort is therefore required to carry out a sufficient number of

replicate analyses [29]. Although in some of the real data sets the number of

replicates was very low (only two), it has been a premise of this work to

assume that a comparison approach that accounts for approximate

estimates of the individual heteroscedastic uncertainties is better than one

that does not consider them at all. The second is that the BLS regression

technique is not very robust in the presence of outliers with low individual

uncertainties. This limitation will be addressed in future work [30].



ACKNOWLEDGEMENTS

The authors thank the DGICyT (project no. BP96-1008) for financial

support, and the Rovira i Virgili University for providing a doctoral

fellowship to A. Martínez.

BIBLIOGRAPHY

1.- G.T. Wernimont, Use of statistics to develop and evaluate analytical methods,

AOAC, Arlington, V.A., 1987.


3.- D. L. Massart, B. M. G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.

Lewi and J. Smeyers-Verbeke, Handbook of Chemometrics and

Qualimetrics, Part A, Elsevier, Amsterdam, 1997.

4.- J. Kucera, P. Mader, D. Miholová, J. Száková, I. Stejskalová and V.

Stepánek, Fresenius J. Anal. Chem., 360 (1998) 439-442.

5.- P. D. Wentzell, D. T. Andrews, D. C. Hamilton, K. Faber and B. R.

Kowalski, J. Chemom., 11 (1997) 339-366.

6.- P. D. Wentzell and D. T. Andrews, Anal. Chim. Acta, 350 (1997) 341-352.

7.- Cetama, Statistique Appliquée à l’Exploitation des Mesures, 2nd ed.,

Masson, Paris, 1986.

8.- AOAC International Guidelines for Collaborative Study Procedures to

Validate Characteristics of a Method of Analysis, J. AOAC Int., 78 (1995)

143A- 160A.

9.- J.M. Lisý, A. Cholvadová, J. Kutej, Comput. Chem., 14 (1990) 189-192.

10.- J. Riu, F.X. Rius, J. Chemom., 9 (1995) 343-362.

11.- P. Sprent, Models in Regression and related topics, Methuen & Co.

Ltd., London, 1969.

12.- W. A. Fuller, Measurement Error Models, John Wiley & Sons, New

York, 1987.

13.- C. L. Cheng and J. W. van Ness, J. R. Stat. Soc. B, 56 (1994) 167-183.

14.- D. W. Schafer and K. G. Puddy, Biometrika, 83 (1996) 813-824.



15.- K. C. Lai and T. K. Mak, J. R. Stat. Soc. B, 41 (1979) 263-268.

16.- D. V. Lindley, J. R. Stat. Soc. B / London Suppl. Series B, 9 (1947) 218-244.

17.- G. A. F. Seber, Linear regression analysis, John Wiley & Sons, New

York, 1977.

18.- J. Riu and F.X. Rius, Anal. Chem., 68 (1996) 1851-1857.

19.- A. Martínez, J. Riu, O. Busto, J. Guasch and F. X. Rius, Anal. Chim.

Acta, 406 (2000) 257-278.

20.- A. Martínez, J. Riu and F. X. Rius, J. Chemom., submitted for

publication.

21.- A. Martínez, J. Riu and F. X. Rius, Chemom. Intell. Lab. Syst., accepted

for publication.

22.- P. Bentler and D. G. Bonett, Psychological Bulletin 88 (1980) 588-606.

23.- P. C. Meier and R. E. Zund, Statistical Methods in Analytical

Chemistry, John Wiley & Sons, New York, 145-150, 1993.

24.- O. Güell and J.A. Holcombe, Anal. Chem., 60 (1990) 529A - 542A.

25.- P. Fodor and L. Fischer, Fresenius J. Anal. Chem., 351 (1995) 454-455.

26.- D. C. Woollard, J. AOAC Int., 80 (1997) 860-865.

27.- C. Friedrich, K. Cammann and W. Kleiböhmer, Fresenius J. Anal.

Chem., 352 (1995) 730-734.

28.- P. L. Rogers and W. Staruszkiewicz, J. AOAC Int., 80 (1997) 591-602.

29.- R. J. Carroll and D. Ruppert., Amer. Stat., 50 (1996) 1-6.

30.- J. del Río, J. Riu and F. X. Rius, in preparation.

5.5 Conclusions


The main conclusion of this chapter is that a procedure has been developed for comparing the results obtained by several analytical methods when they analyse a series of samples. The comparison considers all concentration levels simultaneously, together with the uncertainties arising from the errors made in measuring the samples.

Another important point is that simulated data sets have shown that this procedure for comparing multiple methods gives correct results in detecting and identifying biased methods. In other words, no significant differences are detected between the results of the different methods when none exist, whereas methods whose results are biased with respect to the rest are correctly detected. It must be stressed that the procedure presented in this chapter was developed to compare the results obtained by several analytical methods. This means that the results cannot be used to conclude which analytical methods give biased results, only which methods give results comparable to those of the rest. To find out which methods give biased results, the results of each method would have to be compared separately with those of a reference method, using the joint confidence test developed for the BLS regression method.

On the other hand, section 5.4 shows that this procedure can also correctly detect analytical methods whose results can be considered discrepant (outliers) with respect to the rest. It was also possible to detect [...] comparison, in the case of real analytical problems. It should be recalled that, as already stated in section 5.4, although the procedure developed seems suitable for comparing multiple methods while considering the uncertainties due to the errors made in measuring the samples, it also has a number of drawbacks that must be borne in mind when it is applied. Finally, although in this chapter the comparison procedure was applied to compare several analytical methods, it is also applicable to any experimental problem in which results are obtained from the analysis of samples of various concentrations with their corresponding uncertainties (e.g. comparison of laboratories, analysts or instruments).

5.6 References



Elsevier, Amsterdam, 1997.

2.- ISO 5725, Precision of test methods – Determination of repeatability and

reproducibility for a standard test method by inter-laboratory tests, 1994.

3.- AOAC/IUPAC, Journal of the AOAC International, 78 (1995) 143A-160A.

4.- ISO 5725, Accuracy (trueness and precision) of measurement methods and

results, 1994.

5.- IUPAC, Pure and Applied Chemistry, 65 (1993) 2123-2144.


7.- Martens H., Næs T., Multivariate Calibration, Wiley & Sons: Chichester,

1989.

8.- Wold S., Esbensen K., Geladi P., Chemometrics and Intelligent Laboratory

Systems, 2 (1987) 37-52.

9.- Wentzell P.D., Andrews D.T., Hamilton D.C., Faber K., Kowalski B.R.,

Journal of Chemometrics, 11 (1997) 339-366.


10.- Meier P.C., Zund R.E., Statistical Methods in Analytical Chemistry, John Wiley & Sons: New York, 1993, 145-150.

11.- Güell O., Holcombe J.A., Analytical Chemistry, 60 (1990) 529A - 542A.

CHAPTER 6

Prediction ability using multivariate linear regression considering errors in all axes in PCR and MLPCR




In the previous chapter, maximum likelihood principal component analysis (MLPCA),1 together with the joint test on the intercept and the slope of the BLS regression line, was shown to be useful for comparing the results of multiple analytical methods jointly, considering the uncertainties due to the errors made in measuring samples of different concentrations.

This chapter continues to work in the multivariate field, considering the errors made in measuring the samples in all axes. It introduces a multivariate regression technique, multivariate least squares (MLS), which estimates the regression coefficients by considering the uncertainties due to the errors made in measuring both the predictor and the response variables for the different samples. The chapter leaves behind the comparison of results obtained by two or more analytical methods and moves on to the multivariate calibration of spectroscopic data. The multivariate calibration methods considered here consist of two stages: the decomposition of the spectroscopic measurements, and the regression of the reference concentrations on the values resulting from that decomposition. MLS has been applied in the regression stage of principal component regression (PCR) and maximum likelihood principal component regression (MLPCR),2 replacing the classical multiple linear regression (MLR) technique, which is the extension of ordinary least squares (OLS) to the multivariate field.
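The two stages can be made concrete with a minimal PCR sketch: a singular value decomposition of the mean-centred spectra, followed by MLR of the centred concentrations on the scores. The function names `pcr_fit` and `pcr_predict`, the mean-centring step and the simulated noise-free data are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def pcr_fit(R, y, p):
    """Two-stage PCR: SVD decomposition of R, then MLR of y on the scores."""
    r_mean, y_mean = R.mean(axis=0), y.mean()
    Rc, yc = R - r_mean, y - y_mean              # mean-centre spectra and concentrations
    U, s, Vt = np.linalg.svd(Rc, full_matrices=False)
    V = Vt[:p].T                                 # n x p loadings
    T = U[:, :p] * s[:p]                         # m x p scores (equal to Rc @ V)
    q = np.linalg.solve(T.T @ T, T.T @ yc)       # MLR step: q = (T'T)^-1 T'y
    return r_mean, y_mean, V, q

def pcr_predict(model, R_new):
    """Project new spectra onto the loadings and apply the regression vector."""
    r_mean, y_mean, V, q = model
    return (R_new - r_mean) @ V @ q + y_mean

# Simulated noise-free mixtures of two components (hypothetical data).
rng = np.random.default_rng(0)
C = rng.uniform(0.1, 1.0, size=(10, 2))          # concentrations
S = rng.uniform(0.0, 1.0, size=(2, 20))          # pure-component spectra
R, y = C @ S, C[:, 0]                            # analyte of interest = component 1

model = pcr_fit(R, y, p=2)
C_new = rng.uniform(0.1, 1.0, size=(3, 2))
y_hat = pcr_predict(model, C_new @ S)
print(np.c_[C_new[:, 0], y_hat])                 # the two columns should agree
```

Neither stage of this sketch uses any measurement uncertainty, which is the gap that MLPCA (decomposition stage) and MLS (regression stage) are meant to close.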

Unlike PCR, MLPCR takes into account the uncertainties of the errors made in the spectroscopic measurements when decomposing the original matrix of spectral measurements R into the corresponding scores and loadings. The amount of information extracted from each spectroscopic measurement by MLPCR is therefore optimal from a maximum likelihood point of view.2 However, the regression stage of MLPCR uses MLR, which does not take into account the uncertainties due to the errors made in measuring the reference values of the properties of interest. Although this problem has already been solved by maximum likelihood latent root regression (MLLRR), that technique is more complex and less intuitive than MLPCR, and its results are harder to interpret.2 For this reason, the aim of this chapter is to study the prediction errors of both MLPCR and PCR when MLR is replaced by MLS, and to compare them with those obtained by MLLRR.

Section 6.2 describes how the regression stage works in the maximum likelihood multivariate calibration methods MLPCR and MLLRR. Section 6.3 highlights the differences between true and observed prediction errors. Section 6.4 contains the bulk of the work covered in this chapter, as part of the article Application of multivariate least squares regression method to PCR and maximum likelihood PCR techniques, submitted for publication to the Journal of Chemometrics. Finally, section 6.5 presents the conclusions drawn from this chapter.

6.2 Maximum likelihood multivariate calibration techniques

These multivariate calibration techniques take into account the uncertainties associated with the spectroscopic measurements when building the multivariate calibration model. MLPCR and MLLRR use maximum likelihood principal component analysis (MLPCA), described in section 5.2 of the previous chapter, to decompose the m×n matrix R that contains the spectroscopic profiles. These techniques provide a substantial improvement in prediction ability over more conventional techniques such as PCR. The improvement is especially important when the spectroscopic measurements are heteroscedastic (non-constant standard deviations), which may be due to variations in the source intensity, nonlinear transformations of the absorbance measurements or variations in the characteristics of the detector noise.2

6.2.1 Maximum likelihood principal component regression (MLPCR)

This section focuses on the regression stage, since the decomposition of the matrix of spectral measurements according to a maximum likelihood criterion was covered in the previous chapter, where the MLPCA algorithm was described in detail. In MLPCR the regression stage with MLR is carried out much as in PCR, and the coefficients of the regression model are estimated with the same expression:

$$\mathbf{q} = (\mathbf{T}^{\mathrm{T}}\mathbf{T})^{-1}\mathbf{T}^{\mathrm{T}}\mathbf{y} \qquad (6.1)$$

where $\mathbf{T}$ is an m×p matrix of scores obtained by MLPCA for p factors or principal components and $\mathbf{T}^{\mathrm{T}}$ is its transpose. Unlike PCR, MLPCR calculates the scores of the unknown samples with a maximum likelihood projection that takes into account the respective variance matrices $\boldsymbol{\Sigma}_{unk}$:

$$\mathbf{t}_{unk} = \mathbf{r}_{unk}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V}\,(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1} \qquad (6.2)$$

where $\mathbf{V}$ is an n×p matrix of loadings obtained by MLPCA and $\mathbf{r}_{unk}$ is a 1×n vector with the spectroscopic profile of the unknown sample. $\boldsymbol{\Sigma}_{unk}$ is an n×n diagonal matrix (the errors in the spectroscopic measurements are assumed to be independent) containing the variances of the replicate spectroscopic measurements made on the unknown sample. The concentrations of the unknown samples are estimated, as in PCR, with the expression:

$$\hat{y}_{unk} = \mathbf{t}_{unk}\mathbf{q} \qquad (6.3)$$

Since the variance matrices $\boldsymbol{\Sigma}_{unk}$ may differ from one unknown sample to another, a single regression vector cannot be defined for all unknown samples as it can in PCR.
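The maximum likelihood projection of an unknown sample and the subsequent prediction can be sketched as follows. The sketch assumes a diagonal variance matrix, stored as a vector, and reads the projection as t = r Σ⁻¹ V (Vᵀ Σ⁻¹ V)⁻¹; the function name `ml_scores`, the loadings and the regression vector are hypothetical.

```python
import numpy as np

def ml_scores(r_unk, var_unk, V):
    """Maximum likelihood scores t = r S^-1 V (V' S^-1 V)^-1 for a diagonal S.

    r_unk   : (n,) spectrum of the unknown sample
    var_unk : (n,) measurement error variances (diagonal of Sigma_unk)
    V       : (n, p) loadings
    """
    W = V / var_unk[:, None]                      # S^-1 V without building S
    # V' S^-1 V is symmetric, so solving this system gives the same t as the row form.
    return np.linalg.solve(V.T @ W, W.T @ r_unk)

rng = np.random.default_rng(0)
V, _ = np.linalg.qr(rng.normal(size=(30, 2)))     # hypothetical orthonormal loadings
t_true = np.array([2.0, 1.0])
r_unk = V @ t_true                                # noise-free spectrum in the model subspace
var_unk = rng.uniform(0.01, 1.0, size=30)         # heteroscedastic variances

t_unk = ml_scores(r_unk, var_unk, V)              # maximum likelihood projection
q = np.array([0.5, 1.5])                          # hypothetical regression vector
y_unk = t_unk @ q                                 # predicted concentration
print(t_unk, y_unk)
```

When all variances are equal and the loadings are orthonormal, the projection reduces to the ordinary PCR projection r V, which is why a single regression vector exists in PCR but not when the variances change from sample to sample.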

As for applying MLS in the regression stage of MLPCR, the concentrations of the unknown samples are estimated with an expression different from eq. 6.3 because, even when working with centred data, the MLS regression model need not have a zero intercept, as it does when MLR is applied to centred data (eq. 6.3). This is because MLS finds the regression coefficients that make the model fit best those points with the smallest associated uncertainties. The expression used to estimate the concentrations of the unknown samples with MLS is equation 1 of section 6.4.

6.2.2 Maximum likelihood latent root regression (MLLRR)

In this multivariate calibration technique the spectroscopic measurements are projected onto the scores subspace by MLPCA, as in MLPCR. In this case, however, MLPCA is applied to an augmented matrix $[\mathbf{R}\,|\,\mathbf{y}]$ containing the spectroscopic measurements and the concentration values. A second matrix is also needed, containing the variances of the errors made both in the spectral measurements (m×n matrix $\mathbf{Q}$) and in the concentrations, $[\mathbf{Q}\,|\,\mathrm{var}(\mathbf{y})]$.

Once the spectral decomposition has been carried out by applying MLPCA to these two augmented matrices, the m×p matrix of scores $\mathbf{T}$ and the (n+1)×p matrix of loadings $\mathbf{V}$ are obtained. In MLLRR the concentration of the unknown sample is predicted with the expression:

$$[\mathbf{r}_{unk}\,|\,\hat{y}_{unk}] = [\mathbf{r}_{unk}\,|\,0]\,\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V}\,(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1}\mathbf{V}^{\mathrm{T}} \qquad (6.4)$$

Here $\boldsymbol{\Sigma}_{unk}$ is an (n+1)×(n+1) variance matrix of the augmented vector $[\mathbf{r}_{unk}\,|\,y_{unk}]$ of the unknown sample. The equation yields a 1×(n+1) vector whose last element is the predicted value $\hat{y}_{unk}$. To understand how this kind of prediction works, note that the term $\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V}(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1}\mathbf{V}^{\mathrm{T}}$ in eq. 6.4 represents the maximum likelihood projection of the spectrum of the unknown sample onto the scores subspace. Since the loadings were obtained in the calibration stage by applying MLPCA to $[\mathbf{R}\,|\,\mathbf{y}]$, they contain information about the concentrations of the calibration samples. Therefore, if the last diagonal element of $\boldsymbol{\Sigma}_{unk}$ is given a value numerically equivalent to infinity, the last element of the augmented vector $[\mathbf{r}_{unk}\,|\,0]$ is ignored when the spectrum $\mathbf{r}_{unk}$ is projected, which makes it possible to predict the concentration. For this reason the value in the last position of $[\mathbf{r}_{unk}\,|\,0]$ is unimportant and is set to 0. If the concentrations of several analytes or properties are to be predicted, it is only necessary to increase the number of zeros in the augmented vector in eq. 6.4 and the number of values numerically equivalent to infinity at the end of the diagonal of $\boldsymbol{\Sigma}_{unk}$.
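The effect of the "infinite" variance can be checked numerically. In the sketch below the last variance is set to `np.inf`, so its inverse is exactly zero and the placeholder in the last position of the augmented vector has no influence on the projection. Ordinary SVD on noise-free simulated data replaces MLPCA, and the function name `mllrr_predict` is hypothetical; both are assumptions made only for illustration.

```python
import numpy as np

def mllrr_predict(r_unk, var_r, V_aug):
    """Predict y by projecting [r_unk | 0] with an infinite variance on the last element.

    r_unk : (n,) spectrum;  var_r : (n,) spectral error variances
    V_aug : (n+1, p) loadings from the decomposition of [R | y]
    """
    a = np.append(r_unk, 0.0)                    # augmented vector [r_unk | 0]
    var = np.append(var_r, np.inf)               # last variance numerically infinite
    W = V_aug / var[:, None]                     # S^-1 V; the last row becomes zero
    t = np.linalg.solve(V_aug.T @ W, W.T @ a)    # maximum likelihood scores
    return (V_aug @ t)[-1]                       # last element of [r_unk | y_hat]

# Noise-free simulated calibration set with two components (hypothetical data).
rng = np.random.default_rng(0)
C = rng.uniform(0.1, 1.0, size=(12, 2))
S = rng.uniform(0.0, 1.0, size=(2, 15))
A = np.column_stack([C @ S, C[:, 0]])            # augmented matrix [R | y]
V_aug = np.linalg.svd(A, full_matrices=False)[2][:2].T   # plain SVD in place of MLPCA

c_new = np.array([0.3, 0.7])                     # true concentrations of a new sample
y_hat = mllrr_predict(c_new @ S, np.full(15, 0.01), V_aug)
print(y_hat)                                     # should reproduce c_new[0]
```

Predicting several properties at once would only require appending more zeros to the augmented vector and more infinite variances, as described above.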

6.3 Prediction errors

In this chapter, the prediction error (root mean squared error of prediction, RMSEP) of the multivariate calibration methods using MLS or MLR has been divided into two classes. The observed prediction error compares the values predicted by the multivariate technique with those measured by the reference method. The true prediction error compares the predicted values with the true reference values, that is, the error-free values that are unknown in practice (see equation 9 in section 6.4).

The prediction ability of a multivariate calibration model is usually measured by the observed prediction error. Since this is the only empirical way of establishing how well a multivariate model predicts the concentration (or any other property of interest) of unknown samples, much effort has gone into improving the prediction of the properties of interest and, consequently, into minimising the observed RMSEP. However, the concentrations of the calibration and validation samples are affected by measurement errors which, depending on the reference analytical method used, can be very large.3-5 For this reason, a low observed prediction error does not necessarily mean that the multivariate calibration model is the one that best predicts the true concentrations of the unknown samples (which cannot be observed experimentally).
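The distinction between the two errors is easy to see in a simulation, where the error-free values are available. The sketch below is illustrative only: the noise levels, the sample size and the helper `rmsep` are assumptions, and the "predictions" are simply true values plus a small random error rather than the output of a calibration model.

```python
import numpy as np

def rmsep(predicted, reference):
    """Root mean squared error of prediction against a set of reference values."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.sqrt(np.mean((predicted - reference) ** 2))

rng = np.random.default_rng(1)
y_true = np.linspace(1.0, 10.0, 50)                   # error-free values (simulation only)
y_ref = y_true + rng.normal(0.0, 0.5, y_true.size)    # values reported by the reference method
y_pred = y_true + rng.normal(0.0, 0.1, y_true.size)   # hypothetical model predictions

rmsep_observed = rmsep(y_pred, y_ref)    # what can be computed in practice
rmsep_true = rmsep(y_pred, y_true)       # only available when true values are known
print("observed RMSEP:", rmsep_observed)
print("true RMSEP:    ", rmsep_true)
```

With an imprecise reference method, the observed RMSEP mostly reflects the reference errors, so it can say little about how close the predictions are to the true concentrations.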


Section 6.4 shows that applying MLS in the regression stage of multivariate calibration techniques gives predictions of the concentrations of the calibration samples that are closer to the true values than those obtained with MLR. This is because MLS considers the standard deviations due to the errors made in measuring the calibration samples. The regression model estimated by MLS therefore fits best those experimental values with the smallest standard deviations (for which, in theory, the experimental measurement error is smaller) and which are therefore closer to the true values. Unfortunately, this true prediction ability cannot be measured in real data sets because the true reference values are unknown. This is why this chapter works with simulated data, which do allow the true prediction error to be calculated, and hence compared, for the different multivariate calibration techniques using the MLR and MLS regression methods.



6.4 Application of multivariate least squares regression

method to PCR and maximum likelihood PCR techniques

(Journal of Chemometrics, submitted for publication).





ABSTRACT

Reference analytical methods that provide correct concentration

values are essential for building valid multivariate calibration models. In

some cases reference analytical methods provide concentration values with

high levels of uncertainty, which may lead to the construction of incorrect

multivariate calibration models. This paper presents a multivariate least

squares regression method (MLS) for regressing the reference concentration

values on the scores from the decomposition of the spectroscopic data on p

factors or principal components. It considers the uncertainties in the

reference concentration values and/or those in the spectroscopic

measurements. We have replaced the traditional ordinary least squares

regression method (OLS, also known as multiple linear regression, MLR) in

both principal component regression (PCR) and maximum likelihood

principal component regression (MLPCR) by the MLS regression method.

We have compared prediction errors of the true concentration values from

the validation step using the MLS regression method and those obtained

using OLS. The true prediction errors are greatly improved when the MLS


technique is applied to the multivariate calibration methods for real and

simulated data sets.

INTRODUCTION

Over the last few years multivariate calibration methods have been

used as an alternative to well-established analytical techniques, because

they allow fast and reliable predictions of the concentration of the analyte

of interest in unknown samples with interferences, and this makes them

useful for routine analysis. For this reason, a wide variety of multivariate

calibration techniques have appeared. These include multiple linear

regression (MLR),1 principal component regression (PCR),2 partial least

squares (PLS)3 and latent root regression (LRR).4 The suitability of each

technique depends on the specific chemical problem and the characteristics

of the experimental data. On the other hand, reference methods usually

require longer analysis times and are more expensive. Moreover, in cases in

which the analytical problem or the analytical methodology is complex, the

uncertainty associated with the estimated concentrations in the calibration

set is high.5-7 Consequently, some reference concentration values may be

affected by high measurement errors.

Multivariate calibration models constructed with reference

concentration values that contain high random errors may not be correct.

This can lead to high prediction errors from the validation step and, more

importantly, from future working samples because the ordinary least

squares regression method (OLS, also known as multiple linear regression,

MLR) used in multivariate calibration techniques such as PCR or PLS to

regress the reference concentration values on the scores from the

decomposition of the spectroscopic data on the first p factors or principal

components, considers neither the uncertainty (i.e. level of precision) in the

reference concentrations nor the uncertainty in the spectroscopic data. For



this reason, we present a multivariate least squares (MLS) regression

method that can account for the individual uncertainties in either the

spectroscopic data and/or the reference concentration values in order to

estimate the regression coefficients of a p+1 dimensional model correctly.

This technique may be considered an extension of the bivariate least

squares regression method (BLS)8 used in univariate calibration. In this

way, the final multivariate regression model assigns minor weights (i.e. less

importance) to the most imprecise reference concentration and/or

spectroscopic values.

We applied the MLS regression method in the regression stage to

both the PCR and MLPCR9 techniques. In PCR the dimensionality

reduction of the spectroscopic data on p PCs is carried out without

considering the uncertainties from replicate measurements. This is why the

MLS regression method in this case only considers the uncertainties of the

reference concentration values in the calibration set to estimate the

coefficients of the multivariate calibration model. On the other hand,

MLPCR considers the uncertainties of the spectroscopic data projected onto

the p-dimensional scores subspace when performing the dimensionality

reduction. This produces estimates of the spectroscopic measurements that

are more likely to be experimentally observed (i.e. maximum likelihood

estimates). If the OLS method is used in the regression step after the

maximum likelihood decomposition,10 neither the uncertainties in the

scores nor those in the reference concentration values will be taken into

account, which is not optimal from a maximum likelihood point of view. A

maximum likelihood multivariate calibration method based on the latent

root regression technique (MLLRR) was therefore developed9 to account for

the uncertainties in both the spectroscopic and the concentration

measurements. However, MLLRR is more cumbersome and less intuitive

than the MLPCR technique, since it simultaneously performs the

dimensionality reduction and the regression steps on an augmented matrix



containing both spectroscopic and concentration values. LRR also has this

disadvantage, which is one of the reasons why this multivariate calibration

method has been virtually ignored in chemistry, compared to other

techniques like PCR. In this paper we prove that true prediction errors

(committed when predicting the true but unobservable concentration

values) using the MLS method in the regression stage of the calibration

process, are similar to those from MLLRR and lower than those from the

conventional OLS (MLR) technique in the regression stage.

To simplify the maximum likelihood spectroscopic decomposition,

we have assumed uncorrelated measurement errors (i.e. diagonal

covariance matrices) in the spectroscopic measurements throughout this

paper. Although this assumption is not correct for real experimental data,

MLPCR has proved its potential under these assumptions.9,10 Moreover, it

has recently been shown that, although it raises a few practical problems,

MLPCR can also account for correlated measurement errors.11 In any case,

MLS can easily be applied to the regression step independently from the

measurement error assumptions considered in the maximum likelihood

spectroscopic decomposition. However the MLS regression technique does

have some limitations and these should be pointed out from the beginning.

Firstly, inherent to any technique dealing with uncertainties, only the

estimates of the exact measurement error variances are available through

replicate measurements. A second limitation is related to the projection of

the individual uncertainties of the spectroscopic data on the scores

subspace. In this paper we have calculated the projections using the theory

of propagation of errors, which considers that the eigenvectors (i.e.

principal components) are error-free. Although we know that eigenvectors

are also affected by uncertainty, the projection of the spectroscopic

uncertainties into the scores subspace provides an accurate measure of the

precision of the replicate measurements12 which the MLS method needs in

the regression step. We have used one real data set and another simulated

one to demonstrate the advantages of the MLS method over the



conventional OLS technique when used in the regression step of the

multivariate calibration methods considered.


Notation

We have used bold uppercase characters to denote matrices, bold

lowercase characters to denote vectors and italic lowercase characters to

denote scalars. Since the MLS regression method can be applied to both

PCR and MLPCR we have made no distinction between the matrices used

by these two multivariate calibration methods. The singular value

decomposition of the m×n matrix R containing the spectroscopic values in a

p-dimensional subspace yields an m×p score matrix T and an n×p loading

matrix V. The scalar p is the pseudorank of R, or the number of observable

components in the mixtures of the calibration set. The individual spectroscopic uncertainties for the ith sample are contained in the corresponding diagonal n×n covariance matrix $\boldsymbol{\Sigma}_i$ (uncorrelated errors are considered). The projection of this diagonal covariance matrix onto the scores subspace yields the p×p diagonal matrix $\mathbf{Z}_i$. In addition, we have used the caret ‘∧’ to distinguish between the measured and the predicted concentrations, $y_i$ and $\hat{y}_i$ respectively.

Multivariate Least Squares

Multivariate least-squares (MLS), which will be used in the

regression stage to build the multivariate model, is a regression technique

that can be applied on multivariate data considering their individual

uncertainties. Of all the accurate approaches for calculating the coefficients

of the regression model, we selected Lisý’s method8 because of its speed in

estimating the correct results for the regression coefficients and the



simplicity of programming its algorithm. In the particular case of

multivariate calibration, this regression method assumes a linear model of

the form:

$$y_i = q_1 + \mathbf{t}_i\mathbf{q}_2 + e_i \qquad (1)$$

where yi is the ith element (i=1,...,m) of the m×1 vector y of analyte

concentrations and ti is the ith row of matrix T. The term q1 represents the

first element of the (p+1)×1 regression vector q. Vector q2 contains the

remaining p elements of q. These values are, respectively, the intercept and

the slopes of the regression hyperplane that best fits the m points defined

by the coordinates (ti1, ..., tip,yi), in a p+1 dimensional space taking into

account the uncertainties in the spectroscopic and/or reference

concentration measurements. The m×1 vector e contains the residual errors between the observed and estimated concentration values with $e_i \sim N(0, \sigma_{e_i}^2)$, as expressed in eq. 2:

$$e_i = y_i - \hat{y}_i = y_i - q_1 - \mathbf{t}_i\mathbf{q}_2 \qquad (2)$$

The estimate of $\sigma_{e_i}^2$ is $s_{e_i}^2$, which will be referred to as the weighting factor, expressed as the residual variance of the ith point (ti1, ..., tip, yi). The weighting factor can be expressed using Taylor series, even when the covariances between the scores and the concentration values in the calibration set are not zero:

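$$ s_{e_i}^2 = s_{y_i}^2 + \sum_{j=1}^{p} q_{2j}^2 \, \mathrm{diag}(\mathbf{Z}_i)_j - 2 \sum_{j=1}^{p} q_{2j} \, \mathrm{cov}(y_i, t_{ij}) + 2 \sum_{j=1}^{p} \sum_{l=j+1}^{p} q_{2j} q_{2l} \, \mathrm{cov}(t_{ij}, t_{il}) \qquad (3) $$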



In MLPCR the p×p matrix Zi contains the uncertainties of the spectroscopic

measurements projected onto the scores subspace. By error propagation

this can be expressed as:

$$ \mathbf{Z}_i = \left( \mathbf{V}^{\mathrm{T}} \boldsymbol{\Sigma}_i \mathbf{V} \right)^{-1} \qquad (4) $$

Estimates of the uncertainties from eq. 4 are, however, approximate. This is

because this equation does not consider the uncertainty present in the

eigenvectors due to the spectral decomposition. However this expression is

reported to give a good idea of the precision of the spectral replicate

measurements.12 This is the information that the MLS regression method

takes into account to find the regression coefficients of the multivariate

calibration model. If, however, no uncertainties are associated with the

spectroscopic measurements (as PCR assumes), Zi reduces to

a matrix of zeros. Therefore, the expression for the weighting factor (eq. 3)

simplifies to $s_{e_i}^2 = s_{y_i}^2$.
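As an illustration of eq. 3, the weighting factor of a single calibration point could be computed as in the following minimal sketch. The function name `weighting_factor` and its argument names are hypothetical and are not part of the original Matlab implementation; covariance terms are optional and treated as zero when omitted.

```python
def weighting_factor(s2_y, z_diag, q2, cov_ty=None, cov_tt=None):
    """Residual variance s_e^2 of one calibration point, following eq. 3.

    s2_y   : variance of the reference concentration y_i
    z_diag : diagonal of Z_i (variances of the p scores of sample i)
    q2     : the p slopes of the regression hyperplane
    cov_ty : optional list with cov(y_i, t_ij), one value per score
    cov_tt : optional p x p nested list with cov(t_ij, t_il)
    """
    p = len(q2)
    # Variance of the concentration plus the variance propagated from each score.
    s2_e = s2_y + sum(q2[j] ** 2 * z_diag[j] for j in range(p))
    # Score-concentration covariances enter with a negative sign.
    if cov_ty is not None:
        s2_e -= 2.0 * sum(q2[j] * cov_ty[j] for j in range(p))
    # Score-score covariances are counted once per pair (l > j).
    if cov_tt is not None:
        for j in range(p):
            for l in range(j + 1, p):
                s2_e += 2.0 * q2[j] * q2[l] * cov_tt[j][l]
    return s2_e
```

With `z_diag` set to zeros and no covariances, the function returns `s2_y` unchanged, which is the PCR case described above.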

In this way, the MLS method takes into account each of the

individual uncertainties in the weighting factors $s_{e_i}^2$ to find the regression

coefficients of the multivariate calibration model. In other words, the

regression hyperplane is better fitted to those points with lower

uncertainties (lower measurement errors and higher precision), in such a

way that minimises the sum of weighted residuals, S, expressed as:

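$$ S = \sum_{i=1}^{m} \frac{1}{s_{e_i}^2} \left( y_i - \hat{y}_i \right)^2 = \sum_{i=1}^{m} \frac{1}{s_{e_i}^2} \left( y_i - q_1 - \mathbf{t}_i \mathbf{q}_2 \right)^2 \qquad (5) $$

The variable $s^2$, obtained by dividing S by the residual degrees of freedom, is the estimate of the residual mean square error, also known as

experimental error, and provides a measure of the dispersion of the data

points around the regression hyperplane. By minimising the sum of the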



weighted residuals S in relation to the regression coefficients in q, p+1

nonlinear equations are obtained. By including the partial derivatives of the

squared residuals, eq. 6 can be written in the equivalent matrix form

expressed in eq. 7:

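$$ \mathbf{B}\mathbf{q} = \mathbf{g} \qquad (6) $$

$$
\begin{bmatrix}
\sum_{i=1}^{m} \dfrac{1}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i1}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i2}}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m} \dfrac{t_{ip}}{s_{e_i}^2} \\
\sum_{i=1}^{m} \dfrac{t_{i1}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i1}^2}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i1} t_{i2}}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m} \dfrac{t_{i1} t_{ip}}{s_{e_i}^2} \\
\sum_{i=1}^{m} \dfrac{t_{i2}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i1} t_{i2}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i2}^2}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m} \dfrac{t_{i2} t_{ip}}{s_{e_i}^2} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum_{i=1}^{m} \dfrac{t_{ip}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i1} t_{ip}}{s_{e_i}^2} & \sum_{i=1}^{m} \dfrac{t_{i2} t_{ip}}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m} \dfrac{t_{ip}^2}{s_{e_i}^2}
\end{bmatrix}
\times
\begin{bmatrix} q_1 \\ q_{21} \\ q_{22} \\ \vdots \\ q_{2p} \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^{m} \left( \dfrac{y_i}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial q_1} \right) \\
\sum_{i=1}^{m} \left( \dfrac{y_i t_{i1}}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial q_{21}} \right) \\
\sum_{i=1}^{m} \left( \dfrac{y_i t_{i2}}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial q_{22}} \right) \\
\vdots \\
\sum_{i=1}^{m} \left( \dfrac{y_i t_{ip}}{s_{e_i}^2} + \dfrac{1}{2} \left( \dfrac{e_i}{s_{e_i}^2} \right)^2 \dfrac{\partial s_{e_i}^2}{\partial q_{2p}} \right)
\end{bmatrix}
\qquad (7)
$$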

where the vector containing the estimates of the slopes of the regression

hyperplane is $\mathbf{q}_2 = [q_{21}, \ldots, q_{2p}]$. The regression coefficients (i.e. the elements of

vector q), can be determined by carrying out an iterative process on the

following matrix form:

$$ \mathbf{q} = \mathbf{B}^{-1}\mathbf{g} \qquad (8) $$
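The iterative use of eq. 8 can be sketched as follows. This is only an illustrative outline, not the published algorithm of Lisý et al. nor the authors' Matlab code: it assumes that every Zi is diagonal and that all covariances in eq. 3 are zero, it starts from q = 0, and it stops when the coefficients change by less than a tolerance. The names `solve_linear` and `mls_fit`, the starting point and the stopping rule are assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mls_fit(t, y, s2_y, z, tol=1e-10, max_iter=200):
    """Iterate q = B^-1 g (eq. 8) until the coefficients stop changing.

    t    : m x p scores            y : m concentrations
    s2_y : m concentration variances (must be positive)
    z    : m x p score variances (diagonals of the Z_i matrices)
    Returns [q1, q21, ..., q2p].
    """
    n_samples, p = len(y), len(t[0])
    x = [[1.0] + list(row) for row in t]   # leading 1 carries the intercept
    q = [0.0] * (p + 1)                    # assumed starting point
    for _ in range(max_iter):
        b = [[0.0] * (p + 1) for _ in range(p + 1)]
        g = [0.0] * (p + 1)
        for i in range(n_samples):
            # Weighting factor (eq. 3 without covariance terms) and residual (eq. 2).
            w = s2_y[i] + sum(q[j + 1] ** 2 * z[i][j] for j in range(p))
            e = y[i] - sum(x[i][k] * q[k] for k in range(p + 1))
            for k in range(p + 1):
                # Derivative of w: zero for the intercept, 2*q*z for a slope.
                dw = 0.0 if k == 0 else 2.0 * q[k] * z[i][k - 1]
                g[k] += x[i][k] * y[i] / w + 0.5 * (e / w) ** 2 * dw
                for l in range(p + 1):
                    b[k][l] += x[i][k] * x[i][l] / w
        q_new = solve_linear(b, g)
        if max(abs(u - v) for u, v in zip(q_new, q)) < tol:
            return q_new
        q = q_new
    return q
```

Convergence is not guaranteed for every data set with this simple scheme; in practice a sensible starting point (for example the OLS solution) and a cap on the number of iterations are advisable.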

With this method, the variance-covariance matrix of the regression

coefficients is obtained by multiplying the final matrix $\mathbf{B}^{-1}$ by the estimate

of the experimental error $s^2$ (eq. 5). It should be pointed out that if $s_{e_i}^2$

were constant for all the samples (i.e. there were only homoscedastic errors in

the concentrations and the spectroscopic uncertainties were neglected), the

expressions obtained would be the same as if the OLS (MLR) regression

method were applied.
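The reduction to OLS can be checked directly from eqs. 6 to 8. The short derivation below is a sketch under the stated assumption of a constant weighting factor; the symbols $s_0^2$ and $\mathbf{X}$ are introduced here only for illustration and do not appear in the original text.

```latex
% Assumption: s_{e_i}^2 = s_0^2 for every sample and independent of q,
% so every partial derivative in g vanishes.
% X = [1  T] is the m x (p+1) matrix whose first column is all ones.
\begin{aligned}
\mathbf{B} &= \frac{1}{s_0^2}\,\mathbf{X}^{\mathrm{T}}\mathbf{X}, \qquad
\mathbf{g} = \frac{1}{s_0^2}\,\mathbf{X}^{\mathrm{T}}\mathbf{y}, \\
\mathbf{q} &= \mathbf{B}^{-1}\mathbf{g}
  = \left(\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{y}.
\end{aligned}
```

The constant $s_0^2$ cancels, which leaves the usual OLS normal equations and requires no iteration.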

The MLS regression model extracts more information from data

pairs that are supposed to have lower measurement errors (lower

uncertainties). It should be noted that the linear model assumed by the OLS

regression method in eq. 1 considers a zero intercept when both the

spectroscopic and the concentration values are mean centred. This is not

the case with centred data for the MLS regression method, where the

regression coefficients, and therefore the intercept, depend on the

uncertainties in the concentration values and/or in the spectroscopic data

(which are projected onto the scores subspace according to eq. 4). In this

way, the MLS regression hyperplane will fit the most precise data pairs

better, which does not necessarily ensure a zero intercept, especially when

highly heteroscedastic data are handled.

Prediction Errors

In this paper we have distinguished between two types of

prediction errors. The first of these refers to the true but unobservable

concentration values (true prediction error), which can be expressed as:

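$$ \mathrm{RMSEP}_{\mathrm{true}} = \sqrt{ \frac{ \sum_{i=1}^{N_{\mathrm{test}}} \left( \hat{y}_i - \eta_i \right)^2 }{ N_{\mathrm{test}} } } \qquad (9) $$

where $\eta_i$ is the true concentration value of a given component in the ith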

sample and Ntest is the number of samples used for validation. Although,

theoretically speaking, the true concentration values are unknown, we have

considered values obtained with a high degree of accuracy, such as those



from the dilution of stock solutions in data set 1, to be the true

concentrations. To show how the MLS regression method can improve the

true prediction error obtained with OLS, we have generated new

spectroscopic and concentration data using the Monte Carlo simulation

method.13,14 This method requires that the error-free spectral matrix and the

corresponding concentration values be known, which was only possible in

data set 1. This simulation method generates the new spectral matrices by

adding random errors to the error-free spectral matrix based on the

individual uncertainty (i.e. standard deviation) of the spectroscopic

measurements at each wavelength. Analogously, the new concentrations

are generated by adding a random error to the error-free concentrations

that depends on the set standard deviation. In this way the new

spectroscopic and concentration values with higher measurement errors

will be those with higher standard deviations and vice versa. Since the MLS

regression method gives a greater weight to the spectroscopic (scores) and

concentration values with lower standard deviations (the most similar to

the true ones) and vice versa, predictions of the true concentration values

are better than with the OLS regression method.
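One Monte Carlo realisation of the kind described above could be generated as in the following sketch. It assumes normally distributed errors, which the text does not state explicitly for the simulation, and the function name `add_noise` is hypothetical.

```python
import random


def add_noise(error_free, std_devs, rng):
    """Return one simulated realisation of a list of error-free values.

    Each value receives an independent random error drawn from a normal
    distribution (an assumption) with its own standard deviation.
    """
    return [v + rng.gauss(0.0, s) for v, s in zip(error_free, std_devs)]


# Usage: one noisy spectrum and one set of noisy concentrations.
rng = random.Random(2001)                      # fixed seed, for repeatability
spectrum = add_noise([0.10, 0.35, 0.20], [0.01, 0.02, 0.01], rng)
concentrations = add_noise([0.5, 1.0], [0.025, 0.05], rng)   # 5% of each value
```

Values with a larger standard deviation are, on average, displaced further from their error-free counterparts, which is what allows the weighting in MLS to favour the more reliable points.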

The second type of prediction error we considered measures the

ability of the multivariate model to predict the measured concentrations

(observed prediction error, RMSEP). It is analogous to the one in eq. 9, but

takes into account the measured concentration values $y_i$ instead of the true

ones.
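Both prediction errors share the form of eq. 9 and differ only in the reference values used. A minimal sketch, with the hypothetical function name `rmsep`:

```python
import math


def rmsep(predicted, reference):
    """Root mean square error of prediction (eq. 9).

    Passing the true concentrations as `reference` gives the true RMSEP;
    passing the measured concentrations gives the observed RMSEP.
    """
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)
```

For example, `rmsep(y_hat, eta)` and `rmsep(y_hat, y)` would return the true and the observed prediction error for the same set of predictions `y_hat`, where `eta` and `y` stand for the true and measured concentrations.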


Data Sets

Data Set 1. This data set, used elsewhere9 to compare the prediction errors

of the MLPCR and MLLRR techniques with those of other

multivariate calibration methods, was downloaded from the P.

Wentzell Research Group web site15. It was obtained through a

carefully designed experiment involving three-component mixtures of

metal ions (Co+2, Cr+3, Ni+2), a system suggested by Osten and Kowalski16.

The three spectral profiles of the pure components are shown in Figure 1a.

The spectra of the metal ion mixtures in Figure 1b contain nonuniform

noise produced by a dichroic band-pass filter placed between the source

and the sample to decrease the source intensity at high and low

wavelengths for all measurements.

To compare the true prediction ability of MLPCR using MLS or OLS

regression methods in the regression stage and MLLRR, 200 calibration and

200 test sets of 128 samples each were generated with the Monte Carlo

simulation method. To generate new calibration and test sets, this

simulation method was applied on an error-free spectral matrix (Fig. 1c)

obtained by multiplying the 128×3 measured concentration matrix by

the 3×150 pure component spectra matrix (Fig. 1a). The random errors that

were added to these error-free spectra matrices to obtain the new noisy

spectra matrices had the standard deviation profile shown in Figure 1d.

The new concentration values for the calibration and test sets were also

generated by applying the Monte Carlo method to the measured

concentrations. As stated earlier, because the measured concentration

values were obtained by diluting the stock solutions, we may assume that

they are very similar to the true ones (i.e. very low measurement errors are

committed during dilution). Since no standard deviations were available

for the measured concentration values, uncertainty levels of 1%, 5%, 10%,

15% and 20% were considered for generating the new concentration values

for each simulated calibration and test set.


[Figure 1, panels (a) to (d): absorbance versus wavelength (nm), 350-650 nm; legend of panel (a): Cr, Co, Ni.]

Figure 1. Spectral profiles for data set 1: (a) Pure component spectra for the three metal ions, (b) noisy spectra for metal ion mixtures used for calibration, (c) Error-free spectra for metal ion mixtures used for calibration and (d) standard deviation profile of the noisy spectra matrix.

Data Set 2. To compare the performance of the MLS regression

method with OLS (MLR) when applied to MLPCR and MLLRR and also to

PCR, a simulated data set was generated in a similar way to that reported

elsewhere9. This data set reproduces the spectral profiles of a three-component

mixture. The pure spectra comprise three Gaussian curves, one for

each component, centred at 480, 500 and 520 nm, respectively. The width

(standard deviation) of each curve is 20 nm. Spectroscopic measurements

were taken every 5 nm within the wavelength range from 400 to 600 nm

(Figure 2a). Reference concentration values were randomly generated from

a uniform distribution between 0 and 1 for each one of the three

components.

The calibration and validation sets are made up of 20 and 100

samples respectively. The Monte Carlo simulation method is used to

randomly generate 10,000 calibration and test sets by adding measurement


errors to the error-free spectral matrices and error-free concentration values

of each of the three components. Figure 2b shows the spectral profile of the

error-free calibration matrix. Measurement errors for the spectral

measurements had both constant and proportional terms. The standard

deviation of the constant term was 1% of the maximum value of the

spectroscopic measurements. The proportional term was set at 2% of the

error-free spectroscopic values. The standard deviation σij (1≤i≤m and

1≤j≤n) for each spectroscopic measurement Aij was obtained from the

expression $\sigma_{ij} = \sqrt{0.01^2 + (0.02 A_{ij})^2}$. Since this standard deviation

structure is much less complex than the one in data set 1 (Figure 1d), the

maximum likelihood spectral decomposition algorithm is much faster. This

allows a dramatic increase in the number of iterations, from 200 to 10,000.
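The construction of data set 2 can be sketched as follows. Unit peak height for the Gaussian bands and linear mixing of the pure spectra are assumptions made for illustration, the square root in `noise_sd` is one reading of the standard deviation expression above, and all function and constant names are hypothetical.

```python
import math
import random

WAVELENGTHS = list(range(400, 601, 5))   # 400-600 nm, one point every 5 nm
CENTRES = (480.0, 500.0, 520.0)          # band centre of each component (nm)
WIDTH = 20.0                             # standard deviation of each band (nm)


def pure_spectra():
    """3 x n matrix of Gaussian bands (unit height is an assumption)."""
    return [[math.exp(-0.5 * ((w - c) / WIDTH) ** 2) for w in WAVELENGTHS]
            for c in CENTRES]


def random_concentrations(n_samples, rng):
    """n_samples x 3 concentrations drawn uniformly between 0 and 1."""
    return [[rng.random() for _ in CENTRES] for _ in range(n_samples)]


def error_free_spectra(conc, pure):
    """Mixture spectra as concentrations times pure spectra (assumed linear)."""
    n_points = len(pure[0])
    return [[sum(c * s[j] for c, s in zip(row, pure)) for j in range(n_points)]
            for row in conc]


def noise_sd(a_ij):
    """Standard deviation with a constant and a proportional (2%) term."""
    return math.sqrt(0.01 ** 2 + (0.02 * a_ij) ** 2)
```

A calibration set would then be `error_free_spectra(random_concentrations(20, rng), pure_spectra())`, with `noise_sd` applied element by element to obtain the standard deviation of each simulated measurement.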

To show the differences between the prediction errors from the

multivariate calibration techniques using OLS (MLR) and MLS regression

methods and the MLLRR technique, proportional errors with standard

deviations (σy) of 1, 5, 10, 15 and 20 percent, respectively, were added to the

error-free concentration values in both calibration and test sets.

[Figure 2, panels (a) and (b): plotted against wavelength (nm), 400-600 nm; panel (a) labels the three pure components 1, 2 and 3.]

Figure 2. Spectral profiles for data set 2: (a) Simulated pure components spectra and (b) error-free spectra of the simulated mixtures used for calibration.



Computational Aspects

Our calculations were performed with a Pentium III-based personal

computer with 64 MB of memory and a clock speed of 500 MHz. All the

algorithms were written in Matlab (Matlab for Microsoft Windows ver. 5.2,

The Mathworks, Inc., Natick, MA).


Data Set 1. Figure 3 shows the true errors of prediction for the three

components (Cr+3, Ni+2 and Co+2). They are the mean values of 200

iterations using the Monte Carlo simulation method for each standard

deviation value associated with the reference concentration values. In all

cases the optimal number of factors was 3.

[Figure 3, panel Cr+3: true RMSEP (×10⁻⁴) versus % standard deviation in concentration (1, 5, 10, 15, 20); series: MLPCR(MLS), MLPCR(OLS), MLLRR.]

Figure 3. Mean values of the 200 true RMSEPs for Cr+3 generated for each standard deviation level of the measurement errors added to the error-free concentrations.


[Figure 3 (cont.), panels Ni+2 and Co+2: true RMSEP versus standard deviation level (1, 5, 10, 15, 20).]

Figure 3 (cont.). Mean values of the 200 true RMSEPs for Ni+2 and Co+2 generated for each standard deviation level of the measurement errors added to the error-free concentrations.

As expected, the true prediction errors increase as the standard

deviations of the measurement errors added to the error-free concentration

values increase in all three cases. Predictions of the true concentration

values using MLPCR with MLS and MLLRR are much better than those



with MLPCR when using the OLS regression method. These results are

logical because in the regression step both MLPCR with MLS and MLLRR

take into account the uncertainties of the scores (from the maximum

likelihood decomposition of the spectra matrix) and the reference

concentration values. In this way, the regression models of the three

analytes give a larger weight (i.e. have a better fit) to the concentration

values in the calibration set with lower uncertainties, which are the most

similar to the true concentration values. Although in this case the

differences are small, MLLRR produced higher true prediction errors than

MLPCR with MLS. The tiny differences in the prediction errors between

the two techniques arise because MLLRR implicitly assumes that the

regression model has zero intercept (q1 in eq. 1). As stated in the previous

section, this is not the same for the MLS regression method, since

uncertainties considered in both spectroscopic and concentration values

make the regression hyperplane fit those data pairs with lower individual

uncertainties better, which does not ensure a 0 intercept term in eq. 1.

Results from PCR have been omitted for this data set because the

highly heteroscedastic uncertainty structure of its measurement errors

makes PCR unsuitable as a multivariate calibration method.

This is confirmed by the poor prediction ability of PCR, which has been

thoroughly discussed elsewhere9.

Data Set 2. Figure 4 shows the mean true errors of prediction

(considering 10,000 iterations) from five multivariate calibration techniques

using both MLS and OLS (MLR) regression methods for the first and

second components in the three-component mixtures. True prediction

errors for the third component in the mixtures are omitted because by

symmetry they are statistically equivalent to those for the first component.

In all cases, the optimum number of PCs was 3. The lowest prediction

errors were always produced by MLPCR using the MLS regression



method. This is because when these two techniques are combined, greater

importance is given to those scores and concentration values with lower

uncertainties (the most similar to the true ones) in the estimation of the

calibration model coefficients. In this way, the calibration model provides

better predictions of the true concentration values. The tiny differences in

the prediction errors between MLPCR with MLS and MLLRR arise for the

same reason as in data set 1. Results for PCR with the MLS regression

method were similar to MLPCR with MLS or MLLRR because

measurement errors in the spectroscopic values in this case are low. This

makes the scores from the singular value decomposition of the maximum

likelihood spectra estimates similar to those from the decomposition of the

measured spectra values.

[Figure 4, panel Component 1: true RMSEP versus standard deviation level (1, 5, 10, 15, 20); series: MLPCR(MLS), MLPCR(OLS), MLLRR, PCR(MLS), PCR(OLS).]

Figure 4. Mean values of the 10,000 true RMSEPs from the three simulated component mixtures generated for each standard deviation level of the measurement errors added to the error-free concentrations.


[Figure 4 (cont.), panel Component 2: true RMSEP versus standard deviation level (1, 5, 10, 15, 20).]

Figure 4 (cont.). Mean values of the 10,000 true RMSEPs from the three simulated component mixtures generated for each standard deviation level of the measurement errors added to the error-free concentrations.

On the other hand, the highest prediction errors, especially in the σy

range between 5% and 20%, were provided by PCR and MLPCR using the

OLS regression method. This is because OLS does not account for the

uncertainties in the reference concentration values. In this way,

concentration values with high measurement errors are considered in the

calibration step, which degrades the prediction ability of the final

calibration model. The differences between the prediction errors of the two

multivariate calibration techniques are similar to those when using the

MLS regression method, for the same reasons as with data set 1.

Figure 5 shows the mean values of the observed RMSEPs from the 10,000

iterations. These prediction errors are higher than the true ones shown in

Figure 4.


[Figure 5, panels Component 1 and Component 2: observed RMSEP versus standard deviation level (1, 5, 10, 15, 20).]

Figure 5. Mean values of the 10,000 observed RMSEPs from the three simulated component mixtures generated for each standard deviation level of the measurement errors added to the error-free concentrations.

It is therefore clear that multivariate calibration methods provide

better predictions of the true concentration values than of the

experimentally observed ones. In other words, predictions of the true

values from the multivariate calibration methods are more accurate than



those from the reference method17, because random errors in the

concentration and/or spectroscopic measurements are averaged by the

calibration models. This is especially clear for the multivariate calibration

methods using the MLS regression technique. This regression method

extracts more information from the concentration values with lower

uncertainties, which are the most similar ones to the true concentration

values. Moreover, although in this example the observed RMSEPs from

MLS are lower than those from OLS, this is not always the case. The MLS

regression model estimated by considering the uncertainties does not

necessarily improve the prediction errors for the measured reference

concentrations since these reference concentrations contain measurement

errors. The concentrations estimated with MLS are more similar to the true

values than to the measured reference concentration values. For this reason,

there may be cases in which the observed RMSEP from MLS is higher than

the one from OLS (MLR).
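One way to see why the observed prediction error exceeds the true one is the following short derivation. It is a sketch under a simplifying assumption that is not stated in the original text: the measured reference value is written as $y_i = \eta_i + \delta_i$, where the reference error $\delta_i$ has zero mean and variance $\sigma_\delta^2$ and is independent of the prediction error for the test samples.

```latex
% Assumption: y_i = \eta_i + \delta_i, with E[\delta_i] = 0, Var(\delta_i) = \sigma_\delta^2,
% and \delta_i independent of the prediction error (\hat{y}_i - \eta_i).
\begin{aligned}
E\!\left[(\hat{y}_i - y_i)^2\right]
  &= E\!\left[(\hat{y}_i - \eta_i)^2\right]
   - 2\,E\!\left[(\hat{y}_i - \eta_i)\,\delta_i\right]
   + E\!\left[\delta_i^2\right] \\
  &= E\!\left[(\hat{y}_i - \eta_i)^2\right] + \sigma_\delta^2 .
\end{aligned}
```

Under this assumption the squared observed RMSEP is, on average, the squared true RMSEP plus the variance of the reference errors, so a model that approaches the true values more closely need not approach the measured values more closely.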

CONCLUSIONS

This paper presents a new multivariate least squares regression

method (MLS) that estimates the regression model coefficients by taking

into account the uncertainties of the individual values in all the axes. We

have applied this regression method to two types of multivariate

calibration techniques (PCR and MLPCR) to show the prediction ability of

the true and measured concentration values when the uncertainties in both

the scores and the concentration values are considered (MLPCR conditions)

and when only the uncertainties in the concentration values are considered

(PCR conditions). We have compared both the true and observed

prediction errors made with these multivariate calibration techniques using

the MLS regression method to the corresponding prediction errors made

with OLS (MLR) and MLLRR.



Results using the MLS regression method in PCR and MLPCR show

that the prediction error of the true concentration values is considerably

lower than the true prediction error when using OLS. Although the true

RMSEP with MLS is similar to the one with MLLRR, the use of the MLS

regression method with MLPCR is more suitable than MLLRR, since

MLPCR is a simpler and more intuitive multivariate calibration method9.

Moreover, we have also shown that the observed prediction errors

of the measured concentrations are not necessarily lower with MLS than

with OLS. Although this variable is the only measure of prediction ability

for real data, low observed RMSEPs should not be the ultimate goal of the

researcher. Rather, more attention should be paid to constructing

multivariate calibration models that provide the best possible estimates of

the true concentrations, since lower observed RMSEPs do not necessarily

mean a better prediction ability of the true concentration values from the

multivariate calibration model.18

Finally, two important points concerning the MLS regression

method should be noted. Firstly, since uncertainties from the replicate

analysis of the different concentration samples in the calibration set must

be known, a greater experimental effort than with OLS is required.

Secondly, the MLS regression technique is not very robust in the presence

of outliers with low individual uncertainties. It is therefore important to

search for possible outlying samples with low uncertainties in the

concentrations.

ACKNOWLEDGEMENTS

The authors would like to thank the DGICyT (project no. BP96-1008)

for financial support, and the Rovira i Virgili University for providing a

doctoral fellowship to A. Martínez.



BIBLIOGRAPHY

1.- N. Draper and H. Smith, Applied Regression Analysis, 2nd ed.: John Wiley

& Sons: New York, 5-128 (1981).

2.- K. R. Beebe and B. R. Kowalski, Anal. Chem., 59, 1007A-1017A (1987).

3.- S. Wold, Systems Under Indirect Observation, Part II, North Holland

Publishing Co., Amsterdam, 1-54 (1982).

4.- E. Vigneau and D. Bertrand and E. M. Qannari, Chemom. Intell. Lab. Syst.,

35, 231-238 (1996).

5.- J. D. Hall, B. McNeil, M. J. Rollins, I. Draper, B. G. Thompson and G.

Macaloney, Appl. Spectrosc., 50, 102-108 (1996).

6.- T. Fearn, Appl. Stat., 32, 73-79 (1983).

7.- A. H. Aastveit and P. Marum, Appl. Spectrosc., 49, 67-75 (1995).

8.- J.M. Lisý, A. Cholvadová and J. Kutej, Comput. Chem., 14, 189-192 (1990).

9.- P. D. Wentzell and D. T. Andrews, Anal. Chem., 69, 2299-2311 (1997).

10.- P. D. Wentzell, D. T. Andrews, D. C. Hamilton, K. Faber and B. R.

Kowalski, J. Chemom., 11, 339-366 (1997).

11.- P. D. Wentzell and M. T. Lohnes, Chemom. Intell. Lab. Syst., 45, 65-85

(1999).

12.- P. D. Wentzell and D. T. Andrews, Anal. Chim. Acta, 350, 341-352 (1997).


John Wiley & Sons, New York, 145-150 (1993).

14.- O. Güell and J. A. Holcombe, Anal. Chem., 60, 529A-542A (1990).

15.- http://www.dal.ca/~pdwentze/home.htm

16.- D. W. Osten and B. R. Kowalski, Anal. Chem., 57, 908-915

(1985).

17.- R. DiFoggio, Appl. Spectrosc., 49, 67-75 (1995).

18.- U. H. Olsson, S. V. Troye and R. D. Howell, Multivariate Behavioral

Research, 34(1), 31-58 (1999).



6.5 Conclusions

In this chapter a multivariate regression method (MLS) has been developed that considers the uncertainties due to the errors made in measuring the different samples. This regression method has been shown to be easily applicable to the regression stage of two important multivariate calibration techniques, PCR and MLPCR. With MLS, the true prediction errors of both PCR and MLPCR were appreciably lower than those obtained with the MLR regression method.

In the case of MLPCR, the true prediction errors obtained with the MLS regression method are similar to, and even slightly lower than, those obtained with MLLRR. In addition, both the interpretation of the parameters of the multivariate model obtained with MLPCR using MLS and its use in general terms are easier than with the MLLRR method.2

Finally, it should be stressed that although the MLS multivariate regression method provides better true prediction errors, this improvement can only be observed in simulated data, for which the true values of the properties of interest are known. In real data sets, the experimental values obtained with the reference method carry a measurement error that in some cases (for example, in the determination of various properties of gasolines) can be considerable. For this reason, the values of the properties of interest predicted with the MLS regression technique need not be closer to the values obtained with the reference method than those predicted with MLR. Consequently, the observed prediction error obtained with the MLS regression technique may not be better, for a given validation set, than the one obtained with MLR, even though the results obtained with MLS will be closer to the true values than those obtained with MLR.

6.6 References

1.- Wentzell P.D., Andrews D.T., Hamilton D.C., Faber K., Kowalski B.R.,

Journal of Chemometrics, 11 (1997) 339-366.

2.- Wentzell P.D., Andrews D.T., Analytical Chemistry, 69 (1997) 2299-2311.

3.- Hall J.D., McNeil B., Rollins M.J., Draper I., Thompson B.G., Macaloney

G., Applied Spectroscopy, 50 (1996) 102-108.

4.- Fearn T., Applied Statistics, 32 (1983) 73-79.

5.- Aastveit A.H., Marum P., Applied Spectroscopy, 49 (1995) 67-75.

CHAPTER 7

Conclusions

7.1 General conclusions


This chapter presents the general conclusions drawn from this doctoral thesis, based on the objectives set out in section 1.1 of the Introduction. For this reason, the conclusions are discussed following the same structure.

♦Critical review of the linear regression techniques used to estimate the regression coefficients.

Section 1.4.1 of the Introduction presented three of the most widely used regression techniques for cases in which the uncertainties associated with the values on the abscissa (x) are negligible compared with the uncertainties associated with the values on the ordinate (y). These techniques are known as ordinary least squares (OLS), weighted least squares (WLS) and generalised least squares (GLS). To apply these regression techniques correctly, it is essential to take into account the type of uncertainties due to the errors made in measuring the response variables placed on the ordinate.

OLS is the most widely used regression technique because it has a series of well-known mathematical properties (section 1.4.1.1) and because it estimates the regression coefficients simply and quickly. If the uncertainties associated with the values of the response variable are not the same at all the experimental points (heteroscedasticity), this regression method does not provide correct estimates of either the regression coefficients or their variances. Under these experimental conditions the WLS method is an improvement on OLS, since it considers the heteroscedasticity of the response variable. If the correlation between the variances $s_{y_i}^2$ of the various experimental points due to the errors made in measuring the response variable (covariance) must also be considered, the best estimates of the regression coefficients are obtained with the GLS method. These regression methods can be used to compare analytical methods when the uncertainties due to the errors made in measuring the samples with one of the two methods being compared are negligible compared with the uncertainties of the other method. In these cases the experimental values of the more precise method are placed on the abscissa, while the values obtained with the other method must be placed on the ordinate.

On the other hand, when the uncertainties associated with the abscissa are not negligible compared with those of the ordinate, applying the regression methods mentioned above is not statistically justified. In these cases, regression methods that consider the uncertainties of the experimental values on both axes must be used. Section 1.4.1.2 presented three regression methods suited to these experimental conditions. These methods are known as constant variance ratio regression (CVR), orthogonal regression (OR) and bivariate least squares (BLS). If the ratio between the uncertainties of the experimental values of the samples analysed by the two methods is constant, the CVR regression method is advisable, since it estimates the regression coefficients following a maximum likelihood criterion. In the CVR method the variable λ (section 1.4.1.2), which is the ratio of the variances of the experimental values obtained by the two methods, must be fixed. The OR regression method is a particular case of CVR that arises when λ=1. Finally, the BLS regression method is indicated when the uncertainties due to the errors made in analysing the samples with the two methods being compared are different for each of the experimental values on the two axes. Of all the regression methods that consider the individual uncertainties on both axes, the method of Lisý and co-workers was chosen because it not only gives correct estimates of the regression coefficients but also provides the variance-covariance matrix of the regression coefficients, which is very useful for applying various statistical tests. Moreover, unlike the multivariate regression method MLR, the regression method of Lisý and co-workers also makes it possible to estimate the coefficients of the regression hyperplane in a space of more than two dimensions while considering the individual uncertainties associated with the experimental values on all the axes (the MLS multivariate regression method). The importance of having good estimates of the uncertainties of the experimental errors in the BLS and MLS regression methods should be emphasised. This means carrying out enough replicates in the analysis of the different samples. Even so, the uncertainties of the errors made in the experimental measurements, when estimated from replicates, may include sources of variation that are not related to the random errors made in analysing the samples. In these cases, the BLS and MLS regression methods, like all other regression methods, give biased estimates of the regression coefficients. For this reason, these regression methods require the sources of variability that can affect the experimental measurements to be kept under control. The following flow chart outlines the conditions under which the different univariate and multivariate linear regression techniques described in the first chapter of this doctoral thesis can be applied:


Experimental data
    More than two predictor variables?
        Yes: Errors in all the variables?
            Yes: MLS
            No: MLR
        No: Errors in the predictor variable?
            No: Constant errors in the response variable?
                Yes: OLS
                No: Covariance between the y_i?
                    Yes: GLS
                    No: WLS
            Yes: Constant error ratio?
                No: BLS
                Yes: λ=1?
                    Yes: OR
                    No: CVR

Scheme 7.1. Conditions for applying the different linear regression methods according to the uncertainties associated with the predictor and response variables due to the experimental errors made in measuring the samples.
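The decision scheme above can also be written as a small function. This is only an illustrative sketch: the function name `choose_regression_method` and its boolean arguments are hypothetical, and the branch order follows the conditions described in the preceding paragraphs.

```python
def choose_regression_method(multivariate, errors_in_predictors,
                             constant_response_errors=True,
                             correlated_response_errors=False,
                             constant_error_ratio=False, lam=None):
    """Return the acronym of the regression method suggested by the scheme."""
    if multivariate:
        # Several predictor variables: MLS if their errors matter, else MLR.
        return "MLS" if errors_in_predictors else "MLR"
    if not errors_in_predictors:
        # Errors only in the response variable.
        if constant_response_errors:
            return "OLS"
        return "GLS" if correlated_response_errors else "WLS"
    # Errors in both axes.
    if not constant_error_ratio:
        return "BLS"
    return "OR" if lam == 1 else "CVR"
```

For instance, `choose_regression_method(False, True, constant_error_ratio=True, lam=1)` returns `"OR"`, whereas individual uncertainties that differ from point to point on both axes lead to `"BLS"`.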

♦Development and validation of a statistical test to detect lack of fit of the experimental results to the regression line.

Two statistical tests for detecting lack of fit have been developed and validated, as shown in the second chapter of this doctoral thesis. Using simulated data sets it was possible to show that the ability of the F test under BLS regression conditions to detect lack of fit correctly is higher than that of the χ² test. The simulated data sets also led to the conclusion that a fairly large number of replicates in the analysis of the samples is needed to correctly detect lack of fit of the experimental points to the BLS regression line. Although these results are not very applicable under real experimental conditions, it has been possible to describe the advantages and drawbacks of the statistical test developed. Knowing the limitations of these tests for detecting lack of fit of the experimental values to the BLS regression line can provide important additional information when establishing the experimental design to be followed.

♦ Development and validation of mathematical expressions for estimating the probabilities of committing type I and type II errors when applying individual tests to the regression coefficients.

The third chapter dealt with several aspects of applying individual tests to the regression coefficients estimated with the BLS method. It is important to consider the probability of committing a β error when applying these individual tests, since the consequences of accepting high β error probabilities can be very serious, depending on the analytical problem. For this reason, simulated data sets were used to show that the mathematical expressions developed to estimate the probability of committing a β error are correct.
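The role of the β error in an individual test on a regression coefficient can be illustrated with a minimal sketch. It is not the expression developed in the thesis: it assumes a two-sided test, a known standard error for the coefficient, and a normal approximation to the test statistic. The function name `beta_probability` and its parameters are hypothetical.

```python
from statistics import NormalDist


def beta_probability(bias, std_error, alpha=0.05):
    """Approximate probability of a type II (beta) error.

    Two-sided test of H0: coefficient == reference value, when the true
    coefficient actually differs from the reference by `bias`.
    Assumption: the test statistic is treated as normally distributed
    with a known standard error (a simplification for illustration).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for level alpha
    shift = bias / std_error            # bias expressed in standard errors
    # Probability that the statistic still falls inside the acceptance region
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


# With no bias, the acceptance probability equals 1 - alpha.
print(beta_probability(0.0, 0.02))
# A larger bias (relative to the standard error) gives a smaller beta.
print(beta_probability(0.05, 0.02), beta_probability(0.10, 0.02))
```

Under these assumptions β depends only on the ratio between the bias to be detected and the standard error of the coefficient, which is why a poorly estimated standard error translates directly into a poorly estimated β.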

It is also worth stressing the importance of estimating the number of samples needed to build the regression line with the BLS regression method, so that the risk of committing α and β errors is controlled when an individual test is used to detect a given bias in one of the regression coefficients. Because the calculation is iterative, it can be somewhat complicated in some cases. Even so, its use is strongly recommended whenever the consequences of committing α and/or β errors are especially serious for the analytical problem in question.
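The iterative idea behind the sample-size estimate can be sketched as follows. This is only an illustration under loudly stated assumptions, not the BLS procedure of the thesis: the slope standard error is taken from ordinary least squares with equally spaced concentrations, the residual standard deviation `s` is assumed known, and the test statistic is approximated as normal. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def slope_std_error(n, s, x_min, x_max):
    """OLS standard error of the slope for n equally spaced x values.

    Assumption: OLS stands in for BLS here, with known residual
    standard deviation s.
    """
    step = (x_max - x_min) / (n - 1)
    xs = [x_min + i * step for i in range(n)]
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return s / sxx ** 0.5


def beta_probability(bias, std_error, alpha=0.05):
    """Normal-approximation beta error of a two-sided individual test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = bias / std_error
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


def samples_needed(bias, s, x_min, x_max, alpha=0.05, beta_max=0.10,
                   n_max=1000):
    """Smallest n (from 3 up to n_max) whose beta error is <= beta_max."""
    for n in range(3, n_max + 1):
        se = slope_std_error(n, s, x_min, x_max)
        if beta_probability(bias, se, alpha) <= beta_max:
            return n
    return None  # target not reachable within n_max samples


n = samples_needed(bias=0.03, s=0.1, x_min=0.0, x_max=10.0)
print(n)
```

The loop makes the trade-off explicit: for fixed α, the number of samples is increased until the β error for the bias of interest falls below the accepted maximum.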

♦ Study of the detection of significant bias in the results of analytical methods capable of analysing several analytes at once, by means of linear regression.

In this study, simulated data sets showed that significant differences between the results of the two methods being compared are detected correctly when the results for all the analytes are considered together. In other words, when the joint test is applied to BLS regression coefficients estimated from the individual data sets (each containing the experimental results of a single analyte), it is very difficult to detect significant differences between the results of the two methods. In these cases there is a high probability of committing a β error and, consequently, of accepting a biased analytical method as correct. To minimise this risk, it must be understood that estimating the BLS regression coefficients from a small number of experimental values overestimates the experimental error s², which, for a given significance level α, produces oversized confidence intervals.

In addition, given how serious the consequences of a β error can be when the joint test is applied to the regression coefficients, mathematical expressions have been developed to estimate the probability of committing this type of error when the regression method considers the uncertainties due to experimental errors on both axes (BLS). Simulated data sets also confirmed that a good estimate of the experimental error s² allows the probability of committing a β error to be estimated correctly with the expressions developed.

♦ Development and validation of a technique for comparing the results of multiple analytical methods that takes into account the uncertainties of the analytical results.

The fifth chapter presented a procedure for simultaneously comparing the results of more than two analytical methods at several concentration levels, taking into account the uncertainties generated in the analysis of samples of different concentrations. Simulated data sets showed that this procedure for comparing multiple methods correctly detects the analytical methods that give biased results. The procedure can also correctly detect analytical methods whose results can be regarded as discrepant (outliers) with respect to the rest. The procedure has also been applied to real data sets. This made it possible to verify that both the identification of analytical methods that can be regarded as outliers and the detection of […] comparison are consistent with the observed experimental data.

♦ Study of the improvement of prediction ability in multivariate calibration methods by means of a multivariate regression technique that considers the uncertainties in all the experimental values.

A multivariate regression technique (MLS) has been developed that estimates the coefficients of the regression hyperplane taking into account the uncertainties associated both with the concentrations and with the scores generated by decomposing the spectral data. The MLS regression method is easy to apply to the regression step of two important multivariate calibration techniques, PCR and MLPCR. With MLS, the true prediction errors of both multivariate calibration techniques improved appreciably with respect to those obtained with the classical regression method, MLR.

Moreover, applying the MLS regression method to MLPCR gives true prediction errors very similar to those obtained with MLLRR. This latter maximum likelihood multivariate calibration technique also considers, during the regression step, the uncertainties due to the errors made in measuring the concentrations. However, both the interpretation of the parameters of the multivariate model obtained with MLLRR and its use in general are considerably more complex than with the MLPCR method.

Another point to bear firmly in mind is that, although the MLS multivariate regression method gives better true prediction errors, this improvement can only be observed when the true values of the properties studied are known, that is, in simulated data sets. In real data sets, the concentrations measured by the reference method contain experimental errors; therefore, the concentrations predicted with MLS need not be closer to the reference values than the concentrations predicted with MLR. Consequently, it is not unusual for the MLS regression technique to give a worse prediction error than MLR for a given validation set, even though the results obtained with MLS are very likely to be closer to the true values.
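The argument above can be made concrete with a toy calculation. The numbers below are invented purely for illustration and are not results from the thesis; `pred_a` stands for a hypothetical model whose predictions are close to the true concentrations, and `pred_b` for a hypothetical model whose predictions happen to follow the errors of the reference method.

```python
def rmsep(predicted, target):
    """Root mean square error of prediction against a set of target values."""
    n = len(predicted)
    return (sum((p - t) ** 2 for p, t in zip(predicted, target)) / n) ** 0.5


# Illustrative values only (assumed, not measured).
true_conc = [1.0, 2.0, 3.0, 4.0]
reference_errors = [0.2, -0.2, 0.2, -0.2]   # errors of the reference method
reference = [t + e for t, e in zip(true_conc, reference_errors)]

pred_a = [1.05, 1.95, 3.05, 3.95]   # close to the true concentrations
pred_b = [1.15, 1.85, 3.15, 3.85]   # close to the erroneous reference values

# Against the true values, model A is better ...
print(rmsep(pred_a, true_conc), rmsep(pred_b, true_conc))
# ... but against the reference values, which is all that is available
# for real data, model B appears better.
print(rmsep(pred_a, reference), rmsep(pred_b, reference))
```

The ranking of the two models reverses depending on whether the error is computed against the true or the reference concentrations, which is the situation described for MLS and MLR with real validation sets.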



♦ Generation of computer algorithms to facilitate the practical application of the tests developed.

All the calculations for this doctoral thesis were carried out with subroutines programmed in MATLAB version 4.0 (MATLAB for Microsoft Windows, The MathWorks Inc., Natick, MA). This software makes it quick and simple to work with data matrices that can reach fairly large dimensions. Although the algorithms are not reproduced as text for reasons of space, they are available to anyone interested. These algorithms were developed for personal use, so their design is not as polished as that of commercial programs. To obtain the code used in any of the chapters, simply contact the author or co-authors of the corresponding articles.

7.2 Future lines of research

The Chemometrics and Qualimetrics group of the Universitat Rovira i Virgili is currently working on several important topics related to the BLS regression method. Although these topics are still under development, describing them helps to show the direction that future research should take. First, a robust regression technique is being developed that considers the uncertainties due to the random errors made in measuring samples of different concentrations. A second topic is the development of a procedure for detecting outlying points around the BLS regression line. Work is also under way on determining decision, detection and quantification limits for the BLS regression method. Because the group has been very active in recent years in univariate linear calibration considering the uncertainties of the experimental errors, the future lines of research focus mainly on multivariate linear regression.

One interesting aspect to study within multivariate linear regression considering the uncertainties of the experimental errors on all axes is the detection of outlying points around the regression hyperplane. This work is especially important because the MLS regression method, like the BLS method, is not very robust in the presence of experimental points with very low uncertainties (high-precision points). Another topic for future research related to the MLS multivariate regression method is the calculation of decision, detection and quantification limits, analogously to the studies carried out for BLS. A further possibility in this field is to improve the algorithm that estimates the coefficients of the MLS regression hyperplane. Although this algorithm converges quickly in most of the cases studied, in a few isolated cases the convergence time is too long and convergence can be problematic.
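The lack of robustness towards high-precision points mentioned in this paragraph can be illustrated with a short sketch. It assumes that each squared residual is weighted by the reciprocal of s_y² + b1²·s_x², with the covariance between the x and y errors omitted; this weighting form is an assumption stated here for illustration, and the function names and numerical values are hypothetical.

```python
def bls_weights(s_x, s_y, slope):
    """Reciprocal-variance weights 1 / (s_y^2 + slope^2 * s_x^2).

    Assumption: the covariance between the errors in x and y is zero.
    """
    return [1.0 / (sy ** 2 + slope ** 2 * sx ** 2)
            for sx, sy in zip(s_x, s_y)]


def weight_shares(s_x, s_y, slope):
    """Fraction of the total weight carried by each experimental point."""
    weights = bls_weights(s_x, s_y, slope)
    total = sum(weights)
    return [w / total for w in weights]


# Four points with standard deviation 0.10 on both axes and one
# high-precision point with standard deviation 0.01 (illustrative values).
s_x = [0.10, 0.10, 0.10, 0.10, 0.01]
s_y = [0.10, 0.10, 0.10, 0.10, 0.01]

shares = weight_shares(s_x, s_y, slope=1.0)
print(shares)  # the last point carries most of the total weight
```

With these illustrative values a single high-precision point carries more than 90 % of the total weight, so any error in that point that is not reflected in its stated uncertainty pulls the fitted line strongly towards it.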

Finally, a further research topic is the development of a regression technique for estimating the regression coefficients when the experimental measurements follow a curve, which, analogously to the linear regression methods BLS and MLS, would consider the uncertainties due to the errors made in measuring samples of different concentrations. This type of regression is fairly widely used in some areas of analytical chemistry, for example in the radiocarbon dating of archaeological materials by liquid scintillation measurements, where the relationship between concentration and response is usually fitted to a third-degree polynomial.
