Covariance matrix

Definition. Let $(\Omega ,\,\Sigma ,\,P)$ be a probability space, and let $X=\{x_{i}\}_{i=1}^{m}$ and $Y=\{y_{j}\}_{j=1}^{n}$ be two sequences of real-valued random variables defined on $\Omega$.

Suppose their expected values are, respectively:

$$E(x_{i})=\int _{\Omega }x_{i}\,dP=\mu _{i}$$

$$E(y_{j})=\int _{\Omega }y_{j}\,dP=\nu _{j}$$

Then the covariance matrix between these two sequences of random variables is defined as:

$$\operatorname{\mathbf{cov}}(X,Y):={\left[\,\operatorname{cov}(x_{i},y_{j})\,\right]}_{m\times n}={\bigg[}\,\operatorname{E}[(x_{i}-\mu _{i})(y_{j}-\nu _{j})]\,{\bigg]}_{m\times n}$$

Written out in full as a rectangular array, this is:

$$
\begin{aligned}
\operatorname{\mathbf{cov}}(X,Y)
&={\begin{bmatrix}
\operatorname{cov}(x_{1},y_{1})&\operatorname{cov}(x_{1},y_{2})&\cdots &\operatorname{cov}(x_{1},y_{n})\\
\operatorname{cov}(x_{2},y_{1})&\operatorname{cov}(x_{2},y_{2})&\cdots &\operatorname{cov}(x_{2},y_{n})\\
\vdots &\vdots &\ddots &\vdots \\
\operatorname{cov}(x_{m},y_{1})&\operatorname{cov}(x_{m},y_{2})&\cdots &\operatorname{cov}(x_{m},y_{n})
\end{bmatrix}}\\
&={\begin{bmatrix}
\mathrm{E}[(x_{1}-\mu _{1})(y_{1}-\nu _{1})]&\mathrm{E}[(x_{1}-\mu _{1})(y_{2}-\nu _{2})]&\cdots &\mathrm{E}[(x_{1}-\mu _{1})(y_{n}-\nu _{n})]\\
\mathrm{E}[(x_{2}-\mu _{2})(y_{1}-\nu _{1})]&\mathrm{E}[(x_{2}-\mu _{2})(y_{2}-\nu _{2})]&\cdots &\mathrm{E}[(x_{2}-\mu _{2})(y_{n}-\nu _{n})]\\
\vdots &\vdots &\ddots &\vdots \\
\mathrm{E}[(x_{m}-\mu _{m})(y_{1}-\nu _{1})]&\mathrm{E}[(x_{m}-\mu _{m})(y_{2}-\nu _{2})]&\cdots &\mathrm{E}[(x_{m}-\mu _{m})(y_{n}-\nu _{n})]
\end{bmatrix}}
\end{aligned}
$$
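The entrywise definition can be illustrated on a small finite probability space, where the integral over $\Omega$ reduces to a probability-weighted sum. The following is only a sketch: the three-outcome space, the probabilities `P`, the sample values in `X` and `Y`, and the helper names `expect` and `cross_cov` are all assumptions made for illustration and do not come from the text.

```python
from fractions import Fraction

# Hypothetical finite probability space with three outcomes (illustrative only).
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# Each random variable is stored as its list of values on the three outcomes.
X = [[0, 1, 2],      # x_1
     [1, 1, 5]]      # x_2   (m = 2)
Y = [[2, 0, 0],      # y_1
     [0, 4, 4],      # y_2
     [1, 2, 3]]      # y_3   (n = 3)

def expect(values):
    """E(v): on a finite space the integral is a probability-weighted sum."""
    return sum(p * v for p, v in zip(P, values))

def cross_cov(X, Y):
    """Return the m x n matrix with entries E[(x_i - mu_i)(y_j - nu_j)]."""
    mu = [expect(x) for x in X]
    nu = [expect(y) for y in Y]
    return [[expect([(a - mu[i]) * (b - nu[j]) for a, b in zip(x, y)])
             for j, y in enumerate(Y)]
            for i, x in enumerate(X)]

C = cross_cov(X, Y)   # 2 rows (one per x_i), 3 columns (one per y_j)
```

Exact rational arithmetic via `fractions.Fraction` is used here so that the entries can be compared without rounding error; floating-point numbers would work equally well for a rough check.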

By the linearity of the integral with respect to a measure, the covariance matrix can be further simplified to:

$$\operatorname{\mathbf{cov}}(X,Y)={\left[\,\operatorname{E}(x_{i}y_{j})-\mu _{i}\nu _{j}\,\right]}_{m\times n}$$
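The simplification can be checked entry by entry. The derivation below is a sketch that assumes every expectation involved is finite, and uses only linearity of expectation together with $\mu_i = \operatorname{E}(x_i)$ and $\nu_j = \operatorname{E}(y_j)$:

$$
\begin{aligned}
\operatorname{E}[(x_{i}-\mu _{i})(y_{j}-\nu _{j})]
&=\operatorname{E}[x_{i}y_{j}-\mu _{i}y_{j}-\nu _{j}x_{i}+\mu _{i}\nu _{j}]\\
&=\operatorname{E}(x_{i}y_{j})-\mu _{i}\operatorname{E}(y_{j})-\nu _{j}\operatorname{E}(x_{i})+\mu _{i}\nu _{j}\\
&=\operatorname{E}(x_{i}y_{j})-\mu _{i}\nu _{j}-\mu _{i}\nu _{j}+\mu _{i}\nu _{j}\\
&=\operatorname{E}(x_{i}y_{j})-\mu _{i}\nu _{j}
\end{aligned}
$$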

Matrix notation


The sequences of random variables $X$ and $Y$ in the definition above can also be written as the column vectors $\mathbf{X}:={\left[x_{i}\right]}_{m}$ and $\mathbf{Y}:={\left[y_{j}\right]}_{n}$, that is:

$$\mathbf{X}:={\begin{bmatrix}x_{1}\\x_{2}\\\vdots \\x_{m}\end{bmatrix}}$$

$$\mathbf{Y}:={\begin{bmatrix}y_{1}\\y_{2}\\\vdots \\y_{n}\end{bmatrix}}$$

With this notation, for a matrix $\mathbf{A}={\left[\,a_{ij}\,\right]}_{m\times n}$ made up of $m\times n$ random variables $a_{ij}$ defined on $\Omega$, define:

$$\mathrm{E}[\mathbf{A}]:={\left[\,\operatorname{E}(a_{ij})\,\right]}_{m\times n}$$

That is:

$$\mathrm{E}[\mathbf{A}]:={\begin{bmatrix}
\operatorname{E}(a_{11})&\operatorname{E}(a_{12})&\cdots &\operatorname{E}(a_{1n})\\
\operatorname{E}(a_{21})&\operatorname{E}(a_{22})&\cdots &\operatorname{E}(a_{2n})\\
\vdots &\vdots &\ddots &\vdots \\
\operatorname{E}(a_{m1})&\operatorname{E}(a_{m2})&\cdots &\operatorname{E}(a_{mn})
\end{bmatrix}}$$

The covariance matrix defined in the previous subsection can then be written as:

$$\operatorname{\mathbf{cov}}(X,Y)=\mathrm{E}\left[\left(\mathbf{X}-\mathrm{E}[\mathbf{X}]\right)\left(\mathbf{Y}-\mathrm{E}[\mathbf{Y}]\right)^{\rm{T}}\right]$$

The covariance matrix can therefore also be defined directly in terms of $\mathbf{X}$ and $\mathbf{Y}$:

$$\operatorname{\mathbf{cov}}(\mathbf{X},\mathbf{Y}):=\mathrm{E}\left[\left(\mathbf{X}-\mathrm{E}[\mathbf{X}]\right)\left(\mathbf{Y}-\mathrm{E}[\mathbf{Y}]\right)^{\rm{T}}\right]$$
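The vector form can be compared with the entrywise definition on a small example. This is a minimal sketch rather than a reference implementation: the finite probability space, the values in `P`, `X`, `Y`, and the helper names `expect`, `expect_matrix`, `centered`, and `outer` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

# Hypothetical three-outcome probability space (illustrative values only).
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
X = [[0, 1, 2], [1, 1, 5]]              # components x_1, x_2 of the vector X
Y = [[2, 0, 0], [0, 4, 4], [1, 2, 3]]   # components y_1, y_2, y_3 of the vector Y

def expect(values):
    """E(v) as a probability-weighted sum over the outcomes."""
    return sum(p * v for p, v in zip(P, values))

def expect_matrix(A):
    """E[A] for a matrix of random variables: apply E to every entry."""
    return [[expect(a_ij) for a_ij in row] for row in A]

def centered(V):
    """V - E[V]: subtract each component's mean on every outcome."""
    result = []
    for component in V:
        mean = expect(component)
        result.append([v - mean for v in component])
    return result

def outer(U, V):
    """U V^T as a matrix of random variables: entry (i, j) is u_i * v_j."""
    return [[[a * b for a, b in zip(u, v)] for v in V] for u in U]

# Vector form: E[(X - E[X]) (Y - E[Y])^T]
cov_vector_form = expect_matrix(outer(centered(X), centered(Y)))

# Entrywise form: [E[(x_i - mu_i)(y_j - nu_j)]] for comparison
mu = [expect(x) for x in X]
nu = [expect(y) for y in Y]
cov_entrywise = [[expect([(a - mu[i]) * (b - nu[j]) for a, b in zip(x, y)])
                  for j, y in enumerate(Y)]
                 for i, x in enumerate(X)]
```

Both constructions produce the same 2 × 3 matrix for this data, as expected, since entry $(i, j)$ of the outer product is exactly $(x_i - \mu_i)(y_j - \nu_j)$.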

Differences in terminology and notation


Some authors call the following matrix $\mathbf{\Sigma}_{X}$ the covariance matrix:

$${\begin{aligned}\mathbf{\Sigma}_{X}&:={\left[\operatorname{cov}(x_{i},x_{j})\right]}_{m\times m}\\&=\operatorname{\mathbf{cov}}(X,X)\end{aligned}}$$

This page, however, follows William Feller in calling $\mathbf{\Sigma}_{X}$ the variance of $X$ (variance of a random vector), to distinguish it from $\operatorname{\mathbf{cov}}(X,Y)$. The reason is that:

$$\operatorname{cov}(x_{i},x_{i})=\operatorname{E}[{(x_{i}-\mu _{i})}^{2}]=\operatorname{var}(x_{i})$$

In other words, the diagonal of $\mathbf{\Sigma}_{X}$ consists of the variances of the random variables $x_{i}$. For this reason, some authors also call $\operatorname{\mathbf{cov}}(X,Y)$ the variance–covariance matrix.
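The statement about the diagonal of $\mathbf{\Sigma}_{X}$ can be checked on a toy example. The sketch below assumes a hypothetical three-outcome probability space; the data in `P` and `X` and the helper names `expect`, `cov`, and `var` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

# Hypothetical three-outcome probability space (illustrative values only).
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
X = [[0, 1, 2], [1, 1, 5]]   # x_1 and x_2, listed by their value on each outcome

def expect(values):
    """E(v) as a probability-weighted sum over the outcomes."""
    return sum(p * v for p, v in zip(P, values))

def cov(u, v):
    """cov(u, v) = E[(u - E u)(v - E v)] for two scalar random variables."""
    mu_u, mu_v = expect(u), expect(v)
    return expect([(a - mu_u) * (b - mu_v) for a, b in zip(u, v)])

def var(u):
    """var(u) = E[(u - E u)^2]."""
    mu_u = expect(u)
    return expect([(a - mu_u) ** 2 for a in u])

# Sigma_X := [cov(x_i, x_j)]_{m x m} = cov(X, X)
sigma_X = [[cov(xi, xj) for xj in X] for xi in X]

diagonal = [sigma_X[i][i] for i in range(len(X))]
variances = [var(x) for x in X]
```

Here `diagonal` and `variances` coincide, and `sigma_X` is symmetric because `cov(u, v)` does not depend on the order of its arguments.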

Others, because of the close relationship between variance and dispersion, loosely refer to $\operatorname{\mathbf{cov}}(X,Y)$ as the dispersion matrix.
