Transcript of: Selecting Representatives and Being Represented in Unsupervised Learning - AP & LLE (Xiangliang Zhang, KAUST)
Selecting Representatives and Being Represented in Unsupervised Learning - AP & LLE
张响亮
Xiangliang Zhang
King Abdullah University of Science and Technology
CNCC, Oct 25, 2018
Hangzhou, China
Outline
• Affinity Propagation (AP) [Frey and Dueck, Science, 2007]
• Locally Linear Embedding (LLE) [Roweis and Saul, Science, 2000]
Affinity Propagation
[Frey and Dueck, Science 2007]
Affinity Propagation
[Frey and Dueck, NIPS 2005]
We describe a new method that, for the first time to our knowledge, combines the advantages of model-based clustering and affinity-based clustering.
Model-based clustering: a mixture model $p(x) = \sum_k \pi_k \, p(x \mid \theta_k)$, where $\pi_k$ is the mixing coefficient and $p(x \mid \theta_k)$ is the $k$-th component.
Clustering: group the similar points together
K-means: minimize
$\varepsilon = \sum_{m=1}^{k} \sum_{x_i \in C_m} (x_i - \mu_m)^2$
where $\mu_m = \frac{1}{|C_m|} \sum_{x_i \in C_m} x_i$

K-medoids / K-medians: minimize
$\varepsilon = \sum_{m=1}^{k} \sum_{x_i \in C_m} (x_i - \mu_m)^2$
where $\mu_m \in \{ x_i \mid x_i \in C_m \}$ (the center is one of the data points)
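A minimal NumPy sketch (not from the slides) contrasting the two center updates; the function names are illustrative and clusters are assumed non-empty.

```python
import numpy as np

def kmeans_update(X, labels, k):
    """K-means: each center is the mean of its cluster (need not be a data point)."""
    return np.array([X[labels == m].mean(axis=0) for m in range(k)])

def kmedoids_update(X, labels, k):
    """K-medoids: each center is the cluster member with the smallest within-cluster cost."""
    centers = []
    for m in range(k):
        members = X[labels == m]
        # total squared distance from each member to all other members of the cluster
        cost = ((members[:, None, :] - members[None, :, :]) ** 2).sum(-1).sum(1)
        centers.append(members[cost.argmin()])
    return np.array(centers)
```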
Inspired by greedy k-medoids
Data to cluster:
The likelihood of a point $x_i$ belonging to the cluster with a given center
The Bayesian prior probability that a point is a cluster center
The responsibility of the $k$-th component for generating $x_i$
Assign $x_i$ to center $s_i$
Choose a new center
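The responsibility takes the standard mixture-model form (the exact likelihood used in the NIPS'05 formulation may differ):

$r_{ik} = \dfrac{\pi_k \, p(x_i \mid \theta_k)}{\sum_{j} \pi_j \, p(x_i \mid \theta_j)}$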
Understanding the process
Message sent from $x_i$ to each center/exemplar: preference to be with each exemplar
Hard decision for cluster centers/exemplars
Introduce “availabilities”: messages sent from exemplars to other points, providing soft evidence of how available each exemplar is to serve as a center for each point
The method presented in NIPS'05
• Responsibilities are computed using likelihoods and availabilities
• Availabilities are computed using responsibilities, recursively
Affinities
Interpretation by Factor Graph
$c_i$ is the index of the exemplar for point $x_i$
Objective function: maximize $S(c_1, \ldots, c_N) = \sum_{i} s(i, c_i)$
Constraints:
Each cluster is non-empty, with a single exemplar
An exemplar must select itself as its own exemplar
Input and Output of AP in Science07
Input: pairwise similarities $s(i,k)$; the self-similarity $s(k,k)$ is the preference (prior) for point $k$ to be an exemplar
Output: a set of exemplars and the assignment of every point to one of them
AP: a message passing algorithm
Iterations of message passing in AP (a sequence of figures showing the messages and the emerging exemplars over successive iterations)
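A minimal NumPy sketch of the two update rules (responsibilities and availabilities, with damping); the function name, damping value, and iteration count are illustrative choices of this sketch, not taken from the slides.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, n_iter=200):
    """S: NxN similarity matrix; the diagonal S[k, k] holds the preference of point k."""
    N = S.shape[0]
    R = np.zeros((N, N))  # responsibilities r(i, k)
    A = np.zeros((N, N))  # availabilities  a(i, k)

    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [ a(i,k') + s(i,k') ]
        AS = A + S
        idx = AS.argmax(axis=1)
        first = AS[np.arange(N), idx]
        AS[np.arange(N), idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[np.arange(N), idx] = S[np.arange(N), idx] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))   for i != k
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())        # keep r(k,k) in the column sums
        col = Rp.sum(axis=0)
        A_new = np.minimum(0, col[None, :] - Rp)
        np.fill_diagonal(A_new, col - R.diagonal())
        A = damping * A + (1 - damping) * A_new

    # points with positive a(k,k) + r(k,k) are chosen as exemplars
    exemplars = np.where(np.diag(A + R) > 0)[0]
    # simplified assignment: each point goes to its most similar exemplar
    labels = S[:, exemplars].argmax(axis=1) if len(exemplars) else None
    return exemplars, labels
```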
Summary AP
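As a usage note (not part of the slides), scikit-learn provides an implementation; a minimal example, where the preference value is an illustrative choice:

```python
from sklearn.cluster import AffinityPropagation
import numpy as np

X = np.random.RandomState(0).randn(60, 2)            # toy data
ap = AffinityPropagation(damping=0.5, preference=-10, random_state=0).fit(X)
print(ap.cluster_centers_indices_)                    # indices of the chosen exemplars
print(ap.labels_)                                     # exemplar assignment for each point
```

Lowering the preference yields fewer exemplars (clusters); raising it yields more.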
Extensive study of AP
Outline
• Affinity Propagation (AP) [Frey and Dueck, Science, 2007]
• Locally Linear Embedding (LLE) [Roweis and Saul, Science, 2000]
Locally Linear Embedding (LLE)
[Roweis and Saul, Science, 2000]
Saul and Roweis. Think globally, fit locally: unsupervised learning of low dimensional manifolds. JMLR 2003
LLE - motivations
Inspired by MDS
Multidimensional Scaling (MDS): find an embedding of objects in a low-dimensional space that preserves pairwise distances
Given: pairwise similarities (distances). To find: the embedding
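A minimal NumPy sketch of classical MDS (not from the slides), assuming a full matrix D of pairwise distances is available:

```python
import numpy as np

def classical_mds(D, d=2):
    """Embed N objects in d dimensions from an NxN matrix of pairwise distances D."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:d]          # d largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0))
```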
Can we eliminate the need to estimate pairwise distances between widely separated data points?
LLE – general idea
• Locally, on a fine enough scale, everything looks linear
• Represent an object as a linear combination of its neighbors
• Assumption: the same linear representation will hold in the low-dimensional space
• Find a low-dimensional embedding which minimizes reconstruction loss
LLE – matrix representation
1. Select K nearest neighbors
2. Reconstruct $x_i$ by its K nearest neighbors
Find W by minimizing
$\varepsilon(W) = \sum_i \left\| x_i - \sum_j w_{ij} x_j \right\|^2$
LLE – matrix representation
3. Need to solve the system Y = WY

Find the embedding vectors $Y_i$ by minimizing:
$\varepsilon(Y) = \sum_i \left\| Y_i - \sum_j w_{ij} Y_j \right\|^2 = \sum_{i} \sum_{j} M_{ij} \, (Y_i \cdot Y_j)$
where $M = (I - W)^T (I - W)$
s.t. $\sum_{i=1}^{N} Y_i = 0$ (centered on the origin)
and $\frac{1}{N} \sum_{i=1}^{N} Y_i Y_i^T = I$ (with unit covariance)
LLE – algorithm summary
1. Find k nearest neighbors in X space
2. Solve for reconstruction weights W
3. Compute embedding coordinates Y using weights W:
• Create a sparse matrix $M = (I - W)^T (I - W)$
• Set Y to be the eigenvectors corresponding to the bottom d non-zero eigenvalues of M
$w = \dfrac{C^{-1} \mathbf{1}}{\mathbf{1}^T C^{-1} \mathbf{1}}$ (so the weights sum to one), where $C_{jk} = (x - \eta_j)^T (x - \eta_k)$ and $\eta_j$ is one of x's K nearest neighbors
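A compact NumPy sketch of the three steps (illustrative; the parameter names and regularization constant are choices of this sketch, and a production version would use sparse matrices):

```python
import numpy as np

def lle(X, K=10, d=2, reg=1e-3):
    """Locally Linear Embedding of the rows of X into d dimensions."""
    N = X.shape[0]
    # 1. K nearest neighbors of each point (excluding the point itself)
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    nbrs = np.argsort(dist, axis=1)[:, 1:K + 1]

    # 2. Reconstruction weights: w = C^{-1} 1 / (1^T C^{-1} 1) for each point
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[nbrs[i]] - X[i]                    # neighbors shifted to the origin
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(K)       # regularize C for numerical stability
        w = np.linalg.solve(C, np.ones(K))
        W[i, nbrs[i]] = w / w.sum()

    # 3. Bottom eigenvectors of M = (I - W)^T (I - W)
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    # discard the constant eigenvector (eigenvalue ~ 0), keep the next d
    return eigvecs[:, 1:d + 1]
```

scikit-learn's sklearn.manifold.LocallyLinearEmbedding offers an equivalent, more robust implementation.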
Continuing, SNE
Allows “many-to-one” mappings, in which a single ambiguous object really belongs in several disparate locations in the low-dimensional space, while LLE makes a one-to-one mapping.
$p_{j|i}$ is the asymmetric probability that point $i$ would pick point $j$ as its neighbor (Gaussian neighborhood in the original space)
$q_{j|i}$ is the induced probability that point $i$ picks point $j$ as its neighbor (Gaussian neighborhood in the low-dimensional space)
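For reference, the standard SNE definitions (Hinton and Roweis; not spelled out on the slide):

$p_{j|i} = \dfrac{\exp(-\|x_i - x_j\|^2 / 2\sigma_i^2)}{\sum_{k \neq i} \exp(-\|x_i - x_k\|^2 / 2\sigma_i^2)}$, $\quad q_{j|i} = \dfrac{\exp(-\|y_i - y_j\|^2)}{\sum_{k \neq i} \exp(-\|y_i - y_k\|^2)}$

with cost $C = \sum_i \mathrm{KL}(P_i \,\|\, Q_i) = \sum_i \sum_j p_{j|i} \log \dfrac{p_{j|i}}{q_{j|i}}$.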
Continuing, t-SNE
• uses a Student-t distribution (heavier tail) rather than a Gaussian to compute the similarity between two points in the low-dimensional space
• symmetrized version of the SNE cost function with simpler gradients
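In the standard t-SNE formulation (van der Maaten and Hinton, 2008; not on the slide), the low-dimensional similarity and the symmetrized input probability are

$q_{ij} = \dfrac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}}$, $\quad p_{ij} = \dfrac{p_{j|i} + p_{i|j}}{2N}$,

and the cost is $\mathrm{KL}(P \,\|\, Q)$.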
LLE - follow-up work
LLE output strongly depends on the selection of k
Jing Chen and Yang Liu. Locally linear embedding: a survey. Artificial Intelligence Review (2011)
[Chen and Liu, 2011]
[Ting and Jordan, 2018]
Thank you for your attention!
Lab of Machine Intelligence and kNowledge Engineering (MINE): http://mine.kaust.edu.sa/