This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision Next revision Both sides next revision | ||
keynote:lesson04 [2010/03/31 21:11] hongxin |
keynote:lesson04 [2010/04/02 21:10] 10921020 |
||
---|---|---|---|
Line 2: | Line 2: | ||
====== 4.1 聚类 ====== | ====== 4.1 聚类 ====== | ||
===== 4.1.1 聚类的简介 ===== | ===== 4.1.1 聚类的简介 ===== | ||
- | **1.定义** | + | ====1.定义==== |
聚类是将数据点集分成由类似的点组成的多个类的过程,它是一种非监督的学习过程。\\ | 聚类是将数据点集分成由类似的点组成的多个类的过程,它是一种非监督的学习过程。\\ | ||
{{:keynote:11.jpg}} | {{:keynote:11.jpg}} | ||
- | **2.聚类与分类的区别** | + | ====2.聚类与分类的区别==== |
* **聚类** | * **聚类** | ||
* 实例: $\{ {\mathbf{x}}\}_{i=1}^N $ | * 实例: $\{ {\mathbf{x}}\}_{i=1}^N $ | ||
Line 17: | Line 17: | ||
<note tip> 聚类所要求划分的类是未知的。</note> | <note tip> 聚类所要求划分的类是未知的。</note> | ||
| | ||
- | **3.聚类的应用** | + | ====3.聚类的应用==== |
聚类方法可以运用于图像分割领域, 下图就是两个运用Mean-shift方法作图像分割的例子。\\ | 聚类方法可以运用于图像分割领域, 下图就是两个运用Mean-shift方法作图像分割的例子。\\ | ||
{{:keynote:14.jpg}} \\ | {{:keynote:14.jpg}} \\ | ||
Line 23: | Line 23: | ||
===== 4.1.2 几种聚类方法 ===== | ===== 4.1.2 几种聚类方法 ===== | ||
- | **1.层次(从顶到下)聚类** | + | ==== 1.层次(从顶到下)聚类 ==== |
* **__思想__:** 顺序地将最近的两个点/类合并; | * **__思想__:** 顺序地将最近的两个点/类合并; | ||
* **__具体过程__** | * **__具体过程__** | ||
Line 33: | Line 33: | ||
- 单点距离: \[ d_{kl} = min_{x_i \in C_k, x_j \in C_l} d(x_i, d_j)\] | - 单点距离: \[ d_{kl} = min_{x_i \in C_k, x_j \in C_l} d(x_i, d_j)\] | ||
- 平均点距离:\[ d_{kl} = \frac{1}{ |C_k| + |C_l| } \sum_{x_i \in C_k, x_j \in C_l} {d(x_i, d_j)} \] | - 平均点距离:\[ d_{kl} = \frac{1}{ |C_k| + |C_l| } \sum_{x_i \in C_k, x_j \in C_l} {d(x_i, d_j)} \] | ||
- | - 质心距离: \[ d_{kl} = d(c_l, c_k), c_l = \frac{1}{ |C_k| } \sum_{x_i \in C_l} {x_i} \] | + | - 质心距离: \[ d_{kl} = d(c_l, c_k), c_k = \frac{1}{ |C_k| } \sum_{x_i \in C_k} {x_i} \] |
* **__层次聚类的树形图表示__** | * **__层次聚类的树形图表示__** | ||
Line 40: | Line 40: | ||
* 其中,每一对数据点之间的高度为两个类之间的距离。 | * 其中,每一对数据点之间的高度为两个类之间的距离。 | ||
- | **2.频谱聚类** | + | ==== 2.频谱聚类 ==== |
* **__三要素__** | * **__三要素__** | ||
- 最近点邻接图{{:keynote:22.jpg|}}; | - 最近点邻接图{{:keynote:22.jpg|}}; | ||
Line 57: | Line 57: | ||
* **__频谱聚类的实例__** | * **__频谱聚类的实例__** | ||
* {{:keynote:29.jpg|}} | * {{:keynote:29.jpg|}} | ||
- | * **3.Flat聚类方法** | + | |
- | - **K-means聚类** | + | ==== 3.Flat聚类方法 ==== |
+ | === 1.K-means聚类 === | ||
* **__思想__** | * **__思想__** | ||
* {{:keynote:30.jpg|Unordered List Item}} | * {{:keynote:30.jpg|Unordered List Item}} | ||
* **__具体过程__** | * **__具体过程__** | ||
* {{:keynote:31.jpg|Unordered List Item}} | * {{:keynote:31.jpg|Unordered List Item}} | ||
- | - **混合高斯聚类** | + | === 2.混合高斯聚类 === |
<note important> Revised by Zhang Xiaoli(张晓丽), <zhangxiaoli@zjucadcg.cn> </note> | <note important> Revised by Zhang Xiaoli(张晓丽), <zhangxiaoli@zjucadcg.cn> </note> | ||
Line 232: | Line 233: | ||
**Viterbi 算法**(N是状态数目,T是序列长度)\\ | **Viterbi 算法**(N是状态数目,T是序列长度)\\ | ||
- | 初始化$\delta_1(i)=a_{oi}b_i(O_1)$ $\psi_1(i)=0$\\ | + | 初始化: |
- | 递归$\delta_t(j)=max_i[\delta_{i-1}(i)a_{ij}]b_j(O_t) \qquad t=2,…T \qquad j=1…N $ | + | \[ \delta_1(i)=a_{oi}b_i(O_1), \psi_1(i)=0 \] |
- | $\qquad\psi_1(j)=argmax_i[\delta_{i-1}(i)a_{ij}] \qquad t=2,…T \qquad j=1…N $\\ | + | 递归: |
- | 终止$P^*=max_i\delta_T(i)$ ${q_T}^*=argmax_i[\delta_T(i)]$\\ | + | \[\delta_t(j)=max_i[\delta_{i-1}(i)a_{ij}]b_j(O_t) \qquad t=2,…T \qquad j=1…N \] |
- | 回溯求S序列${q_T}^*=\psi_1({q^*}_{t+1}) \qquad t=T-1,T-2,...,1$\\ | + | \[\qquad\psi_1(j)=argmax_i[\delta_{i-1}(i)a_{ij}] \qquad t=2,…T \qquad j=1…N \] |
+ | |||
+ | 终止: | ||
+ | \[ P^*=max_i\delta_T(i)$ ${q_T}^*=argmax_i[\delta_T(i)] \] | ||
+ | 回溯求$S$序列: | ||
+ | \[ {q_T}^*=\psi_1({q^*}_{t+1}) \qquad t=T-1,T-2,...,1 \] | ||
算法示意图如下所示:\\ | 算法示意图如下所示:\\ |