9 Introduction to Artificial Intelligence (7).pdf


Artificial Intelligence : Learning : Tasks

Contents: 10. Tasks in Machine Learning
10.1. Classification
10.2. Regression
10.3. Clustering
10.4. Ranking
10.5. Dimensionality Reduction

Contents: 10.3. Clustering
10.3.1. How Clustering Works
10.3.2. Major Approaches of Clustering
10.3.3. Applications and Algorithms

10.3.1. How Clustering Works

What is Clustering

A longer description: Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.

A shorter description: The process of organizing objects into groups whose members are similar in some way.

A very short description: To group data objects.

Clustering vs. Classification

Similarity: both produce groups or classes.
Difference: as shown in the following table.

Clustering | Classification
To identify similar groups for input objects. | To assign pre-defined classes to input items.
Without training data. | With training data.
Clusters are discovered based on distances, density, etc. | Classifiers need to have a high classification accuracy.

Grouping Input Data into the Same Cluster

[Figure: cluster analysis. Input: unlabeled data items x_1, x_2, x_3, x_4 whose clusters are unseen. A clustering algorithm groups them. Output: identified clusters, e.g., Cluster 1 = {x_1, x_3}, ..., Cluster n = {x_2, x_4}.]

Two Key Steps in the Clustering Procedure

[Figure: the same input and output as above, with the two steps marked: data -> clustering algorithm -> clusters -> cluster validation.]

A Formal Description of Clustering

Let $\mathbb{R}^n$ ($n \ge 1$) denote the set of n-dimensional real-valued vectors, let the input space $\mathcal{X}$ be a subset of $\mathbb{R}^n$, let the output space $\mathcal{Y}$ be a set of unknown clusters, and let $D$ be an unknown distribution over $\mathcal{X} \times \mathcal{Y}$. Then:

Let the clustering function be
$$h : \mathcal{X} \to \mathcal{Y}, \quad h \in H.$$

Clustering: given a testing set of unknown clusters,
$$X = \{\, x^{(i)} \mid x^{(i)} \in \mathcal{X},\ i \in [1, m] \,\},$$
use the clustering function determined above to analyze the clustering results:
$$Y = h(X) = \{\, y^{(i)} \mid y^{(i)} \in \mathcal{Y},\ i \in [1, n],\ h(x) = y \,\}.$$

10.3.2. Major Approaches of Clustering

Typical Approaches of Clustering Algorithms

1) Connectivity-based clustering
Also known as hierarchical clustering; based on the distance between objects.
2) Centroid-based clustering
Finds k cluster centers and assigns each object to the nearest cluster center.
3) Distribution-based clustering
Clusters are defined as the objects most likely to belong to the same distribution.
4) Density-based clustering
Groups objects into one cluster if they are connected by a densely populated area.
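The centroid-based approach in item 2 of the list above (find k centers, assign each object to the nearest one) can be sketched as follows. This is a minimal k-means-style illustration in plain Python, not the slides' own implementation; the function names `squared_distance`, `assign`, and `k_means`, the random initialization, and the stopping rule are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Assignment step: index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
        for p in points
    ]


def k_means(points, k, max_iterations=100, seed=0):
    """Hypothetical helper: alternate assignment and center updates.

    Initial centers are k points drawn at random (an assumption; other
    initializations are possible). Stops when the centers no longer move.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(max_iterations):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # Update step: move the center to the mean of its members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)
```

Calling `k_means(points, k=2)` on two well-separated groups of 2-D tuples should place one center in each group; which group receives label 0 depends on the random initialization.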

1) Connectivity-based clustering
Based on the core idea that objects are more related to nearby objects than to objects farther away.
Creates a hierarchical decomposition of the set of data objects using some criterion.
Typical algorithms include AGNES (Agglomerative NESting) and DIANA (DIvisive ANAlysis).

2) Centroid-based clustering
Constructs various partitions and then evaluates them by some criterion, e.g., minimizing the sum of squared distances.
Typical algorithms include k-means and k-medoids.

3) Distribution-based clustering
Clusters are modeled using statistical distributions, such as multivariate normal distributions.
Typical algorithms include expectation-maximization (EM).

4) Density-based clustering
Clusters are defined as areas of higher density than the remainder of the data set.
Typical algorithms include DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
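To make the density-based idea concrete, here is a minimal DBSCAN-style sketch in plain Python. It is an illustration under stated assumptions rather than a reference implementation: the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighborhood size, counting the point itself) and the helper `region_query` are chosen for this example, and the brute-force neighbor search is only suitable for small inputs.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] lies in a dense area: start a new cluster and grow it.
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # keep expanding through dense points
        cluster += 1
    return labels


squares = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
print(dbscan(squares, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (50, 50) has no dense neighborhood, so it is reported as noise instead of being forced into a cluster.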

Case Study: Clustering by density peaks

Source: "Clustering by fast search and find of density peaks", Science, Vol. 344, June 27, 2014.

Cluster centers are characterized by 1) a higher density than their neighbors, and 2) a relatively large distance from points with higher densities.

The features of this clustering method are:
the number of clusters arises intuitively,
outliers are automatically spotted and excluded,
clusters are recognized regardless of their shape and of the dimensionality of the space.

[Figure: data (28 points) numbered in order of decreasing density, and the decision graph computed from the local density and the distance of each point. Axes: density, distance.]

Local density:
$$\rho_i = \sum_{j} \chi(d_{ij} - d_c), \qquad \chi(x) = \begin{cases} 1, & x < 0 \\ 0, & \text{otherwise} \end{cases}$$

Minimum distance:
$$\delta_i = \min_{j:\ \rho_j > \rho_i} d_{ij}$$

Highest density: for the point with the highest density, $\delta_i = \max_j d_{ij}$.

where $d_{ij}$ are the distances between data points and $d_c$ is the cutoff distance.

Clustering analysis of the Olivetti Face Database

[Figure: pictorial representation of the cluster assignments for the first 100 images. Faces with the same color belong to the same cluster, whereas gray images are not assigned to any cluster. Cluster centers are labeled with white circles.]
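The two quantities behind the decision graph in the case study above can be computed with a few lines of Python. The sketch below assumes, following the cited paper, that the local density of a point is the number of other points closer than the cutoff distance d_c, and that its distance is the minimum distance to any point of higher density (or the maximum distance to any point, for the densest point). The function name `density_peaks` and the sample data are made up for illustration.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta): local density and minimum distance per point."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff distance.
    rho = [
        sum(1 for j in range(n) if j != i and d[i][j] < d_c)
        for i in range(n)
    ]

    # Minimum distance to a point of higher density. A point with no denser
    # point (the density peak) gets its maximum distance instead. If several
    # points tie for the highest density, all of them take this branch.
    delta = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(d[i]))
    return rho, delta


# Two groups on a line: one around 1.0 and a denser one around 11.0.
sample = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,), (11.5,)]
rho, delta = density_peaks(sample, d_c=1.2)
print(rho)  # [1, 2, 1, 1, 3, 2, 2]
```

In this sample the points at 1.0 and 11.0 combine a comparatively high density with a large distance, which is the signature the decision graph uses to pick cluster centers.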

10.3.3. Applications and Algorithms

Typical Applications of Clustering

Medicine: medical imaging
Business and marketing: grouping of customers, grouping of shopping items
World wide web: social network analysis, search result grouping
Computer science: image segmentation, recommender systems

Typical Algorithms of Clustering

k-means, k-modes, PAM, CLARA, FCM, BIRCH, CURE, ROCK, Chameleon, Echidna, DBSCAN, DBCLASD, OPTICS, DENCLUE, Wave-Cluster, CLIQUE, STING, OptiGrid, EM, CLASSIT, COBWEB, SOMs
