http://scikit-learn.org/stable/modules/clustering.html#k-means

http://my.oschina.net/u/175377/blog/84420

K-Means clustering parameter reference:

http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans

class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1)

n_clusters : int, optional, default: 8

The number of clusters to form, which is also the number of centroids to generate.
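A minimal sketch of how n_clusters shapes the result, assuming scikit-learn and NumPy are installed. The toy data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: three well-separated 2-D blobs (invented for this example).
rng = np.random.RandomState(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in blob_centers])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# One centroid per requested cluster, one label per sample.
print(km.cluster_centers_.shape)  # (3, 2)
print(np.unique(km.labels_))
```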

max_iter : int, default: 300

Maximum number of iterations of the k-means algorithm for a single run.
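As a rough illustration, the fitted estimator exposes n_iter_, which should never exceed max_iter. The data below is random and purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unstructured random data, so convergence may take several iterations.
rng = np.random.RandomState(1)
X = rng.uniform(size=(500, 2))

max_iter = 5
km = KMeans(n_clusters=6, max_iter=max_iter, n_init=1, random_state=0).fit(X)

# n_iter_ is the number of iterations actually run; it is capped by max_iter.
print(km.n_iter_)
```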

n_init : int, default: 10

Number of times the k-means algorithm will be run with different centroid seeds. The final result is the best output of the n_init consecutive runs in terms of inertia.
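The "best of n_init runs" idea can be imitated by hand: run several single-initialization fits with different seeds and keep the one with the lowest inertia. This is only a sketch of the concept, not how scikit-learn implements it internally. The seeds and data are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(2)
X = rng.uniform(size=(300, 2))

# Five independent runs, each with a single random initialization.
runs = [
    KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    for seed in range(5)
]
inertias = [run.inertia_ for run in runs]

# Keep the run with the lowest inertia, which is what n_init automates.
best_run = min(runs, key=lambda run: run.inertia_)
print(inertias)
print(best_run.inertia_)
```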

init : {‘k-means++’, ‘random’ or an ndarray}

Method for initialization, defaults to ‘k-means++’:

‘k-means++’: selects initial cluster centers for k-means clustering in a smart way to speed up convergence. See the Notes section in k_init for more details.

‘random’: choose k observations (rows) at random from the data as the initial centroids.

If an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.
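A small sketch of passing explicit starting centers as an ndarray. The data and the initial_centers values are invented for illustration. n_init is set to 1 because repeating a run from a fixed starting point would give the same result each time.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two tight blobs around (0, 0) and (5, 5).
rng = np.random.RandomState(3)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

# Shape must be (n_clusters, n_features): here (2, 2).
initial_centers = np.array([[0.0, 0.0], [5.0, 5.0]])

km = KMeans(n_clusters=2, init=initial_centers, n_init=1).fit(X)
print(km.cluster_centers_)
```

Cluster index i starts from row i of the array, so the final centers stay in the same order as the initial ones.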

precompute_distances : {‘auto’, True, False}

Precompute distances (faster but takes more memory).

‘auto’: do not precompute distances if n_samples * n_clusters > 12 million. This corresponds to about 100 MB of overhead per job using double precision.

True : always precompute distances

False : never precompute distances
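The 100 MB figure can be checked with quick arithmetic, assuming the overhead is one 8-byte double per (sample, cluster) pair. That interpretation and the helper names below are assumptions for illustration, not scikit-learn code.

```python
BYTES_PER_DOUBLE = 8
AUTO_THRESHOLD = 12_000_000  # n_samples * n_clusters limit quoted in the docs


def distance_matrix_megabytes(n_samples, n_clusters):
    """Memory for an n_samples x n_clusters matrix of doubles, in MB (10**6 bytes)."""
    return n_samples * n_clusters * BYTES_PER_DOUBLE / 1e6


def auto_would_precompute(n_samples, n_clusters):
    """Hypothetical helper mirroring the documented 'auto' rule."""
    return n_samples * n_clusters <= AUTO_THRESHOLD


# Right at the threshold: 12 million entries * 8 bytes = 96 MB, i.e. "about 100 MB".
print(distance_matrix_megabytes(1_500_000, 8))  # 96.0
print(auto_would_precompute(1_500_000, 8))      # True
print(auto_would_precompute(2_000_000, 8))      # False
```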

tol : float, default: 1e-4

Relative tolerance with regard to inertia to declare convergence.

n_jobs : int

The number of jobs to use for the computation. This works by computing each of the n_init runs in parallel.

If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
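The rule for negative n_jobs values can be written out as a tiny function. effective_workers is a hypothetical helper for illustration only, and flooring the result at 1 is an assumption not stated in the text above.

```python
def effective_workers(n_jobs, n_cpus):
    """Number of workers implied by n_jobs on a machine with n_cpus CPUs."""
    if n_jobs < 0:
        # Documented formula: n_cpus + 1 + n_jobs (floored at 1 by assumption).
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are taken literally.
    return n_jobs


print(effective_workers(-1, 8))  # 8: all CPUs
print(effective_workers(-2, 8))  # 7: all CPUs but one
print(effective_workers(1, 8))   # 1: no parallelism
```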

random_state : integer or numpy.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
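A short sketch of why fixing random_state matters: two fits with the same integer seed on the same data should produce the same clustering. The data and seed value are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(4)
X = rng.uniform(size=(200, 2))

# Same seed, same data: the initial centers are drawn identically.
a = KMeans(n_clusters=3, n_init=5, random_state=42).fit(X)
b = KMeans(n_clusters=3, n_init=5, random_state=42).fit(X)

print(np.allclose(a.cluster_centers_, b.cluster_centers_))
print(np.array_equal(a.labels_, b.labels_))
```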

verbose : int, default 0

Verbosity mode.

copy_x : boolean, default True

When pre-computing distances it is more numerically accurate to center the data first. If copy_x is True, then the original data is not modified. If False, the original data is modified, and put back before the function returns, but small numerical differences may be introduced by subtracting and then adding the data mean.

cluster_centers_ : array, [n_clusters, n_features]

Coordinates of cluster centers

labels_ : array, [n_samples]

Labels of each point

inertia_ : float

Sum of squared distances of samples to their closest cluster center.
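To make the three fitted attributes concrete, the sketch below recomputes the inertia by hand from cluster_centers_ and labels_ as a sum of squared Euclidean distances, then compares it with inertia_. The toy data is made up.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs so the fit converges cleanly.
rng = np.random.RandomState(5)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(60, 2)),
    rng.normal(loc=8.0, scale=0.5, size=(60, 2)),
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# For each sample, look up the center it was assigned to.
assigned_centers = km.cluster_centers_[km.labels_]

# Sum of squared distances from each sample to its assigned center.
manual_inertia = np.sum((X - assigned_centers) ** 2)

print(manual_inertia, km.inertia_)
```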

 
