Implementing the K-means Algorithm in Python

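The k-means algorithm takes a parameter k and partitions n input data objects into k clusters, so that objects within the same cluster are highly similar while objects in different clusters are dissimilar. Similarity is measured against a "central object" (the centroid, or center of gravity) obtained as the mean of the objects in each cluster. The procedure randomly picks k initial cluster centers, computes the distance from every point to each center, and assigns each point to its nearest center. It then recomputes each center as the mean of the points assigned to it, and repeats these steps until the clusters stop changing.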
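For reference, the description above corresponds to the standard K-means formulation, sketched below. The symbols x_i (a data point), C_j (the set of points in cluster j) and \mu_j (the centroid of cluster j) are notation introduced here for illustration and do not appear in the original article.

```latex
% Objective: total squared distance from each point to its cluster centroid
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2

% Assignment step: each point joins the cluster of its nearest centroid
C_j = \{\, x_i : \lVert x_i - \mu_j \rVert \le \lVert x_i - \mu_l \rVert \ \text{for all } l = 1, \dots, k \,\}

% Update step: each centroid becomes the mean of its assigned points
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Alternating the assignment and update steps never increases J, which is why the iteration settles down.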

Steps

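(01) Step 1: define the Euclidean distance and sample the initial centroids. Here k is the total number of clusters.

    import numpy as np

    # calculate the Euclidean distance between two vectors
    def calculate_distance(vector1, vector2):
        return np.sqrt(np.sum(np.square(vector1 - vector2)))

    # initialize centroids: pick k distinct data points at random
    def initialize_centroids(data, k):
        import random
        # list(data) turns the array into a sequence of rows,
        # which random.sample requires
        return random.sample(list(data), k)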
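As a quick sanity check of the two ideas in this step, the sketch below computes one distance by hand and draws reproducible starting centroids. The helper names `euclidean_distance` and `sample_centroids`, the `seed` parameter, and the sample points are assumptions made for illustration, not part of the original script.

```python
import numpy as np

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def sample_centroids(data, k, seed=None):
    """Pick k distinct rows of data as starting centroids (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # replace=False guarantees that no row is chosen twice
    indices = rng.choice(len(data), size=k, replace=False)
    return data[indices]

# Made-up sample points for illustration
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [1.0, 1.0]])

d = euclidean_distance(points[0], points[1])  # sides 3 and 4, so the distance is 5.0
start = sample_centroids(points, k=2, seed=0)
```

Passing a fixed seed makes a run repeatable, which helps when comparing results, since K-means can converge to different clusterings from different starting centroids.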

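(02) Step 2: assign every point to its nearest centroid to form the new clusters, then recompute the centroids.

    # find the minimum distance from each point to the centroids
    def minimun_distance(data, centroidlist):
        clusterdictionary = dict()
        for i in data:
            vector1 = i
            marker = 0
            min_dist = float('inf')
            for j in range(len(centroidlist)):
                vector2 = centroidlist[j]
                distance = calculate_distance(vector1, vector2)
                if distance < min_dist:
                    min_dist = distance
                    marker = j
            # marker is the index of the nearest centroid
            if marker not in clusterdictionary.keys():
                clusterdictionary[marker] = list()
            clusterdictionary[marker].append(i)
        return clusterdictionary

    # get centroids: the mean of the points in each cluster
    def getcentroids(clusterdictionary):
        import numpy as np
        centroidlist = list()
        for key in clusterdictionary.keys():
            centroid = np.mean(np.array(clusterdictionary[key]), axis=0)
            centroidlist.append(centroid)
        return np.array(centroidlist)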
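The same assignment and update can also be written without explicit Python loops over the points. The following is a minimal sketch using NumPy broadcasting. The function name `assign_and_update` and the sample data are assumptions for illustration, and it keeps the old centroid when a cluster ends up empty, which differs from the dictionary-based version above.

```python
import numpy as np

def assign_and_update(data, centroids):
    """One K-means iteration: nearest-centroid labels, then per-cluster means."""
    # distances[i, j] is the distance from point i to centroid j (via broadcasting)
    distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    new_centroids = centroids.copy()
    for j in range(len(centroids)):
        members = data[labels == j]
        if len(members) > 0:  # an empty cluster keeps its previous centroid
            new_centroids[j] = members.mean(axis=0)
    return labels, new_centroids

# Made-up data: two obvious groups, near (0, 1) and near (10, 11)
data = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
centroids = np.array([[1.0, 1.0], [9.0, 9.0]])

labels, new_centroids = assign_and_update(data, centroids)
# labels is [0, 0, 1, 1]; new_centroids is [[0, 1], [10, 11]]
```

Returning an array of labels rather than a dictionary keeps centroid index j tied to cluster j across iterations.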

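(03) Step 3: load the data and iterate, leaving the loop once the change in the total point-to-centroid distance between iterations falls below a threshold.

    # get the deviation: sum of distances from every point to its centroid
    def getmsd(clusterdictionary, centroidlist):
        sum = 0.0
        for key in clusterdictionary.keys():
            vector1 = centroidlist[key]
            distance = 0.0
            for i in clusterdictionary[key]:
                vector2 = i
                distance += calculate_distance(vector1, vector2)
            sum += distance
        return sum

    # show result: points as circles, centroids as diamonds, one color per cluster
    def showresult(clusterdictionary, centroidlist):
        import matplotlib.pyplot as plt
        colormark = ['or', 'ob', 'og', 'ok']
        centroidmark = ['dr', 'db', 'dg', 'dk']
        for key in clusterdictionary.keys():
            plt.plot(centroidlist[key][0], centroidlist[key][1], centroidmark[key], markersize=12)
            for i in clusterdictionary[key]:
                plt.plot(i[0], i[1], colormark[key])
        plt.show()  # added so that the figure is actually displayed

    # path as it appears in the source; the separators and file name are missing there
    path = 'C:UsersjyjhDesktop'
    with open(path, 'r') as f:
        data = f.readlines()
    temp = list()
    for i in data:
        numlist = list()
        # each line holds tab-separated numbers
        for j in i.strip().split('\t'):
            num = float(j)
            numlist.append(num)
        temp.append(numlist)
    data = np.array(temp)

    centroidlist = initialize_centroids(data, 4)
    clusterdictionary = minimun_distance(data, centroidlist)
    new_msd = getmsd(clusterdictionary, centroidlist)
    old_msd = -0.000001
    k = 2  # iteration counter (not the number of clusters)
    while abs(new_msd - old_msd) >= 0.00001:
        centroidlist = getcentroids(clusterdictionary)
        clusterdictionary = minimun_distance(data, centroidlist)
        old_msd = new_msd
        new_msd = getmsd(clusterdictionary, centroidlist)
        k += 1
        print(new_msd - old_msd)
    showresult(clusterdictionary, centroidlist)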
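Because the data file itself is not included with the article, the sketch below shows the expected input format and the stopping test in isolation. The sample values, the in-memory stand-in for the file, and the helper name `has_converged` are all assumptions made for illustration. `np.loadtxt` is used here as a shorter alternative to parsing each line by hand.

```python
import io
import numpy as np

# Stand-in for the contents of a tab-separated data file (made-up values)
sample_text = "1.0\t2.0\n3.5\t4.5\n5.0\t6.0\n"

# np.loadtxt accepts a file name or a file-like object
data = np.loadtxt(io.StringIO(sample_text), delimiter="\t")

def has_converged(old_value, new_value, tol=1e-5):
    """True once the monitored value changes by less than tol between iterations."""
    return abs(new_value - old_value) < tol

small_change = has_converged(1.0, 1.000001)  # change of about 1e-6, below tol
large_change = has_converged(1.0, 1.1)       # change of about 0.1, above tol
```

A tolerance on the change, rather than an exact equality test, avoids looping forever on tiny floating-point differences.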

Tips

A basic understanding of K-means is assumed.

MATLAB provides a kmeans function.

Tags: python, algorithm
  • Copyright belongs to the article's author; when reposting, please credit https://miaozhigu.com/sm/diannao/mxg5k.html