Channel: Data Science, Analytics and Big Data discussions - Latest topics
k-modes: selecting optimal k

@user2816215 wrote:

I have categorical data and I’m trying to implement k-modes using the GitHub package available here. I want to split my (large) dataset into clusters of roughly 5-7 records each, so that each cluster groups the most similar records. I plan to add a few more restrictions on how these clusters are formed later. For now, because my data is completely categorical, k-modes seemed like the natural choice.

However, I currently have no way to select the optimal ‘k’, ideally the one that maximizes the silhouette score. The silhouette score seems like a good fit because k-modes uses a dissimilarity measure as its distance. I would assume the silhouette can be computed from that same dissimilarity, so that it measures how close together or far apart the clusters are under the metric k-modes itself uses. I have not been able to find a correct implementation of this.
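To make the idea concrete, here is a minimal sketch of what I mean, in plain Python. It assumes the dissimilarity is simple matching (the number of attributes on which two records differ), which is my understanding of the usual k-modes measure; the function names and the toy data are made up for illustration and are not from any package.

```python
from collections import defaultdict


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def silhouette_score_categorical(records, labels):
    """Mean silhouette over all records, using matching dissimilarity.

    records: sequence of equal-length tuples of categorical values
    labels:  cluster label for each record (from any clustering run)
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, own in enumerate(labels):
        own_members = clusters[own]
        if len(own_members) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean dissimilarity to the other members of the same cluster
        a = sum(
            matching_dissimilarity(records[i], records[j])
            for j in own_members if j != i
        ) / (len(own_members) - 1)
        # b: smallest mean dissimilarity to the members of any other cluster
        b = min(
            sum(matching_dissimilarity(records[i], records[j]) for j in members)
            / len(members)
            for label, members in clusters.items() if label != own
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(records)


# Toy data (hypothetical): two obvious groups of identical records.
toy_records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
good_labels = [0, 0, 1, 1]  # matches the real grouping
bad_labels = [0, 1, 0, 1]   # mixes the two groups

print(silhouette_score_categorical(toy_records, good_labels))
print(silhouette_score_categorical(toy_records, bad_labels))
```

The plan would be to run k-modes for a range of k, pass each run's labels to this function, and keep the k with the highest score. This version is quadratic in the number of records, so on a large dataset I would probably have to score a random sample. If the categories are first encoded as integers, scikit-learn's `silhouette_score` with `metric="hamming"` or `metric="precomputed"` might do the same job, but I have not verified that it matches the dissimilarity k-modes uses.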

Posts: 1

Participants: 1
