The other vignette focuses on reproducing a single clustering workflow and assumes that the number of clusters has already been decided. Because the app includes a few options for evaluating clusters, some of those functions are also made available in the package. The output of the clustering functions can also be used with other packages.
library(dplyr)  # provides %>% and select()
numeric_data <- iris %>% select(Sepal.Length, Sepal.Width, Petal.Width)
dmat <- compute_dmat(numeric_data, "euclidean", TRUE)
clusters <- compute_clusters(dmat, "complete")
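For comparison, the steps above can be approximated with base R alone. This is a minimal sketch, not the package's implementation: it assumes that the third argument of compute_dmat requests scaling and that compute_clusters performs standard hierarchical clustering, neither of which is confirmed here.

```r
# Base R stand-in (assumption: scaled Euclidean distances followed by
# complete-linkage hierarchical clustering)
numeric_data <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Width")]
dmat <- stats::dist(scale(numeric_data), method = "euclidean")
hc <- stats::hclust(dmat, method = "complete")

# Cut the tree into three groups and cross-tabulate against the species
labels <- stats::cutree(hc, k = 3)
table(labels, iris$Species)
```

Any function that accepts an hclust object or a vector of cluster labels can take over from this point.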
For the gap statistic, the optimal number of clusters depends on the method used to compare cluster solutions. The package cluster includes the function cluster::maxSE() to help with that.
gap_results <- compute_gapstat(scale(numeric_data), clusters)
optimal_k <- cluster::maxSE(gap_results$gap, gap_results$SE.sim)
line_plot(gap_results, "k", "gap", xintercept = optimal_k)
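To see how much the choice of method matters, cluster::maxSE() can be called once per method on the same values. The gap values and standard errors below are made up purely for illustration; they are not output from compute_gapstat.

```r
# Made-up gap values and standard errors for k = 1..5 (illustration only)
gap <- c(0.20, 0.50, 0.45, 0.60, 0.58)
se  <- rep(0.02, 5)

# The five selection rules offered by cluster::maxSE()
methods <- c("firstSEmax", "Tibs2001SEmax", "globalSEmax",
             "firstmax", "globalmax")

# One suggested number of clusters per rule
picks <- sapply(methods, function(m) cluster::maxSE(gap, se, method = m))
picks
```

With these values the curve has an early local peak and a later global one, so "firstmax" and "globalmax" disagree about the number of clusters.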
The Shiny app also includes various other measures computed by [clusterCrit::intCriteria()]. The function compute_metric works similarly to compute_gapstat, and optimal_score is similar to maxSE. However, optimal_score only offers a choice between the first and the global minimum or maximum.
res <- compute_metric(scale(numeric_data), clusters, "Dunn")
optimal_k <- optimal_score(res$score)
line_plot(res, "k", "score", optimal_k)
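The difference between a "first" and a "global" optimum can be sketched in a few lines of base R. The helper pick_k below is hypothetical and written only for illustration; it is not the package's optimal_score, and its argument names are assumptions.

```r
# Hypothetical helper: position of the first or global extremum of a score
pick_k <- function(scores,
                   rule = c("firstmax", "globalmax", "firstmin", "globalmin")) {
  rule <- match.arg(rule)
  # Flip the sign so that minimisation reuses the maximisation logic
  s <- if (rule %in% c("firstmin", "globalmin")) -scores else scores
  if (rule %in% c("globalmax", "globalmin")) {
    return(which.max(s))
  }
  # First position after which the score stops increasing
  not_increasing <- diff(s) <= 0
  if (any(not_increasing)) which.max(not_increasing) else length(s)
}

# Made-up scores (illustration only)
scores <- c(0.10, 0.30, 0.25, 0.40, 0.35)
first_peak  <- pick_k(scores, "firstmax")
global_peak <- pick_k(scores, "globalmax")
```

The result is a position in the score vector, so it equals the number of clusters only when the scores start at k = 1. Whether the maximum or the minimum is the right target depends on the metric being evaluated.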