7.10 kmeans visualise
A visualisation of the model's predictions provides a good indication of the quality of the clustering. The visualise command generates such a plot.
Common usage:
General usage:
The datafile is a CSV file with named numeric columns and a label column. A visualisation of the cluster membership of each observation is generated.
If the dataset has more than two input variables, as is the case above, then a principal components analysis (PCA) is performed, and the two most significant components (PC1 and PC2), those that capture the most variance, are plotted.
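The PCA step can be illustrated with a minimal sketch. This is not the visualise command's own implementation, only an illustration of the technique under stated assumptions: it uses NumPy, the function name `project_to_2d` is hypothetical, and the small dataset and its labels are made up for the example.

```python
import numpy as np


def project_to_2d(X):
    """Project the rows of X onto the first two principal components.

    Returns the (n, 2) matrix of PC1/PC2 scores and the proportion of
    variance explained by each of the two components.
    """
    X = np.asarray(X, dtype=float)
    # PCA operates on mean-centred data.
    centred = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(centred, full_matrices=False)
    scores = centred @ Vt[:2].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:2]


# Made-up data: six observations, three numeric columns, two clusters.
X = np.array([
    [1.0, 1.2, 0.9],
    [0.8, 1.0, 1.1],
    [1.1, 0.9, 1.0],
    [5.0, 5.1, 4.9],
    [4.9, 5.2, 5.0],
    [5.2, 4.8, 5.1],
])
labels = np.array([0, 0, 0, 1, 1, 1])  # hypothetical cluster labels

scores, explained = project_to_2d(X)
```

The two columns of `scores` are the PC1 and PC2 coordinates. Passing them to a scatter plot, coloured by `labels`, gives the kind of two-dimensional view of cluster membership described above.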
A complete pipeline clusters the data, predicts the cluster membership of each observation, and then visualises the clusters.
