K-means clustering algorithm groups unlabeled data knowledge point notes
Edited at 2022-12-07 21:54:35
Introduction
Cluster identification
Gives meaning to the clustering results
The difference between clustering and classification
In classification the target categories are known in advance; in clustering they are not
K-means clustering algorithm
Advantages
Easy to implement
Shortcomings
May converge to a local minimum
Converges slowly on large datasets
Applicable data types
Numeric values
Working process
Randomly select k initial points as centroids
Assign each point in the dataset to a cluster (the one with the nearest centroid)
After assignment, update each cluster's centroid to the mean of all its points
Pseudocode
Create k points as initial centroids
While the cluster assignment of any point changes:
    For each point in the dataset:
        For each centroid:
            Calculate the distance between the centroid and the point
        Assign the point to the cluster with the nearest centroid
    For each cluster, recompute the centroid as the mean of all points in the cluster
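The pseudocode above can be sketched as a minimal NumPy implementation (function and variable names here are illustrative, not from any particular library):

```python
import numpy as np

def kmeans(data, k, seed=0):
    """Minimal k-means following the pseudocode above."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # create k points as initial centroids (random sample of the data)
    centroids = data[rng.choice(n, size=k, replace=False)].astype(float)
    assignments = np.full(n, -1)
    changed = True
    while changed:  # while any point's cluster assignment changes
        changed = False
        for i in range(n):
            # distance from this point to every centroid
            dists = np.linalg.norm(centroids - data[i], axis=1)
            nearest = int(dists.argmin())
            if nearest != assignments[i]:
                assignments[i] = nearest
                changed = True
        # update each centroid to the mean of the points assigned to it
        for j in range(k):
            members = data[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assignments
```

Because the initial centroids are chosen at random, different runs can settle into different partitions, which is exactly the local-minimum weakness noted above.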
General process
Data collection
Any method
Prepare data
Numeric data is required to compute distances
Nominal features must be mapped to binary values
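One common way to map nominal data to binary values is one-hot encoding: each category becomes its own 0/1 column. A minimal sketch (the color feature below is a made-up example):

```python
def one_hot(values):
    # one binary column per distinct category, in sorted order
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# a nominal "color" feature becomes three binary columns (blue, green, red)
encoded = one_hot(["red", "green", "red", "blue"])
```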
Analyze data
Any method
Train the algorithm
Not needed: k-means is unsupervised learning, so there is no training step
Test the algorithm
Apply the clustering algorithm and inspect the results
Results can be evaluated with quantitative error measures such as the sum of squared errors
Use the algorithm
Any desired application
Typically the cluster centroids can be used to represent the whole cluster's data when making decisions
Use post-processing to improve clustering performance
Measuring clustering quality
SSE
Sum of squared errors
The smaller the value, the closer the points are to their centroids and the better the clustering
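SSE can be computed directly from the data, centroids, and assignments; a small NumPy sketch (names are illustrative):

```python
import numpy as np

def sse(data, centroids, assignments):
    # sum of squared distances from each point to its assigned centroid
    diffs = data - centroids[assignments]
    return float((diffs ** 2).sum())
```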
Post-processing
Split the cluster with the largest SSE into two clusters
Filter out the points of that cluster and run k-means on them with k=2
To keep the total number of clusters unchanged, two clusters can then be merged
Merge the nearest centroids
Calculate the distances between all centroids and merge the closest pair
Merge the two clusters that give the smallest increase in total SSE
Merge each candidate pair of clusters and compute the resulting total SSE
Bisecting k-means algorithm
Purpose
Mitigates k-means' tendency to converge to a local minimum
Pseudocode
Treat all points as one cluster
While the number of clusters is less than k:
    For each cluster:
        Calculate the total error
        Run k-means on the cluster with k=2
        Calculate the total error after splitting the cluster in two
    Choose the split that minimizes the total error
Alternative approach
Split the cluster with the largest SSE
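The bisecting strategy above can be sketched as follows (a simplified NumPy version with a fixed-iteration inner k-means; all names are illustrative):

```python
import numpy as np

def two_means(points, rng, iters=20):
    """Basic k-means with k=2, used to split one cluster's points."""
    centroids = points[rng.choice(len(points), size=2, replace=False)].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):  # fixed iteration cap keeps the sketch simple
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in (0, 1):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

def cluster_sse(points):
    """Total squared error of one cluster around its own mean."""
    return float(((points - points.mean(axis=0)) ** 2).sum())

def bisecting_kmeans(data, k, seed=0):
    rng = np.random.default_rng(seed)
    clusters = [data]                      # treat all points as one cluster
    while len(clusters) < k:               # while the number of clusters is < k
        best = None
        for i, cl in enumerate(clusters):
            if len(cl) < 2:
                continue                   # cannot split a single point
            labels = two_means(cl, rng)
            if labels.min() == labels.max():
                continue                   # degenerate split, skip it
            # total error after splitting this cluster in two,
            # plus the error of all other, unsplit clusters
            err = sum(cluster_sse(cl[labels == j]) for j in (0, 1))
            err += sum(cluster_sse(c) for jx, c in enumerate(clusters) if jx != i)
            if best is None or err < best[0]:
                best = (err, i, labels)
        err, i, labels = best              # keep the split that minimizes total error
        cl = clusters.pop(i)
        clusters += [cl[labels == 0], cl[labels == 1]]
    return clusters
```

Each pass splits exactly one cluster, so the loop adds one cluster per iteration until k is reached; trying every cluster and keeping the lowest-error split is what makes the result less sensitive to a bad random start.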
Example: Clustering points on a map
Yahoo! PlaceFinder API
Requires registering to obtain an API key
Other methods
"Living Data" Ch8: Geopy
Cluster geographic coordinates
Spherical law of cosines
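For clustering geographic coordinates, the spherical law of cosines gives the great-circle distance between two latitude/longitude points; a small sketch (Earth radius assumed as 6371 km):

```python
from math import acos, cos, sin, radians

def sphere_dist(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance (km) between two points given in degrees,
    via the spherical law of cosines."""
    p1, p2 = radians(lat1), radians(lat2)
    c = sin(p1) * sin(p2) + cos(p1) * cos(p2) * cos(radians(lon2 - lon1))
    # clamp to [-1, 1] to guard against floating-point rounding before acos
    return radius_km * acos(max(-1.0, min(1.0, c)))
```

Using this instead of plain Euclidean distance matters on a map: latitude/longitude degrees are not uniform units of length, but great-circle kilometers are.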