K-means clustering algorithm groups unlabeled data knowledge point notes
Edited at 2022-12-07 21:54:35
Introduction
Cluster identification
Gives meaning to the clustering results
The difference between clustering and classification
In classification the target categories are known in advance; in clustering they are not
K-means clustering algorithm
Advantages
Easy to implement
Shortcomings
May converge to a local minimum
Converges slowly on large datasets
Applicable data types
Numeric values
Working process
Randomly select k initial points as centroids
Assign each point in the dataset to a cluster (the one with the nearest centroid)
After assignment, update each cluster's centroid to the mean of all its points
Pseudocode
Create k points as initial centroids
While the cluster assignment of any point changes:
    For each point in the dataset:
        For each centroid:
            Calculate the distance between the centroid and the point
        Assign the point to the cluster with the nearest centroid
    For each cluster, recompute the centroid as the mean of all points in the cluster
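The pseudocode above can be sketched as a minimal NumPy implementation (function and variable names here are illustrative, not from any particular library):

```python
import numpy as np

def kmeans(data, k, seed=0):
    """Minimal k-means following the pseudocode above."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # create k points as initial centroids (random sample of the data)
    centroids = data[rng.choice(n, size=k, replace=False)].astype(float)
    assignments = np.full(n, -1)
    changed = True
    while changed:  # while any point's cluster assignment changes
        changed = False
        for i in range(n):
            # distance from this point to every centroid
            dists = np.linalg.norm(centroids - data[i], axis=1)
            nearest = int(dists.argmin())
            if nearest != assignments[i]:
                assignments[i] = nearest
                changed = True
        # update each centroid to the mean of the points assigned to it
        for j in range(k):
            members = data[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assignments
```

Because the initial centroids are chosen at random, different runs can settle into different partitions, which is exactly the local-minimum weakness noted above.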
General process
Data collection
Any method
Prepare data
Numeric data is required to compute distances
Nominal features must be mapped to binary values
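One common way to map nominal data to binary values is one-hot encoding: each category becomes its own 0/1 column. A minimal sketch (the color feature below is a made-up example):

```python
def one_hot(values):
    # one binary column per distinct category, in sorted order
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# a nominal "color" feature becomes three binary columns (blue, green, red)
encoded = one_hot(["red", "green", "red", "blue"])
```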
Analyze data
Any method
Train the algorithm
Not needed: k-means is unsupervised learning, so there is no training step
Test the algorithm
Apply the clustering algorithm and inspect the results
Results can be evaluated with quantitative error measures such as the sum of squared errors
Use the algorithm
Any desired application
Typically the cluster centroids can be used to represent the whole cluster's data when making decisions
Use post-processing to improve clustering performance
Measuring clustering quality
SSE
Sum of squared errors
The smaller the value, the closer the points are to their centroids and the better the clustering
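SSE can be computed directly from the data, centroids, and assignments; a small NumPy sketch (names are illustrative):

```python
import numpy as np

def sse(data, centroids, assignments):
    # sum of squared distances from each point to its assigned centroid
    diffs = data - centroids[assignments]
    return float((diffs ** 2).sum())
```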
Post-processing
Split the cluster with the largest SSE into two clusters
Filter out the points of that cluster and run k-means on them with k=2
To keep the total number of clusters unchanged, two clusters can then be merged
Merge the nearest centroids
Calculate the distances between all centroids and merge the closest pair
Merge the two clusters that give the smallest increase in total SSE
Merge each candidate pair of clusters and compute the resulting total SSE
Bisecting k-means algorithm
Purpose
Mitigates k-means' tendency to converge to a local minimum
Pseudocode
Treat all points as one cluster
While the number of clusters is less than k:
    For each cluster:
        Calculate the total error
        Run k-means on the cluster with k=2
        Calculate the total error after splitting the cluster in two
    Choose the split that minimizes the total error
Alternative approach
Split the cluster with the largest SSE
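The bisecting strategy above can be sketched as follows (a simplified NumPy version with a fixed-iteration inner k-means; all names are illustrative):

```python
import numpy as np

def two_means(points, rng, iters=20):
    """Basic k-means with k=2, used to split one cluster's points."""
    centroids = points[rng.choice(len(points), size=2, replace=False)].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):  # fixed iteration cap keeps the sketch simple
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in (0, 1):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

def cluster_sse(points):
    """Total squared error of one cluster around its own mean."""
    return float(((points - points.mean(axis=0)) ** 2).sum())

def bisecting_kmeans(data, k, seed=0):
    rng = np.random.default_rng(seed)
    clusters = [data]                      # treat all points as one cluster
    while len(clusters) < k:               # while the number of clusters is < k
        best = None
        for i, cl in enumerate(clusters):
            if len(cl) < 2:
                continue                   # cannot split a single point
            labels = two_means(cl, rng)
            if labels.min() == labels.max():
                continue                   # degenerate split, skip it
            # total error after splitting this cluster in two,
            # plus the error of all other, unsplit clusters
            err = sum(cluster_sse(cl[labels == j]) for j in (0, 1))
            err += sum(cluster_sse(c) for jx, c in enumerate(clusters) if jx != i)
            if best is None or err < best[0]:
                best = (err, i, labels)
        err, i, labels = best              # keep the split that minimizes total error
        cl = clusters.pop(i)
        clusters += [cl[labels == 0], cl[labels == 1]]
    return clusters
```

Each pass splits exactly one cluster, so the loop adds one cluster per iteration until k is reached; trying every cluster and keeping the lowest-error split is what makes the result less sensitive to a bad random start.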
Example: Clustering points on a map
Yahoo! PlaceFinder API
Requires registering to obtain an API key
Other methods
"Living Data" Ch8: Geopy
Cluster geographic coordinates
Spherical law of cosines
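For clustering geographic coordinates, the spherical law of cosines gives the great-circle distance between two latitude/longitude points; a small sketch (Earth radius assumed as 6371 km):

```python
from math import acos, cos, sin, radians

def sphere_dist(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance (km) between two points given in degrees,
    via the spherical law of cosines."""
    p1, p2 = radians(lat1), radians(lat2)
    c = sin(p1) * sin(p2) + cos(p1) * cos(p2) * cos(radians(lon2 - lon1))
    # clamp to [-1, 1] to guard against floating-point rounding before acos
    return radius_km * acos(max(-1.0, min(1.0, c)))
```

Using this instead of plain Euclidean distance matters on a map: latitude/longitude degrees are not uniform units of length, but great-circle kilometers are.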