MindMap Gallery Machine learning algorithm linear regression decision tree notes self-study mind map
Machine learning algorithm linear regression decision number notes self-study complete sharing! The content covers K-nearest neighbor algorithm, linear regression, logistic regression, decision tree, ensemble learning and clustering.
Edited at 2023-02-25 09:44:36This is a panoramic infographic—currently sweeping across the web—illustrating the comprehensive applications of OpenClaw, a popular open-source AI agent platform. It systematically introduces this intelligent agent framework—affectionately dubbed "Lobster Farming"—helping readers quickly grasp its core value, technical features, application scenarios, and security protocols. It serves as an excellent introductory guide and practical manual.
這是一張最近風靡全網關於熱門開源AI代理平台OpenClaw的全網應用全景圖解。它系統性地介紹了這款被稱為「養龍蝦」的智慧體框架,幫助讀者快速理解其核心價值、技術特性、應用場景及安全規範,是一份極佳的入門指南與實操手冊。此圖主要針對希望利用AI建構自動化工作流程的技術從業人員、中小企業主及效率追求者,透過9大模組層層遞進,全面剖析了OpenClaw從概念到落地的整個過程。 圖中核心內容首先釐清了「養龍蝦」指涉的是OpenClawd開源智能體,並強調其本質是「AI基建」而非一般聊天機器人。隨後詳細比較其與傳統AI助理的區別,擁有記憶管理、權限控制、會話隔離和異常恢復四大基礎能力,支援跨平台存取和多模型相容(如GPT、Claude、Ollama)。同時,圖解提供了完整的部署方案(雲端/本地/Docker),並列舉了辦公室自動化、內容創作、資料收集等五大應用程式場景。此外,還展示了其火爆程度、政府與大廠佈局、安全部署建議及適合/不適合的人群分類。幫助你快速掌握OpenClaw技術架構與應用價值,指導個人或企業建構AI自動化系統,規避資料外洩與權限失控風險,是學習「執行式AI」轉型的權威參考圖譜。
本圖由萬興腦圖繪製,是針對IT研發崗位的結構化個人履歷模板,完整涵蓋求職核心資訊模組。基本資訊區包含姓名、電話、信箱、求職意願及GitHub連結;專業概要要求以2-3句提煉核心優勢;工作經驗以「公司A高級Java開發工程師」為例,以「透過(行動),達成(量化成果)」格式呈現微服務架構設計、系統效能優化、團隊技術規範制定等職責,公司B經歷則聚焦功能模組開發與Elasticsearch搜尋優化;技能專長分程式語言、後端框架、中介軟體、資料庫、容器雲等維度,清楚展示技術堆疊;專案成果以「電商平台秒殺系統」為例,說明技術棧、架構設計、個人貢獻(Redis Lua庫存原子扣減)及KPI;教育背景包含一流大學電腦專業學歷,以及AWS認證解決方案架構師、軟考中級軟體設計師證書。模板邏輯嚴謹,涵蓋IT研發求職全流程關鍵訊息,幫助求職者清晰、量化展示專業能力。
This is a panoramic infographic—currently sweeping across the web—illustrating the comprehensive applications of OpenClaw, a popular open-source AI agent platform. It systematically introduces this intelligent agent framework—affectionately dubbed "Lobster Farming"—helping readers quickly grasp its core value, technical features, application scenarios, and security protocols. It serves as an excellent introductory guide and practical manual.
這是一張最近風靡全網關於熱門開源AI代理平台OpenClaw的全網應用全景圖解。它系統性地介紹了這款被稱為「養龍蝦」的智慧體框架,幫助讀者快速理解其核心價值、技術特性、應用場景及安全規範,是一份極佳的入門指南與實操手冊。此圖主要針對希望利用AI建構自動化工作流程的技術從業人員、中小企業主及效率追求者,透過9大模組層層遞進,全面剖析了OpenClaw從概念到落地的整個過程。 圖中核心內容首先釐清了「養龍蝦」指涉的是OpenClawd開源智能體,並強調其本質是「AI基建」而非一般聊天機器人。隨後詳細比較其與傳統AI助理的區別,擁有記憶管理、權限控制、會話隔離和異常恢復四大基礎能力,支援跨平台存取和多模型相容(如GPT、Claude、Ollama)。同時,圖解提供了完整的部署方案(雲端/本地/Docker),並列舉了辦公室自動化、內容創作、資料收集等五大應用程式場景。此外,還展示了其火爆程度、政府與大廠佈局、安全部署建議及適合/不適合的人群分類。幫助你快速掌握OpenClaw技術架構與應用價值,指導個人或企業建構AI自動化系統,規避資料外洩與權限失控風險,是學習「執行式AI」轉型的權威參考圖譜。
本圖由萬興腦圖繪製,是針對IT研發崗位的結構化個人履歷模板,完整涵蓋求職核心資訊模組。基本資訊區包含姓名、電話、信箱、求職意願及GitHub連結;專業概要要求以2-3句提煉核心優勢;工作經驗以「公司A高級Java開發工程師」為例,以「透過(行動),達成(量化成果)」格式呈現微服務架構設計、系統效能優化、團隊技術規範制定等職責,公司B經歷則聚焦功能模組開發與Elasticsearch搜尋優化;技能專長分程式語言、後端框架、中介軟體、資料庫、容器雲等維度,清楚展示技術堆疊;專案成果以「電商平台秒殺系統」為例,說明技術棧、架構設計、個人貢獻(Redis Lua庫存原子扣減)及KPI;教育背景包含一流大學電腦專業學歷,以及AWS認證解決方案架構師、軟考中級軟體設計師證書。模板邏輯嚴謹,涵蓋IT研發求職全流程關鍵訊息,幫助求職者清晰、量化展示專業能力。
Machine learning algorithm linear regression decision tree notes self-study mind map
K-nearest neighbor algorithm
Algorithm principles and API
The concept and principle of K-nearest neighbor algorithm
concept
If most of the k most similar (i.e. nearest neighbors in feature space) samples of a sample belong to a certain category, then the sample also belongs to this category.
feature space
A space established using all features as coordinate axes. In the feature space, each sample is a point.
[Key points] Principle and implementation steps
1. Calculate the distance between the point in the known category data set and the current point
2. Arrange these distances in ascending order
3. Select the k points with the smallest distance from the current point
4. Count the frequency of occurrence of the category where the first k points are located (find the mode of the K categories)
5. Use the mode category as the category of the current point
K-Nearest Neighbor Algorithm API
Introduction to Scikit-learn
Machine learning tools that implement numerous machine learning algorithms
API
sklearn.neighbors.KNeighborsclassfier(n_neighbors)
Algorithm calculation and optimization
measure of distance
Euclidean distance
Manhattan distance (block distance)
Chebyshev distance
Minkowski distance (Min's distance)
Normalized Euclidean distance
cosine distance
Hamming distance
String lengths are equal
Calculate distance of string
Jaccard distance
distance between two sets
Mahalanobis distance
The impact of the choice of K value on the model
Performance of underfitting and overfitting
approximation error
The error of the model on the training set
estimation error
The error of the model on the test set
Underfitting
The approximation error is large and the estimation error is large. To put it bluntly, the model performs poorly on both the training set and the test set.
overfitting
The approximation error is small, but the estimation error is large. To put it bluntly, the model performs well on the training set but not on the test set.
K value is too small
Largely affected by outliers
The model is complex, has strong learning ability, and is prone to overfitting.
K value is too large
The impact of sample imbalance is large
The model is simple, has weak learning ability, and is prone to underfitting.
kd tree
kd tree introduction
tree concept
root node
Nodes with only child nodes and no parent nodes
internal node
Nodes that have child nodes and parent nodes
leaf node
Nodes with only parent nodes and no child nodes
Binary tree
A tree with at most two forked nodes
The role of kd tree
Reduce the number of calculations of sample distance and quickly find the nearest neighbor point
Construction method of kd tree
1. Randomly select a feature, take the median of this feature as the dividing point to divide the data into two equal parts. The selected feature is the dividing feature of the current node, and the median point is used as the current node. On this feature, points smaller than the median are classified into the left node, and points larger than the median are classified into the right node;
2. Repeat the first step for the data on the left node and the right node respectively;
3. Until all samples are placed on the node
How to find the nearest point in kd tree
1. Compare the query point M with the dividing characteristics and corresponding median of each node on the kd tree, and continue to compare downward until reaching the leaf node. Record in order the nodes traveled through the entire process search_path;
2. The node at the end of search_path is N, do not take out N, record the distance between dist=M and N, nearest=N;
3. Take out a node L from the end of search_path, the dividing axis of this node is x, calculate the distance a from point M to the x axis, and then handle it in two cases:
If a<dist, then all points in the space on the other side divided by node L and the current nearest point are included in the investigation range, the distances from all points in the investigation range to point M are calculated, the nearest point is found, and the point is recorded as nearest. The distance from this point to M is dist, discard node L, and the step ends;
If a>=dist, discard node L and the step ends
4. Repeat step 3 until search_path is empty;
5. Nearest is the point closest to the search point, and the nearest distance is dist.
Case 1
Introduction to the dataset API of scikit-learn
sklearn small data set
API: load_* Such as: load_iris()
sklearn big data set
API: fetch_* Such as: fetch_20newsgroups(sub_set='train')
Parameter description: sub_set='train' specifies the type of data set to be obtained
Introduction to sklearn data set return value
Data type: datasets.base.Bunch (dictionary format)
data: feature data array
target: array of labels (target values)
feature_names: feature names
target_names: tag names
keys() gets all attributes (fields) of the dictionary
Draw a scatter plot of data and find outliers
sns.lmplot(col1, col2, data, hue, fit_reg)
Partition of the data set
x_train, x_test, y_train, y_test = trian_test_split(x, y, test_size)
Feature preprocessing using sklearn
Normalized
Disadvantages of normalization: greatly affected by outliers
API: MinMaxScaler(feature_range)
Function: Convert data to between 0 and 1
standardization
API:StandarScalar()
Function: Convert data to mean=0, std=1
Iris flower species prediction
Case 2
Cross-validation and grid search
Cross-validation
Divide the training set evenly into N parts, take a different part as the verification set, and the other part as the training set, train and verify the model performance, and use the average of the N times of model performance as the performance of the model on this training set.
grid search
Find the optimal combination of hyperparameters
API
sklearn.model_selection.GridSearchCV(estimator, param_grid, cv)
Predict Facebook check-in location
linear regression
Introduction to Linear Regression
Mathematical formula (mathematical model) of linear regression
h(w) = w1*x1 w2*x2 w3*x3 ... wn*xn b
The concept of hyperplane
The n-1 dimensional linear relationship in n-dimensional space is called a hyperplane in n-dimensional space.
API
sklearn.linear_model.LinearRegression()
Losses and Optimization for Linear Regression
loss function
1. The loss function is a function of the trainable parameters
2. The smaller the value of the loss function, the closer the model’s predicted value is to the true value.
Optimization method of linear regression
normal equation
Calculate optimal parameters directly
NOTE: Applicable only to linear regression models with least squares loss
gradient descent method
Continuously iterate through gradients to find optimal trainable parameters
General optimization methods
Loss function Gradient descent is the most common model optimization method
gradient descent method
Full Gradient Descent Algorithm (FG)
Stochastic Gradient Descent Algorithm (SG)
Stochastic Average Gradient Descent Algorithm (SAG)
Mini-batch gradient descent algorithm (mini-bantch)
Case-Boston House Price Forecast
Regression performance evaluation API: sklearn.metrics.mean_squared_error(y_true, y_pred)
Normal equation optimization linear regression API: sklearn.linear_model.LinearRegression()
Optimizing linear regression with stochastic gradient descent: sklearn.linear_model.SGDRegressor()
Overfitting and underfitting
Underfitting
Definition: The model performs poorly on both the training set and the test set
Solution: Increase model complexity
Increase the number of features of the data
Add polynomial terms
overfitting
Definition: The model performs well on the training set, but does not perform well on the test set
Solution: Reduce model complexity
Clean the data again
Increase the amount of training data
Regularization
L1 regularization: You can make some of the W values directly 0, removing the influence of this feature. Can be used for feature selection
L2 regularization: It can make some of the W's very small and close to 0, weakening the influence of a certain feature.
Reduce the number of features
regularized linear model
ridge regression
Linear regression L2 regularization
Lasso returns
Linear regression L1 regularization
elastic network
Linear regression L1 L2
ridge regression
Linear regression with L2 regularization
API: Ridge(alpha)
Saving and loading models
Save: joblib.dump(estimator, path)
Loading: joblib.load(path)
logistic regression
The principles of logistic regression
Mathematical model: linear regression activation function (sigmoid)
The role of the activation function: increase the nonlinear fitting ability of the model
Loss function: Log-likelihood loss
Optimization method: gradient descent
API: sklearn.linear_model.LogisticRegression()
How to evaluate classification models
Classification evaluation report API: classification_report(y_true, y_pred, label, target_names)
ROC curve
TPR = TP / (TP FN)
FPR = FP / (FP TN)
Adjust the threshold to obtain multiple (FPR, TPR) points and draw the ROC curve
AUC indicator
Meaning: randomly select a pair of positive and negative samples, the probability that the score of the positive sample is greater than that of the negative sample
API: roc_auc_score(y_true, y_score), note: y_true must use 0, 1 to mark false cases and positive cases
decision tree
Introduction to decision tree algorithm
Decision tree is a tree structure
Each internal node represents the judgment of a feature
Each branch represents the output of a judgment result
Each leaf node represents a classification result
The principle of decision tree
Feature selection and division basis of decision tree nodes
entropy
A measure of “chaos”
entropy = -p1logp1 - p2logp2 ... pn*log(pn)
information gain
The difference in entropy before and after dividing the data set by a certain feature
Information gain = entroy(before) - entroy(after)
The greater the information gain, the better the classification method of this feature.
information gain rate
Information gain/separated information metric
Separation information measure = entropy calculated from the probability of occurrence of each category of a feature
Gini gain
Gini value: The probability that two samples randomly selected from the data set D have inconsistent target values (labels)
The smaller the Gini value, the higher the purity of the data set.
Gini gain = Gini value (before) - Gini value (after)
The greater the Gini value gain, the better this division method is.
Steps to build a decision tree
1. Start looking at all samples as a whole
2. Traverse each segmentation method of each feature and find the best segmentation (based on information gain, information gain rate, and Gini value gain)
3. According to the optimal division method, divide all samples into two parts N1 and N2, that is, two branches
4. Continue steps 2-3 for N1 and N2 until each node is "pure" enough.
According to different node selection and division basis, the decision tree is divided into
ID3 decision tree: information gain
C4.5 Decision Tree: Information Gain Rate
CART decision tree: Gini value gain (or Gini index)
cart pruning
Purpose: Reduce the number of decision tree nodes -> Reduce the complexity of the decision tree -> Prevent the decision tree from overfitting
method
Pre-pruning: Pruning when creating a decision tree
The minimum number of samples contained in each node, such as 10. If the total number of samples at the node is less than 10, no classification will be performed.
Specify the height or depth of the tree, for example, the maximum depth of the tree is 4
If the entropy of the specified node is less than a certain value, it will no longer be divided.
Post-pruning: pruning after creating the decision tree
The method is similar to pre-pruning
Feature Engineering: Feature Extraction
Dictionary feature extraction
sklearn.feature_extraction.DictVectorizer(sparse=True)
Note: This method will automatically one_hot encode discrete data
Text feature extraction
word occurrence count
sklearn.feature_extraction.text.CountVectorizer(stop_words=[])
Tf-idf text feature extraction
TF: word frequency
Refers to the frequency of a given word appearing in the document
IDF: inverse document frequency
Divide the total number of documents by the number of documents containing the word, and then take the base 10 logarithm of the quotient to get
TF-IDF
TF-IDF = TF*IDF
Function: Evaluate the importance of a word to a document set or a document in a corpus
API:sklearn.feature_extraction.text.TfidfVectorizer(stop_works)
Note: To extract features from Chinese articles, word segmentation must be performed first.
Stuttering participle: jieba.cut
Case: Titanic passenger survival prediction
Decision tree API: sklearn.tree.DecisionTreeClassifier(criterion=’gini’, max_depth=None,random_state=None)
Decision tree visualization: sklearn.tree.export_graphviz()
Summarize
Advantages of decision trees: simple, interpretable, and visualizable
Disadvantages of decision trees: easy to overfit
Solution
pruning
random forest
Ensemble learning
Introduction to ensemble learning algorithms
Generate multiple classifiers/models, each learning and making predictions independently. These predictions are ultimately combined into a combined prediction that is better than any single classification prediction.
boosting -> underfitting
bagging -> overfitting
Bagging ensemble learning
Principle: Build multiple models in parallel, and the models are independent of each other
1. Sampling: Sampling N pieces of data with random replacement
2. Learning: Use N pieces of data to learn N different models
3. Integration: Through equal voting of N models, the one with the most votes will be the final result.
random forest
What is random forest: bagging decision tree
API:sklearn.ensemble.RandomForestClassifier(n_estimators, max_depth)
advantage
Both methods can improve the generalization accuracy rate by about 2% on the original algorithm.
Simple, convenient and versatile
Boosting integration principle: Build multiple models in series, and the model built later is affected by the model built previously.
1. Initialize the training weights, make the weights equal, and train the first learner (model)
2. Calculate the error rate of the learner in the training data
3. Calculate the weight of the learner based on the error rate
4. Reweight the training data according to the weight of the learner
5. Repeat steps 1~4 m times
6. Weighted voting on the results of m models to obtain the final result
clustering
Introduction to clustering algorithms
Clustering algorithm is a typical unsupervised learning algorithm, mainly used to automatically classify similar samples into a category
The biggest difference between clustering algorithms and classification algorithms: clustering algorithms are unsupervised learning algorithms, and classification algorithms are supervised learning algorithms.
Initial use of clustering algorithm API
sklearn.cluster.KMeans(n_clusters=8)
Prediction method: Call fit_predict(X) to get the classification results
Clustering algorithm implementation process (principle)
1. Randomly select K samples as the centers of K categories;
2. Calculate the distance from all samples to the center;
3. Which category the sample is closest to the center;
4. After dividing K categories, recalculate the coordinates of the center. The calculation method is that the average value of each feature value of the samples in the category is used as the corresponding coordinates of the new center;
5. Repeat steps 2, 3, and 4 until the coordinates of the center no longer change.
Model evaluation for clustering
Sum of squared errors: the sum of the squared distances (Euclidean distance) of all samples to the corresponding category center
"Elbow" method: when the decline rate suddenly slows down, it is considered to be the best k value
SC silhouette coefficient: The value is [-1, 1]. The larger the value, the better. If it is a negative value, the sample may be classified incorrectly.
CH coefficient: The higher the score s, the better the clustering effect.
Algorithm optimization
Canopy algorithm with initial clustering
K-means
Bipartite k-means
ISODATA
kernel kmeans
Mini-batch K-Means
Feature dimensionality reduction
Feature selection
Remove low variance features
Pearson correlation coefficient
Spearman correlation coefficient
Principal Component Analysis PCA
Case: Exploring user preferences for item categories and dimensionality reduction