This is a mind map about artificial intelligence.
AI
Prediction
prediction space
If the attributes take different values, what labels will result?
Sample space
Composed of several samples
The composition of a sample
Attributes
Label
Attribute (x) -> label (y): the label is the category the computer predicts based on the attributes
Classification
definition
The output is the category of the object
K-nearest neighbor (KNN) algorithm
Method steps
1. Input the sample to be classified
2. Select K samples that are closest to the sample to be predicted.
Determining factor: distance d
1. Euclidean distance
2. Manhattan distance
3. Hamming distance
3. Vote on the K selected samples to determine the category
1. Direct voting method
2. Weighted voting method
e.g., w_i = 1/d(x, x_i), where x is the sample to be predicted and x_i is the i-th neighbor
4. Output the predicted category
shortcoming
The value of k must be chosen appropriately: if k is too small, few data points take part and the result is easily swayed by noise points; if k is too large, dissimilar samples also join the vote, hurting the accuracy of the decision. See the sketch below.
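A minimal Python sketch of steps 1-4 above, assuming Euclidean distance and the weighted vote w_i = 1/d(x, x_i); the data and function names are illustrative, not from the original map:

import numpy as np

def knn_predict(X_train, y_train, x, k=3, weighted=True):
    """Classify one sample x by the KNN steps above."""
    # Step 2: Euclidean distance from x to every training sample
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]            # indices of the k closest samples
    # Step 3: vote among the k neighbors
    votes = {}
    for i in nearest:
        # weighted voting: w_i = 1/d(x, x_i); direct voting: w_i = 1
        w = 1.0 / (d[i] + 1e-12) if weighted else 1.0
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    # Step 4: output the category with the largest (weighted) vote
    return max(votes, key=votes.get)

# Illustrative data
X = np.array([[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]])
y = np.array(["A", "A", "B", "B"])
print(knn_predict(X, y, np.array([1.1, 1.0]), k=3))  # -> "A"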
Regression
definition
The output is a continuous numeric value
linear regression
Least squares method: y = wx + b
w
b
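The two parameters above, w and b, have a closed-form least-squares solution; a minimal sketch with illustrative data:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Closed form: w = cov(x, y) / var(x), b = mean(y) - w * mean(x)
w = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b = y.mean() - w * x.mean()
print(w, b)  # roughly w = 1.94, b = 0.15 for this data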
application
Binary (two-class) classification application
A straight line divides the plane into two regions, which is equivalent to dividing samples into two categories; a sample's category is determined by the region in which it falls, as in the sketch below.
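A minimal sketch of this decision rule; the line coefficients below are illustrative:

# The line w1*x1 + w2*x2 + b = 0 splits the plane; the sign of the
# expression tells which side (hence which class) a point falls on.
def classify(point, w=(1.0, -1.0), b=0.0):
    s = w[0] * point[0] + w[1] * point[1] + b
    return "class 1" if s > 0 else "class 2"

print(classify((3.0, 1.0)))  # falls on the positive side -> "class 1"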
linear discriminant analysis
A data point's category is determined by its projection onto a line. The line must be chosen so that projections of the same category lie close together (small within-class variance) and projections of different categories lie far apart (the class centers are far apart).
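A sketch of two-class Fisher LDA under these two requirements; the closed-form direction w proportional to S_W^{-1}(m1 - m2) is standard, and the data here is illustrative:

import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0]])   # class 1 samples
X2 = np.array([[5.0, 1.0], [6.0, 2.0], [6.0, 0.0]])   # class 2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)             # class centers
# Within-class scatter matrix: sum of per-class scatter
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
w = np.linalg.solve(S_W, m1 - m2)                     # w ~ S_W^{-1}(m1 - m2)

# Classify a new point by which projected class center its projection is nearer to
x = np.array([2.0, 2.0])
print("class 1" if abs(w @ x - w @ m1) < abs(w @ x - w @ m2) else "class 2")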
Decomposition and prediction
Multi-class classification applications
Error-Correcting Output Codes (ECOC)
Encoding refers to the code value a category obtains from a classifier: category (x) ---> classifier (mapping relationship f) ---> code value (y)
For a given classifier, the code values it assigns to different categories differ; it is equivalent to a mapping f under which different x map to different y.
The code values of one category across different classifiers may be the same or different; this corresponds to the same x under different mappings f1, f2, ... yielding values of y that may or may not coincide.
However many classifiers there are, each category obtains that many code values.
OvO (one vs. one)
Each classifier distinguishes only two categories; e.g., f1 distinguishes only C1 from C2.
To distinguish n categories, n(n-1)/2 classifiers are needed; e.g., n = 4 requires 6.
The code consists only of 1 and -1; there is no neutral value such as 0.
OvR (one vs. rest)
Each classifier only decides whether a sample is Ci; e.g., f1 decides whether a sample belongs to C1 or to some category other than C1.
To distinguish n categories, n classifiers are needed.
The code consists only of 1 and -1; there is no neutral value such as 0.
MvM (many vs. many)
The categories are randomly divided into two groups, at least one of which contains two or more categories; each classifier outputs which group the sample to be tested belongs to.
The code consists of 1 and -1, and also includes the neutral value 0.
When computing the Hamming distance, a position coded 0 contributes 0.5 whether the other code is 0, 1, or -1.
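A sketch of ECOC prediction using this distance rule; the coding matrix and classifier outputs below are illustrative (an MvM-style code containing 0s):

# One codeword per category; one position per binary classifier
code = {
    "C1": [ 1,  1,  0],
    "C2": [-1,  1,  1],
    "C3": [-1, -1, -1],
}

def hamming(pred, codeword):
    d = 0.0
    for p, c in zip(pred, codeword):
        if c == 0 or p == 0:
            d += 0.5            # neutral position: counts as half a disagreement
        elif p != c:
            d += 1.0
    return d

pred = [-1, 1, 1]               # outputs of the three binary classifiers
best = min(code, key=lambda c: hamming(pred, code[c]))
print(best)                     # the closest codeword wins -> "C2"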
decision tree
Information entropy Entropy(S) (Shannon entropy)
meaning
It measures the purity of the information contained in S: the smaller the value, the fewer categories it contains and the purer it is; the larger the value, the more categories it contains and the less pure it is.
formula
Entropy(S) = -Σ_{i=1}^{n} p_i log2(p_i), where n is the total number of categories in S and p_i is the proportion of samples belonging to category i
Example
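A minimal sketch of this formula in code; the 9-positive/5-negative counts are the classic play-tennis example, used here only for illustration:

import math

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([9, 5]), 3))  # 9 positive, 5 negative -> about 0.940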
Information gain Gain(A, S)
Definition: the reduction in entropy (reduction in uncertainty) obtained by partitioning S according to attribute A
formula
Gain(A, S) = Entropy(S) - Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v), where S_v is the subset of S in which attribute A takes value v
Example
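A sketch of the gain formula on the same illustrative counts, splitting on a Wind attribute (Weak: 6 positive, 2 negative; Strong: 3 positive, 3 negative); this is a standard textbook example rather than data from this map:

import math

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

gain = entropy([9, 5]) - (8/14) * entropy([6, 2]) - (6/14) * entropy([3, 3])
print(round(gain, 3))  # about 0.048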
Coding interpretation: entropy can be read as the minimum average number of bits needed to encode the category of a sample in S
Information gain ratio Gain_ratio
Decision tree generation: the ID3 algorithm
1. First generate the root node, which represents the entire training set
2. If the samples at a node all belong to the same category, mark that node as a leaf node
3. Otherwise, use information gain to select the current optimal partitioning attribute, which becomes a child node of the root
4. Create a branch for each value of the partitioning attribute to partition the samples
5. Repeat steps 2-4 until no attributes remain that can be used to further divide the samples
6. At that point, create a leaf node labeled with the majority class of the samples on the current branch
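A compact sketch of steps 1-6 above; the attribute names and tiny dataset are illustrative:

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(samples, labels, attr):
    e = entropy(labels)
    for v in set(s[attr] for s in samples):
        idx = [i for i, s in enumerate(samples) if s[attr] == v]
        e -= len(idx) / len(samples) * entropy([labels[i] for i in idx])
    return e

def id3(samples, labels, attrs):
    if len(set(labels)) == 1:                 # step 2: pure node -> leaf
        return labels[0]
    if not attrs:                             # step 6: no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(samples, labels, a))  # step 3
    tree = {best: {}}
    for v in set(s[best] for s in samples):   # step 4: one branch per value
        idx = [i for i, s in enumerate(samples) if s[best] == v]
        tree[best][v] = id3([samples[i] for i in idx],
                            [labels[i] for i in idx],
                            [a for a in attrs if a != best])   # step 5: recurse
    return tree

samples = [{"wind": "weak"}, {"wind": "strong"}, {"wind": "weak"}]
labels = ["yes", "no", "yes"]
print(id3(samples, labels, ["wind"]))  # {'wind': {'weak': 'yes', 'strong': 'no'}}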
The sample space is derived from the actual situation, so in every sample each attribute has an exact value; there is no "*" value.
In the hypothesis space, by contrast, besides taking an exact value an attribute may also be unrestricted, marked "*", meaning that whatever value it takes has no effect on the predicted label (e.g., a hypothesis with color = "*" predicts the same label regardless of color).