MindMap Gallery CFA Level 2 Quantitative Methods Summary
A self-made mind map, suitable for JC online courses. All the knowledge points have been organized so that you can browse and check them while preparing for the exam, which will help you deepen your memory and review more efficiently. Points that need to be memorized are marked with symbols and fonts of different colors. I hope it helps you prepare for the exam.
Edited at 2021-05-31 16:37:12
Quantitative Methods 1
Regression
linear
Modeling
Calculation: b1 (OLS) = Cov(X,Y)/Var(X); b0 = Y mean - b1 * X mean
X: independent variable
Y: dependent variable
assumption
E(Error term)=0
analyze
1.ANOVA
ESS
Sum of squares of (Yi - predicted Yi)
the sum of squared errors or residuals
error
df=n-k-1
RSS
Sum of squares of (predicted Yi - Y mean)
the regression sum of squares
regression
df=k
TSS
Sum of squares of (Yi - Y mean)
total
df=n-1
MSS (mean sum of squares)
MSR
MSR = RSS/k
MSE
MSE = ESS/(n-k-1)
sample covariance = sum[(Xi - X mean)(Yi - Y mean)] / (n-1)
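A minimal numpy sketch of the ANOVA decomposition for a simple regression; the data and variable names are hypothetical, used only to show that ESS + RSS = TSS:

import numpy as np

# hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n, k = len(y), 1

# OLS estimates: b1 = sample covariance / sample variance of X
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ess = np.sum((y - y_hat) ** 2)         # error (residual) sum of squares, df = n-k-1
rss = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares, df = k
tss = np.sum((y - y.mean()) ** 2)      # total sum of squares, df = n-1

msr, mse = rss / k, ess / (n - k - 1)  # mean sums of squares
print(round(ess + rss, 6) == round(tss, 6))  # ESS + RSS = TSS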
2.R-Squared
Coefficient of determination
1. calculate
R2 = RSS/TSS = explained variation / total variation
2. explain
How much variation in y can the independent variable explain?
The larger the value, the better the fit
3. Relationship with the correlation coefficient (multiple R): in simple linear regression R2 = r^2, and multiple R = sqrt(R2)
3.SEE
Standard error of estimate (SEE), also called the standard error of the regression
The smaller the value, the better the fit
Essence: standard deviation of the error term
SEE has the same df as the standard error of b1 (the slope coefficient): n-k-1
Formula: SEE = sqrt(ESS/(n-k-1)) = sqrt(MSE)
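A short self-contained sketch that turns the ANOVA pieces into R2 and SEE; the sums of squares and sample size below are invented for illustration:

import math

ess, rss, n, k = 12.5, 87.5, 40, 1   # hypothetical ANOVA outputs
tss = ess + rss
r_squared = rss / tss                 # explained / total variation
see = math.sqrt(ess / (n - k - 1))    # standard error of estimate = sqrt(MSE)
multiple_r = math.sqrt(r_squared)     # equals |corr(X, Y)| in a simple regression
print(r_squared, see, multiple_r)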
test
1. Parameter estimation
point estimate
CI construction: b1 +/- t_c * s(b1)
2. Hypothesis testing
Significance test of b1
H0:b1=0
The df of b1 is n-k-1
If the hypothesis involves a greater-than or less-than sign (e.g. Ha claims b1 is greater than some value), the rejection region lies in the right tail and a one-tailed table is used
t-test
If the null is rejected, b1 is statistically significant
The t-values reported in the regression table in the question are for the null hypothesis that the coefficient equals zero
The smaller the p-value, the stronger the case for rejecting the null; it is compared with the significance level
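A hedged sketch of the slope significance test; the coefficient, its standard error, and the sample size are invented numbers:

from scipy.stats import t

b1, s_b1 = 0.76, 0.23                          # hypothetical estimate and standard error
n, k = 40, 1
t_stat = (b1 - 0) / s_b1                        # H0: b1 = 0
t_crit = t.ppf(1 - 0.05 / 2, df=n - k - 1)      # two-tailed, 5% significance
p_value = 2 * (1 - t.cdf(abs(t_stat), df=n - k - 1))
print(t_stat, t_crit, p_value, abs(t_stat) > t_crit)  # reject H0 if |t| > t_crit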
Predict Y
point estimate
Plug the X value into the fitted equation: predicted Y = b0 + b1 * X
CI
predicted Y +/- t_c * s_f
s_f = standard error of the forecast of Y
How to find s_f: as n tends to infinity, s_f is approximately equal to SEE
t_c follows a t distribution with n-k-1 df
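A small sketch of the prediction interval for Y at a new X; the coefficients, forecast standard error, and sample size are hypothetical (for large n, s_f could be approximated by SEE as noted above):

from scipy.stats import t

b0, b1 = 1.2, 0.76          # hypothetical fitted coefficients
x_new, s_f = 10.0, 0.31     # hypothetical new X and forecast standard error
n, k, alpha = 40, 1, 0.05

y_hat = b0 + b1 * x_new                      # point estimate
t_crit = t.ppf(1 - alpha / 2, df=n - k - 1)
ci = (y_hat - t_crit * s_f, y_hat + t_crit * s_f)
print(y_hat, ci)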
multiple
1. Differences from simple linear regression
Interpretation of b1: the effect on Y of a one-unit change in Xi, holding the other independent variables constant
Hypothesis testing
t-test
Simple regression tests 1 slope coefficient; multiple regression tests k
F-test
assumption
H0: b1 = b2 = ... = bk = 0
Ha: At least one bi is not equal to 0
F = (RSS/k) / (ESS/(n-k-1)) = MSR/MSE
The larger F is, the stronger the evidence against H0
draw distribution
Right-skewed; the rejection region is in the right tail (one-tailed test)
judge
When F falls in the rejection region, reject H0 in favor of Ha; the F-test assesses the significance of the model as a whole
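A minimal sketch of the overall F-test; the sums of squares and sample size are invented:

from scipy.stats import f

rss, ess, n, k = 120.0, 45.0, 50, 3              # hypothetical ANOVA values
msr, mse = rss / k, ess / (n - k - 1)
f_stat = msr / mse
f_crit = f.ppf(1 - 0.05, dfn=k, dfd=n - k - 1)   # one-tailed, right-hand rejection region
print(f_stat, f_crit, f_stat > f_crit)           # reject H0 if F exceeds the critical value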
R-squared
Differences between simple and multiple regression
Interpretation of R2
Interpretation of R2 and its relationship with the correlation coefficient
Overall fit
Shortcoming: R2 never decreases when independent variables are added
Adjusted-R2
Formula: adjusted R2 = 1 - [(ESS/(n-k-1)) / (TSS/(n-1))]
Analysis: as k increases, R2 rises (or stays the same) while adjusted R2 may fall
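A tiny sketch of the adjusted-R2 formula above; the values are invented:

ess, tss, n, k = 45.0, 165.0, 50, 3
r2 = 1 - ess / tss
adj_r2 = 1 - (ess / (n - k - 1)) / (tss / (n - 1))
print(r2, adj_r2)   # adding variables can raise R2 but lower adjusted R2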
2.Dummy variables
n categories,n-1 dummy variables
The meaning of coefficient bi
b0: represents the omitted (base) category, e.g. R4
bi: the difference between category i and the base category, e.g. R1 - R4
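A brief pandas sketch of n-1 dummy coding for quarterly data; the returns are made up, and the omitted quarter Q4 acts as the base category captured by b0:

import pandas as pd

# hypothetical quarterly observations
df = pd.DataFrame({"quarter": ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2"],
                   "ret": [0.02, 0.01, -0.01, 0.03, 0.02, 0.00]})
# 4 categories -> 3 dummies; Q4 is dropped and becomes the base category
X = pd.get_dummies(df["quarter"])[["Q1", "Q2", "Q3"]].astype(int)
print(X)
# In ret = b0 + b1*Q1 + b2*Q2 + b3*Q3, b0 is the mean return of Q4
# and each bi is the difference between quarter i's mean return and Q4's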
3. Violation of assumptions
heteroskedasticity
meaning
the variance of the error term is not constant
conditional heteroskedasticity (the type discussed and tested)
error variance is related to the independent variable
as a result of
not affect
consistency
coefficient estimates (b_i hat)
affect
standard errors of the coefficients (s of b_i hat)
t/F-test
test
BP test (one-tailed, right tail)
BP = n * R2, where R2 comes from regressing the squared residuals on the independent variables; chi-square distributed with k df
Correction
White-corrected (robust) standard errors
generalized least squares
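A hedged sketch of the Breusch-Pagan statistic BP = n * R2, where the R2 comes from regressing squared residuals on the independent variables; the data is simulated purely for illustration:

import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, k = 200, 1
x = rng.normal(size=(n, k))
resid = rng.normal(size=n) * (1 + 0.5 * np.abs(x[:, 0]))  # error variance tied to X

# regress squared residuals on X (with intercept) and take that regression's R2
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, resid ** 2, rcond=None)
u = resid ** 2
r2 = 1 - np.sum((u - X @ beta) ** 2) / np.sum((u - u.mean()) ** 2)

bp = n * r2
print(bp, chi2.ppf(0.95, df=k), bp > chi2.ppf(0.95, df=k))  # one-tailed chi-square test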
serial correlation
meaning
regression errors are correlated with one another
often found in time series data
positive serial correlation
the most common case
as a result of
not affect
consistency
coefficient estimates (b_i hat)
affect
standard errors of the coefficients (s of b_i hat)
t/F-test
test
DW test
calculate
DW is approximately 2(1 - r), where r is the sample autocorrelation of the residuals
State the hypotheses; calculate DW; sketch the distribution and critical values; decide
Correction
Hansen method
Autocorrelation & Conditional Heteroscedasticity
Newey-West method
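A small sketch of the Durbin-Watson statistic and the DW is-approximately 2(1 - r) relationship; the residual series is simulated with positive autocorrelation:

import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(size=100)
resid = np.empty_like(e)
resid[0] = e[0]
for t in range(1, len(e)):          # build positively autocorrelated residuals
    resid[t] = 0.6 * resid[t - 1] + e[t]

dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
r = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(dw, 2 * (1 - r))              # DW close to 2 means no serial correlation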
multicollinearity
meaning
Xi,Xj highly correlated
as a result of
not affect
consistency
affect
coefficient estimates (b_i hat)
standard errors of the coefficients (s of b_i hat)
t/F-test
test
pairwise correlations
correlation among Xi,Xj
classic symptom: both of the following occur together
individual t-tests fail (insignificant) while the F-test passes (significant) and R2 is high
Correction
exclude (drop) one or more of the correlated variables
4.model misspecification
causes the estimated regression coefficients to be inconsistent
7 situations
5.qualitative dependent variables
the dependent variable is a dummy variable
2 forms
probit and logit models (logistic)
maximum likelihood methods
discriminant models (discriminant)
Z-score
time-series analysis
1.trend models
linear trend: yt = b0 + b1*t + error
log-linear trend: ln(yt) = b0 + b1*t + error
2. autoregressive model (AR)
Condition 1
No autocorrelation
test
H0: correlation(error t, error t-k) = 0
t-test
standard error of the autocorrelation: s_r = 1/sqrt(n)
The larger the t-statistic, the more likely the null is rejected; under Ha the residuals are autocorrelated
Correction
Add the relevant lagged value as an additional independent variable
Condition 2
covariance-stationary
Conditions: constant mean, variance, and covariance
mean-reverting
Calculation: mean-reverting level = b0/(1 - b1)
predict
b1 is not equal to 1
random walk
b1 is equal to 1
is not covariance-stationary
mean-reverting test
DF-test
Check unit root (b1=1)
Essentially a t-test
H0: g = 0 (where g = b1 - 1), i.e. a unit root exists; Ha: g < 0
Correction
if has unit root
first-difference
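A brief numpy sketch of the mean-reverting level b0/(1 - b1) and of first-differencing a random walk; the coefficients and simulated series are hypothetical:

import numpy as np

b0, b1 = 1.0, 0.8                       # hypothetical AR(1) coefficients, |b1| < 1
mean_reverting_level = b0 / (1 - b1)    # = 5.0

rng = np.random.default_rng(2)
walk = np.cumsum(rng.normal(size=500))  # random walk: x_t = x_(t-1) + error, b1 = 1
diff = np.diff(walk)                    # first difference y_t = x_t - x_(t-1) is stationary
print(mean_reverting_level, walk.var(), diff.var())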
Condition 3
no conditional heteroskedasticity
conditional heteroskedasticity
The variance of the residual term varies over time (depends on previous residuals)
test
ARCH
Fit an AR model to the squared residuals: (error t)^2 = a0 + a1*(error t-1)^2 + u_t
For ARCH(1), if a1 = 0 there is no conditional heteroskedasticity; if a1 is significantly different from zero, there is
Correction
generalized least squares
3. Multiple sets of time series data
Modeling conditions
5 possible situations (depending on unit roots and cointegration)
regression can be used in 2 of them
cointegration test
DF-EG test (Engle-Granger Dickey-Fuller test)
Quantitative Methods 2
1. machine learning
1.overview
X: features; Y: target variable
type
3
data sets
training sample
validation sample
test sample
overfitting
Features
too much complexity
bias error is low, variance error is high
fitting curve
optimal level: minimize total error (bias + variance)
preventing overfitting
penalty
cross-validation
k-fold
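A minimal scikit-learn sketch of k-fold cross-validation; the synthetic dataset and the linear model are placeholders, not prescribed by the source:

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5)  # 5-fold cross-validation
print(scores, scores.mean())  # out-of-fold R2 per fold; the average gauges overfitting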
2.supervised ML
2.1 penalized regression
LASSO
The larger the penalty parameter lambda, the greater the penalty
2.2 SVM (classification)
concept
support vectors; discriminant boundary; margin
maximum margin
soft margin classification
1.add penalty
2.For non-linearity, increase features and increase complexity
used for classification, regression, and outlier detection
2.3 KNN (k-nearest neighbors; classification)
concern
how to define similarity (the distance measure)
value of k
2.4 CART
The target can be categorical (classification) or continuous (regression)
visual explanation
frame
initial root node; decision nodes; terminal nodes
goal
Classification
majority vote at the terminal node
Regression
average of the values at the terminal node
avoid overfitting
1. regularization; 2. pruning
2.5 ensemble
voting classifiers
Same training set data, different models
result:
majority vote
bootstrap aggregating/bagging
Same model, different training set data
resample
result
Classification
majority vote
Regression
average of the final values
application
random forest
drawback: black box
Each random draw (bootstrap sample) produces a tree
3.unsupervised ML
dimension reduction
PCA
process
Form composite variables; eigenvectors define the composites; eigenvalues give the variance explained; the first principal component explains the largest share of variance
shortcoming
black box
clustering
k means
shortcoming
results depend on the initial centroids
run the algorithm multiple times
choice of k
try a range of values for k
hierarchical level
1. agglomerative (bottom up); 2. divisive (top down)
4.neural networks
concept
nonlinear; complex
type of layer
input;hidden;output
hidden function
summation operator: combines the inputs (a weighted sum, with weights initially assigned at random) into a total net input
activation function
deep learning
At least 3 hidden layers, usually more than 20
reinforcement learning
maximize its rewards
2. big data
1. Basic concepts
Features
volume;variety;velocity;veracity
type
structured data
unstructured data
2.structured data
2.1 conceptualization of the modeling task
determining what the output (target) will be
2.2 data collection
internal source
external source
API "interface"
2.3 data preparation(cleansing) and wrangling(preprocessing)
cleansing 6 errors
incompleteness
invalidity
outside a meaningful range
inaccuracy
not a measure of the true value
inconsistency
data conflict
non-uniformity
not present in an identical format
duplication
preprocessing
transformations
extraction
aggregation
filtration
row
selection
column
conversion
eg: Currency unit conversion
scaling
normalization
(Xi-Xmin)/(Xmax-Xmin)
Advantages: any distribution; Disadvantages: sensitive to outliers
standardization
Formula: (Xi - X mean)/s. Advantages: less sensitive to outliers; Disadvantages: assumes a normal distribution
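A small numpy sketch of the two scaling methods; the feature values are made up and include an outlier to show the sensitivity difference:

import numpy as np

x = np.array([3.0, 7.0, 8.0, 12.0, 40.0])         # hypothetical feature with an outlier
normalized = (x - x.min()) / (x.max() - x.min())   # min-max: any distribution, outlier-sensitive
standardized = (x - x.mean()) / x.std(ddof=1)      # z-score: less sensitive to outliers
print(normalized)
print(standardized)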
handling outlier
trimming
removed (eg: truncated average score)
replaced
replaced with another value (e.g., winsorization)
2.4 data exploration
EDA exploratory data analysis
tools
summary statistics
visualizations
e.g., histogram, scatter plot
feature selection
reduces the number of features
feature engineering
creates new features
2.5 model training
selection method
performance selection
error analysis
Calculate precision, recall, accuracy, F1 score
ROC
The more the curve bows toward the top-left, the better; the larger the AUC, the better
RMSE
The smaller the better
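A short sketch of the error-analysis metrics computed from a confusion matrix; the true and predicted labels below are invented:

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(y_true)
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, accuracy, f1)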
tuning
Parameters are fit on the training set; hyperparameters are tuned to minimize total error (bias + variance)
method: grid search; ceiling analysis
3.unstructured data
3.1 text problem formulation
3.2 data collection(curation)
Same as the first two steps for structured data
3.3 data preparation and wrangling(preprocessing)
cleansing
remove HTML tags/punctuations/numbers/white space
do not remove everything
some numbers and punctuation carry meaning
preprocessing
tokenization (normalization)
lowercasing
convert all text to lowercase
remove stop words
stemming
lemmatization
BOW(bag of words)
N-grams
DTM (document term matrix)
cell value
goal:digitization
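A minimal sketch of a bag-of-words document-term matrix; the two sentences are placeholders, and each cell counts how often a token appears in a document:

from collections import Counter

docs = ["profits rose as sales rose", "sales fell and profits fell"]
tokens = [d.lower().split() for d in docs]                  # tokenization
vocab = sorted(set(w for doc in tokens for w in doc))       # bag of words
dtm = [[Counter(doc)[w] for w in vocab] for doc in tokens]  # document-term matrix
print(vocab)
print(dtm)  # one row per document, one column per token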
3.4 data exploration
EDA
word cloud
feature selection
reduce the size of the BOW
the higher a token scores under these methods, the more useful it is
methods: frequency; chi-square; mutual information
feature engineering
creates new features
technique
numbers; N-grams; named entity recognition (NER); parts of speech (POS)
3.5 model training
3. probabilistic approaches
1.simulation
steps
advantage
better input estimation
yield a distribution
issue
GIGO (garbage in, garbage out)
wrong inputs or wrong model
real data may not fit distribution
non-stationary
changing correlation across inputs
risk-adjusted value and simulation
Adjust for risk in either the numerator (cash flows) or the denominator (discount rate), and avoid double counting
2. Comparison of methods
full risk analysis
scenario analysis
Does not provide a full risk analysis (only selected scenarios are considered)
decision tree
All risks within the scope of management are considered
simulation
all circumstances considered
type of risk
scenario analysis
discrete results
decision tree
discrete results
simulation
Continuous results
correlation across risk
scenario analysis
Correlation can be considered, but only subjectively
decision tree
difficult to consider
simulation
Correlation can be explicitly considered in the model as a variable
the quality of information
scenario analysis
The situation considered is relatively simple and the requirements for information quality are low.
decision tree
The situation considered is relatively simple and the requirements for information quality are low.
simulation
The amount of data required is large and the quality of information is high.
complement or replacement for risk-adjusted value
scenario analysis
Only as a supplement
decision tree
Can be used as a supplement or as an alternative
simulation
Can be used as a supplement or as an alternative