MindMap Gallery Chartered Financial Analyst (CFA) Level 2 Quantitative Methods knowledge framework
A mind map for the Chartered Financial Analyst (CFA) Level 2 Quantitative Methods topic, fully covering the syllabus and the key details of the test points.
Edited at 2022-03-10 12:05:13
Quantitative Methods
Quantitative Method (1)
Linear regression
Assumptions
Linearity: the relationship between Y and X is linear
Homoskedasticity; if violated: heteroskedasticity problem
Independence; if violated: serial correlation problem
Normality
Point estimate
b0, b1 formula
Calculator calculates b0, b1
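As a quick illustration of the b0, b1 formulas above, a minimal Python sketch (function name and sample data are hypothetical) computing the slope as Cov(X, Y)/Var(X) and the intercept as mean(Y) − b1·mean(X):

def ols_coefficients(x, y):
    # slope b1 = Cov(X, Y) / Var(X); intercept b0 = mean(Y) - b1 * mean(X)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    b1 = cov_xy / var_x
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = ols_coefficients([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])  # hypothetical data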
Confidence interval estimate
confidence interval calculation
The standard error of b1 can be backed out from the t-statistic reported in the ANOVA table
Test of regression coefficient
significance test of the regression coefficient
t-statistics
p-value
confidence interval method
hypothesis test of a regression coefficient: t-statistic
significance test for correlation
F-test: Ho: b1=b2=…=bk=0
ANOVA Table
SS, df, MSS
SEE
Coefficient of determination R2, Multiple R
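To make the ANOVA-table relationships concrete, a minimal Python sketch (names are hypothetical) computing SST, SSR, SSE, R2, SEE, and the F-statistic from fitted values, assuming k independent variables:

import math

def anova_table(y, y_hat, k=1):
    # SST = SSR + SSE; R^2 = SSR/SST; SEE = sqrt(SSE/(n-k-1)); F = MSR/MSE
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual sum of squares
    ssr = sst - sse                                        # regression sum of squares
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {"R2": ssr / sst, "SEE": math.sqrt(mse), "F": msr / mse}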
Estimate of Y
point estimate
confidence interval estimate
Different functional form: log
Limitations of regression
relation can change over time
public knowledge of regression relations erodes their usefulness
results are only valid if the regression assumptions hold
Multiple regression
Differences between multiple and simple linear regression
Partial regression coefficient: holding other constant
no exact linear relation among the X variables; if violated: multicollinearity
Hypothesis test
Single parameter bi: t-test with different degrees of freedom
F-test: k is different
F-test vs t-test
Simple regression: t^2 = F
Multiple regression: the F-test accounts for correlation among the X variables, which affects the t and F results
Adjusted R-square
formula: adjusted R2 = 1 − [(n − 1)/(n − k − 1)]·(1 − R2)
R2>adjusted R2
adjusted R2 may be < 0
Dummy variable: X
n categories require n − 1 dummy variables
interpretation of the intercept and estimated coefficients
interpretation of the t-test conclusions
Assumption violations
Heteroskedasticity
unconditional vs conditional
Effect
does not affect
Point estimation result b
Consistency of parameter estimates
Influence
Interval estimate t·Sb
test
t-stat.--Sb
MSE is too small, P (type I error) increases
MSE is too large, P (type II error) increases
F stat.--MSE
MSE is too small, P (type I error) increases
MSE is too large, P (type II error) increases
Detecting
scatter plot
BP test
Ho: no heteroskedasticity
Chi-square test: BP=n·R(residual)^2
Chi-square distribution, df=k, one-tailed
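A minimal sketch of the BP statistic under these definitions (function name hypothetical; r2_resid is the R2 from regressing the squared residuals on the X variables):

from scipy.stats import chi2  # assumes SciPy is available

def bp_test(n, r2_resid, k, alpha=0.05):
    bp = n * r2_resid                     # BP = n * R^2 of the squared-residual regression
    critical = chi2.ppf(1 - alpha, df=k)  # one-tailed chi-square critical value, df = k
    return bp, bp > critical              # True -> reject Ho of no heteroskedasticity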
Correcting
robust standard errors (White-corrected standard errors)
generalized least squares
Serial correlation(autocorrelation)
Positive vs negative SC
Effect
does not affect
Point estimation result b
Consistency of parameter estimates
Influence
Interval estimate t·Sb
t/F test results
Positive SC, MSE is small, P(type I error) increases
Negative SC, MSE is too large, P (type II error) increases
Detecting
scatter plot
DW test
Ho: no SC/ no positive SC
DW=2·(1-r)
a, k, n, decision rule
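A minimal sketch of the DW statistic from a residual series (function name hypothetical), consistent with DW ≈ 2·(1 − r):

def durbin_watson(e):
    # DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); near 2 -> no serial correlation
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den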
Correcting
Hansen method (robust standard errors): note the difference from the White method
generalized least squares
Multicollinearity
matter of degree rather than absence or presence
Effect
does not affect
Consistency of parameter estimates
Influence
Point estimation result b
Interval estimate t·Sb
t/F test results
Multicollinearity, MSE is too large, P (type II error) increases
Detecting
Classic symptoms: insignificant t-statistics, significant F-test, high R-square
Occasionally suggested method: r>0.7
Correcting: exclude one or more regression variables
Model misspecifications: lead to inconsistent estimates
include
incorrect set of variables
incorrect functional form of the regression equation
principle of model specifications
economic reasoning
nature of variable
parsimonious
examined for violations
useful out of sample
Classification
Misspecified functional form
omitted variable
inappropriate variable scaling
inappropriate data pooling
Time-series misspecifications
lagged dependent variable as independent variable with serially correlated errors
function of a dependent variable as an independent variable
independent variables are measured with errors
Other types of time-series misspecifications(nonstationary)
relations among time series with trends
relations among time series that may be random walk
Qualitative dependent variable
probit and logit model
Discrimination model: Z-score
Time series analysis
Trend model
linear trend model
Yt changes by a constant amount each period (equal differences): b1
scatter plot approximates a straight line
log-linear trend model
Yt changes at a constant ratio; Y grows at a constant rate: e^b1 − 1
scatter plot trends exponentially
limitation
time series data usually exhibit serial correlation, so a trend model is often not appropriate
Autoregressive model(AR)
multiperiod forecast: chain rule
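A minimal sketch of the chain rule for an AR(1) multiperiod forecast (names and values hypothetical): each forecast feeds back into the model as the next period's input.

def ar1_forecast(b0, b1, x_t, periods):
    # x_{t+1} = b0 + b1 * x_t, then x_{t+2} = b0 + b1 * x_{t+1}, ...
    x = x_t
    for _ in range(periods):
        x = b0 + b1 * x
    return x

# e.g. b0 = 1.0, b1 = 0.5, x_t = 2.0 -> x_{t+2} = 1 + 0.5 * (1 + 0.5 * 2) = 2.0
print(ar1_forecast(1.0, 0.5, 2.0, periods=2))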
Assumptions of AR
covariance stationary
Strong stationarity vs weak stationarity
3 conditions for covariance stationary
expected value constant and finite over time
variance constant and finite over time
covariance with leading or lagged values constant and finite over time
nature
a stationary past doesn't guarantee stationarity in the future
a covariance stationary time series has a finite mean-reverting level: Xt = b0/(1 − b1)
If violated: unit root (b1 = 1), random walk
Random walk
Random walk with drift
features
no mean reverting level
infinite variance
Detecting
Unit root test of nonstationarity: an ordinary t-test of Ho: b1 = 1 is not valid; use the Dickey-Fuller test
Dickey-Fuller test
Xt − Xt−1 = b0 + (b1 − 1)·Xt−1 + εt
g = b1 − 1; Ho: g = 0, Ha: g < 0
use the revised (Dickey-Fuller) t-table critical values
Correcting: first-difference
errors are uncorrelated; if violated: autocorrelation
Effect
does not affect
Point estimation result b
Consistency of parameter estimates
Influence
Interval estimate t·Sb: MSE affects Sb
t/F test results
Positive SC, MSE is small, P(type I error) increases
Negative SC, MSE is too large, P (type II error) increases
Detecting
the DW test is not applicable to AR models; instead run a significance test of the residual autocorrelations (Ho: correlation = 0)
t-test: Sr = 1/√T, where T (number of observations) = sample size − p; df = T − k − 1; if Ho is rejected, r ≠ 0 and autocorrelation exists
Correcting
add a seasonal lag
Homoskedasticity; if violated: ARCH (autoregressive conditional heteroskedasticity)
Effect
does not affect
Point estimation result b
Consistency of parameter estimates
Influence
Interval estimate t·Sb
test
t-stat.--Sb
MSE is too small, P (type I error) increases
MSE is too large, P (type II error) increases
F stat.--MSE
MSE is too small, P (type I error) increases
MSE is too large, P (type II error) increases
Detecting
ARCH(1)
significance test for a1
t distribution
Correcting: GLS
More than one time series
Cointegration
DF-EG test: if Ho is rejected, the series are cointegrated and multiple regression can be used
Comparing Model Performance
Quantitative
in-sample forecast errors
out-of-sample forecast errors: RMSE
Qualitative
instability of regression coefficient
data from earlier and later
shorter and longer periods of data
Quantitative Method (2)
Machine learning
machine learning vs. statistical approaches
Traditional statistics requires an assumed distribution
Data size
linear/nonlinear
Data complexity (dimension)
Terminology for X and Y (features/target vs. independent/dependent variables)
hyperparameters
Types
Supervised learning
labeled training data
Classification
regression model: continuous target variable
classification model
binary classification
multicategory classification
Unsupervised learning
Unlabeled data
Classification
Dimension reduction
clustering
Deep learning & reinforcement learning
Applicable to supervised & unsupervised learning
based on neural network
deep learning: used for complex tasks
reinforcement: learn from its own prediction errors
overfitting
issue with Supervised Machine Learning
three nonoverlapping data sets
training sample
validation sample--tuning
test sample--evaluate
three errors
bias error: in-sample error; the model does not fit the training data well (underfitting), high in-sample error
variance error: out-of-sample error; the model fits the training data too well (overfitting), high out-of-sample error
base error: residual error due to randomness in the data, not preventable
fitting curve: optimal complexity of model
addressing method
complexity reduction: overfitting penalty
cross validation
k-fold cross validation
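A minimal sketch of k-fold splitting (function name hypothetical): the data is divided into k folds, and each fold serves once as the validation sample while the rest train the model.

def k_fold_indices(n, k):
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n
        val = list(range(start, stop))
        train = [j for j in range(n) if j < start or j >= stop]
        yield train, val

for train_idx, val_idx in k_fold_indices(10, 5):
    pass  # fit on train_idx, evaluate on val_idx, then average the k scores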
Supervised Learning Algorithms
Penalized regression-regression/continuous
Penalty term: LASSO vs. OLS (linear)
Regularization: can also be applied to non-linear models
Support vector machine(SVM)-classification/distinct
Mechanism: linear, dichotomy, hyperplane, maximum margin, support vector, discriminant boundary
Classification
hard margin: linear classifier
soft margin: not perfectly linear, trade-off between wider margin and classification error
Applicable: small-to-medium size & complex high-dimensional data
K-nearest neighbor(KNN)-classification/distinct
Mechanism: linear; classify a new observation by finding the most similar existing observations, with majority vote among the k neighbors deciding
two concerns
hyperparameter k
k too small, high error rate
k too large, dilute the result by averaging
if k is even, there may be no clear winner
hard to clearly define "similar"
Applicable: binary or multicategory classification
Classification and regression tree(CART)-regression&classification
mechanism
linear & non-linear
classification tree: categorical target variable
regression tree: continuous target variable
no black box
decision tree
features, branches, cut off value
initial root node: widest separation, minimize classification error
decision node: lower within-group error
terminal node: reached when another split no longer reduces classification error much; prediction is the majority of data points (classification) or the mean of labeled values (regression)
Advantages and Disadvantages
Advantages: provide visual explanation
Disadvantages: prone to overfitting; ways to avoid:
regularization
prune sections with low explanatory power
Ensemble and random forest-combination algorithm
ensemble learning
aggregation of heterogeneous learners
aggregation of homogeneous learners: different training data via bootstrap aggregating (bagging), i.e. repeated sampling
random forest
a variant of classification trees; data bagged from the same data set
subset of features used in creating each tree---mitigate overfitting
determine final classification: wisdom of crowd
advantage
protect against overfitting
reduce the noise-to-signal ratio: errors cancel out across different trees
Disadvantages: black box
Unsupervised Learning Algorithms
Dimension reduction: principal components analysis
composite variables, eigenvectors, eigenvalues (RSS/TSS); avoids multicollinearity
Advantages and Disadvantages
Advantages: fewer features, avoid overfitting
Disadvantages: eigenvectors are combinations of the original features and not well-defined concepts, so PCA can be perceived as a black box
Clustering
k-means clustering
Mechanism: hyperparameter k, k non-overlapping clusters, centroid
Applicable to
very large data sets
high dimensional data
shortcoming
the choice of hyperparameter k affects outcomes
Solution: using a range of values for k to find optimal number of clusters
hierarchical clustering
no predefined number of clusters
agglomerative (bottom-up) clustering
divisive (top-down) clustering
Neural Networks
mechanism
artificial neural networks(ANN)
high dimensional data/ linear&nonlinear data
three types of layers
input layers: features
hidden layers: transmit and transform data between input and output
output layer: one prediction result
hyperparameters: e.g. a 4-5-1 network (4 input nodes, 5 hidden nodes, 1 output node)
each node
summation operator---total net input
activation operator
transform total net input into final output of the node
like a light dimmer switch: decreases or increases the strength of the input
non-linear & linear
neuron modeling
input, synaptic weights, bias term, total net input, summation operator, active function, output
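A minimal sketch of one node's forward calculation under the mechanism above (function name and values hypothetical): a summation operator producing the total net input, followed by a sigmoid activation operator.

import math

def neuron(inputs, weights, bias):
    # summation operator: weighted inputs plus bias -> total net input
    total_net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    # activation operator (sigmoid): squashes the net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-total_net_input))

output = neuron([0.5, -1.2], [0.8, 0.3], bias=0.1)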
forward propagation: forward calculation
correction and tuning
backward propagation: backward calculation, adjust synaptic weights
revision of hyperparameters based on out of sample performance
Applications
deep neural networks (DNNs)
more than 20 hidden layers
useful in general for image, pattern and speech recognition
reinforcement learning: learns based on immediate feedback from (millions of) trials and errors-AlphaGo
Choice of ML algorithms
if data is complex (too many features)
yes
dimension reduction
no
if classification
yes
if supervised
yes
linear: KNN, SVM
nonlinear: CART, random forest, neural networks
no
linear: k-means clustering or hierarchical clustering
nonlinear: neural networks
no
linear: penalized regression
nonlinear: CART, random forest, neural networks
Big data projects
introduction
characteristics: volume, variety, velocity, veracity (validity), value
data analysis steps: conceptualize model task, data collection, data preparation and wrangling, data exploration, model training
structured data
1. conceptualize task/blueprint/modifiable plan
2. data collection
external data
access through API (application programming interface)
vendor: csv or other formats
internal data
3. data preparation and wrangling
data preparation(cleansing)
incompleteness error
invalidity error
inaccuracy error
inconsistency error
non-uniformity error
duplication error
outliers
trimming(truncation)
winsorization: replace outliers with the maximum or minimum allowed value
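A minimal sketch contrasting the two outlier treatments (function names and bounds hypothetical):

def trim(data, lo, hi):
    # trimming (truncation): drop observations outside the bounds
    return [x for x in data if lo <= x <= hi]

def winsorize(data, lo, hi):
    # winsorization: replace outliers with the boundary values
    return [min(max(x, lo), hi) for x in data]

data = [1, 2, 3, 100]          # 100 is an outlier
print(trim(data, 1, 10))       # [1, 2, 3]
print(winsorize(data, 1, 10))  # [1, 2, 3, 10]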
data wrangling(preprocessing)
transformation
extraction: e.g. derive age from a birth date
aggregation: e.g. salary + other revenue = total income
filtration: remove data rows that are not needed
selection: remove columns that are not needed, e.g. keep only one of name and ID
conversion: e.g. CAD to USD
scaling
normalization
formula: (Xi − Xmin)/(Xmax − Xmin)
pros: can be used when the data distribution is unknown
cons: sensitive to outliers
standardization
formula: (Xi − μ)/σ
pros: less sensitive to outliers, as it depends on the mean and standard deviation
cons: requires normally distributed data
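A minimal sketch of both scaling methods as defined above (function names hypothetical):

import statistics

def normalize(xs):
    # (Xi - Xmin) / (Xmax - Xmin): rescales to [0, 1]
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def standardize(xs):
    # (Xi - mean) / stdev: centers at 0 with unit variance
    mu, sigma = statistics.mean(xs), statistics.stdev(xs)
    return [(x - mu) / sigma for x in xs]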
4. data exploration
exploratory data analysis(EDA)
summary statistics
visualization
feature selection
feature engineering
5. model training
selection method
performance evaluation
error analysis
confusion matrix
precision, recall, accuracy, F1 score
receiver operating characteristics (ROC)
shape of ROC curve
more convex curve: better model
area under the curve (AUC): 0.5 means random guessing
root mean square error (RMSE): useful for regression models
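A minimal sketch of the confusion-matrix metrics and RMSE named above (function names hypothetical):

import math

def classification_metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp)                  # of predicted positives, share correct
    recall = tp / (tp + fn)                     # of actual positives, share found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, accuracy, f1

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))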
model tuning
minimize total aggregate error
parameters&hyperparameters
altering the hyperparameters
each hyperparameter---confusion matrix
multiple hyperparameters
grid search: different combinations of hyperparameters
ceiling analysis: identifies which part of the pipeline can potentially improve performance the most
unstructured data
3. text preparation and wrangling
text preparation(cleansing)
remove HTML tags
remove punctuation: some punctuation needs to be replaced with annotations
remove numbers
remove white spaces
text wrangling(preprocessing)
normalization
lowercasing
removal of stop words
stemming
lemmatization
bag-of-words(BOW) procedure: N-grams
document term matrix(DTM)
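A minimal sketch of a BOW vocabulary and document term matrix (function name and documents hypothetical); rows are documents, columns are token counts:

def document_term_matrix(documents):
    tokens = sorted({w for doc in documents for w in doc.split()})
    dtm = [[doc.split().count(t) for t in tokens] for doc in documents]
    return tokens, dtm

docs = ["rates rise", "rates fall", "markets rise"]
vocab, dtm = document_term_matrix(docs)  # vocab: ['fall', 'markets', 'rates', 'rise']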
4. text exploration
EDA
text statistics: term frequency, co-occurrence
visualization
feature selection
reduction in BOW size
methods
document frequency(DF)
Chi-square
mutual information: MI close to 1 means the token is more identifiable with a class
feature engineering
number
n-gram
named entity recognition (NER)
parts of speech(POS)