MindMap Gallery CFA Quantitative Analysis (1)
Quantitative analysis (1) covers linear regression, multiple linear regression, and time series. To be updated later.
Edited at 2019-12-30 14:46:25
Quantitative Methods
Traditional methods (Study Session 2)
Regression
Linear regression
Model
Model components
variables
cross-sectional data
time-series data
regression coefficients
Solve for b1 first, then use the sample means of X and Y to solve for b0: b0 = mean(Y) − b1 × mean(X)
estimated regression coefficient
linear least squares (least squares method)
Minimize the sum of squared errors: Σ (dependent variable − predicted value of dependent variable)²
estimated parameters or fitted parameters:
Note that we never observe the population parameter values b0 and b1 in a regression model.
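The least-squares formulas above can be sketched in a few lines of Python (an illustrative sketch; the data points are hypothetical):

```python
# Minimal OLS sketch for one independent variable.
# b1 = Cov(X, Y) / Var(X); b0 = mean(Y) - b1 * mean(X)

def ols_fit(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    b1 = cov / var
    b0 = my - b1 * mx
    return b0, b1

# Example: points lying exactly on the line y = 2x + 1
b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # -> 1.0 2.0
```

The returned b0 and b1 are the estimated (fitted) parameters, not the unobservable population values.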
error term ε (its sample counterpart is the residual)
Excel model
Assumptions (6 items)
1. The relation between Y and X is linear in the parameters b0 and b1 (X itself may be transformed, e.g. raised to a power).
If the model is nonlinear in the parameters, linear regression cannot be used.
If the model is nonlinear in X but linear in the parameters, linear regression analysis can still be used.
2. X is not random.
3. The error term ε is a random variable with an expected value of zero: E(ε) = 0.
This ensures that the estimates of b0 and b1 are unbiased.
4. The variance of the error term is the same for all observations (homoskedasticity assumption).
5. The error terms ε are independent (uncorrelated) across observations.
6. The error term ε is normally distributed.
This makes hypothesis tests on the estimated parameters valid.
SEE(Standard Error of Estimate)
Measures how accurately the regression model describes the relationship between the variables; it is the standard deviation of the residuals.
The coefficient of determination (R²)
Univariate
Square of correlation coefficient, r^2
Multivariate
explained variation = RSS (SSR): sum of squares of the regression
unexplained variation = SSE: sum of squared errors
Total variation (SST) = Explained variation + Unexplained variation
The larger the coefficient of determination, the better the fit.
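The variance decomposition above translates directly into code (an illustrative sketch; the fitted values here are hypothetical):

```python
# R^2 = explained variation / total variation = RSS / SST = 1 - SSE / SST

def r_squared(ys, y_hats):
    my = sum(ys) / len(ys)
    sst = sum((y - my) ** 2 for y in ys)                    # total variation
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # unexplained variation
    rss = sst - sse                                         # explained variation
    return rss / sst

ys = [3, 5, 7, 10]
y_hats = [3.1, 4.8, 7.2, 9.9]   # hypothetical fitted values
print(r_squared(ys, y_hats))    # close to 1: the fit explains almost all variation
```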
Hypothesis testing
H0: ρ = 0, H1: ρ ≠ 0 (H0: null hypothesis, H1: alternative hypothesis)
confidence interval
If the hypothesized value falls outside the confidence interval, reject H0.
t-test
reject H0
The larger the absolute t-value, the stronger the evidence against H0.
we can reject the hypothesis that the true parameter is equal to 0 at the 0.5 percent significance level (99.5 percent confidence).
p-value
The p-value is the smallest level of significance at which the null hypothesis can be rejected.
The smaller the p-value, the stronger the evidence against H0; the usual reference value is 0.05.
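The slope t-test works mechanically like this (a hedged sketch; the estimate, standard error, and the approximate large-sample critical value of 2.0 are all assumptions for illustration):

```python
# t = (estimated coefficient - hypothesized value) / standard error of the estimate

def t_stat(b1_hat, se_b1, b1_null=0.0):
    return (b1_hat - b1_null) / se_b1

t = t_stat(0.64, 0.26)     # hypothetical estimate and standard error
reject = abs(t) > 2.0      # approximate two-tailed 5% critical value for large samples
print(t, reject)           # |t| > 2, so H0: b1 = 0 is rejected at roughly the 5% level
```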
The significance level is the probability of rejecting the null hypothesis when it is in fact true, denoted α. -- Baidu Encyclopedia
error type
Analysts often choose the 0.05 level of significance, which indicates a 5 percent chance of rejecting the null hypothesis when, in fact, it is true (a Type I error)
Analysis of variance (ANOVA)
Analysis of variance (ANOVA) is a statistical procedure for dividing the total variability of a variable into components that can be attributed to different sources.
F-test
The F-statistic tests whether all the slope coefficients in a linear regression are equal to 0.
H0: b1 = 0, Ha: b1 ≠ 0
The larger the F-statistic, the stronger the evidence that at least one slope coefficient differs from zero.
Prediction Intervals
two sources of uncertainty
the error term itself contains uncertainty.
estimated parameters
limitations
parameter instability
regression relations can change over time, just as correlations can.
public knowledge
public knowledge of regression relationships may negate their future usefulness.
assumptions are violated
hypothesis tests and predictions based on linear regression will not be valid
Multiple Linear Regression
Introduction
t-test
ANOVA
two types of uncertainty:
SEE (standard error of estimate): uncertainty in the regression model itself
b0, b1: uncertainty about the estimates of the regression coefficients.
As the number of independent variables Xi increases, R² will increase even if the new variables add little explanatory power, so R² becomes less reliable. In that case, compare models using adjusted R², which penalizes extra variables.
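The penalty can be seen directly from the adjusted R² formula (an illustrative sketch; the R², n, and k values are hypothetical):

```python
# adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
# n = number of observations, k = number of independent variables

def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Adding variables raises R^2 slightly, yet adjusted R^2 can fall:
print(adjusted_r2(0.80, 30, 3))   # ~0.7769
print(adjusted_r2(0.81, 30, 6))   # ~0.7604 -- worse despite the higher R^2
```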
Dummy Variables
Dummy variables in a regression model can help analysts determine whether a particular qualitative independent variable explains the model's dependent variable.
value(0,1)
To represent n categories, n − 1 dummy variables are required.
The intercept represents the mean value of Y for the omitted category; each slope represents the incremental effect of its category on Y relative to the omitted category.
Similar to linear regression on one variable
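The n − 1 encoding can be sketched as follows (illustrative; the quarterly categories and the choice of "Q4" as the omitted baseline are assumptions):

```python
# Encode a qualitative variable with n categories as n-1 dummy columns,
# omitting one baseline category so the design matrix is not collinear.

def make_dummies(values, categories, omit):
    kept = [c for c in categories if c != omit]
    return [[1 if v == c else 0 for c in kept] for v in values]

quarters = ["Q1", "Q2", "Q4", "Q3"]
print(make_dummies(quarters, ["Q1", "Q2", "Q3", "Q4"], omit="Q4"))
# -> [[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 1]]
# The Q4 observation is the all-zeros row: it is captured by the intercept.
```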
Assumptions and Violations
Assumptions
A linear relation exists between the Xj and Y.
Xj are not random; no exact linear relation exists between any Xj and Xk
homoskedasticity
ε is uncorrelated across observations.
ε is normally distributed
Violations
heteroskedasticity
The variance of the errors differs across observations.
unconditional heteroskedasticity: not correlated with the independent variables; causes no serious problems
conditional heteroskedasticity: correlated with the level of the independent variables; this is the problematic case
Breusch–Pagan test
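A hedged sketch of the Breusch–Pagan idea: regress the squared residuals on the independent variable and form the statistic n × R², which is compared against a chi-square critical value (the data below are hypothetical; for real work use a statistics library):

```python
# Breusch-Pagan sketch for one regressor:
# 1. regress squared residuals on X
# 2. test statistic = n * R^2 of that auxiliary regression

def bp_statistic(xs, sq_resid):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(sq_resid) / n
    b1 = (sum((x - mx) * (e - my) for x, e in zip(xs, sq_resid))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((e - my) ** 2 for e in sq_resid)
    sse = sum((e - f) ** 2 for e, f in zip(sq_resid, fitted))
    r2 = 1 - sse / sst
    return n * r2

xs = [1, 2, 3, 4, 5]
sq_resid = [0.2, 0.5, 0.9, 1.6, 2.4]   # hypothetical squared residuals growing with X
print(bp_statistic(xs, sq_resid))      # large value suggests conditional heteroskedasticity
```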
serial correlation (autocorrelated)
Regression errors are correlated across observations.
Positive serial correlation
The estimated standard errors of the coefficients will be too small
t-statistics:inflates
F-statistic:inflates
Durbin–Watson statistic (DW)
DW ≈ 2 × (1 − r), where r is the correlation between consecutive residuals
The value of DW is between 0-4
Reference value: DW = 2 indicates no serial correlation.
If DW deviates far from 2, there is a serial correlation problem.
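The statistic itself is simple to compute from a list of residuals (illustrative sketch; the alternating residuals below are hypothetical):

```python
# Durbin-Watson statistic:
# DW = sum over t of (e_t - e_{t-1})^2 / sum over t of e_t^2
# DW near 2 -> no serial correlation; near 0 -> positive; near 4 -> negative.

def durbin_watson(resid):
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Residuals that flip sign every period: strong negative serial correlation
print(durbin_watson([1, -1, 1, -1, 1, -1]))  # -> 3.33..., far above 2
```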
multicollinearity
Two or more independent variables (or combinations of them) are highly correlated with each other.
though not perfectly correlated
t-statistics: not significant (small t-values)
F-statistic: significant (large F-value)
The classic symptom is this contradiction: a significant F-test together with insignificant t-tests.
The coefficient of determination R² will be high.
The estimated standard errors of the individual slope coefficients are inflated.
Model Specification and Misspecification
Model Specification
cogent economic reasoning
The model should be grounded in cogent economic reasoning
functional form (e.g., LN, logarithmic transformation)
The functional form chosen for the variables in the regression should be appropriate given the nature of the variables. (LN, logarithmic)
parsimonious (simple)
The model should be parsimonious.
A small set of X variables should do a good job of explaining Y.
assumptions violations
The model should be examined for violations of the regression assumptions before being accepted.
useful out of sample
The model should be tested and be found useful out of sample before being accepted.
Misspecification
functional form
variables could be omitted
variables may need to be transformed
pools data from different samples
X correlated with the error term
estimated regression coefficients to be biased and inconsistent
time-series misspecification
including lagged dependent variables as independent variables
including a function of dependent variable as an independent variable
independent variables that are measured with error
qualitative dependent variables
Probit models
based on the normal distribution
logit models
based on the logistic distribution
discriminant analysis
Time Series
Trend Models
linear trend
The series grows by a constant amount each period
log-linear trend
exhibits exponential growth
The series grows at a constant rate each period
predicted trend value of yt
growth rate
Linear trend regression often leaves errors that are correlated with the observations; taking logs mitigates the problem but does not fully solve it.
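The log-linear trend can be sketched as follows (illustrative; the b0 and b1 values are hypothetical):

```python
# Log-linear trend: ln(y_t) = b0 + b1 * t, so y_t = exp(b0 + b1 * t).
# The implied per-period growth rate is exp(b1) - 1.

import math

def loglinear_forecast(b0, b1, t):
    return math.exp(b0 + b1 * t)

b0, b1 = 4.0, 0.05                  # hypothetical fitted trend parameters
growth_rate = math.exp(b1) - 1
print(round(growth_rate, 4))        # ~5.13% growth per period
print(loglinear_forecast(b0, b1, 10))
```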
Testing for Correlated Errors
DW-test
H0: there is no serial correlation
The premise of using a trend model is that the series is covariance stationary; if it is not, the model will be invalid.
Autoregressive (AR) Time-Series Models
We must assume that the time series we are modeling is covariance stationary
Covariance-Stationary Series
the expected value of the time series must be constant and finite in all periods
the covariance of the time series with itself for a fixed number of periods in the past or future must be constant and finite in all periods
the variance of the time series must be constant and finite in all periods
How to check whether a series is covariance stationary? Plot the data and check whether the mean and variance appear constant over time.
The autocorrelations of the residuals should be statistically indistinguishable from 0 at all lags.
A random walk
x_t = x_{t−1} + ε_t: the previous period's value plus an unpredictable random error
not covariance stationary
If the time series is a random walk, it is not covariance stationary
Random walk with drift
A random walk with drift is a random walk with a nonzero intercept term.
Has a unit root
All random walks have unit roots.
If a time series has a unit root, it cannot be covariance stationary
Treatment of unit roots
first-difference the time series (y_t = x_t − x_{t−1}), then estimate an autoregressive model on the differenced series.
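First-differencing is a one-liner (illustrative sketch; the price levels below are hypothetical):

```python
# First-differencing a random-walk series yields a covariance-stationary series:
# y_t = x_t - x_{t-1}

def first_difference(series):
    return [series[t] - series[t - 1] for t in range(1, len(series))]

prices = [100, 102, 101, 105, 104]   # hypothetical random-walk levels
print(first_difference(prices))      # -> [2, -1, 4, -1]
```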
Moving-Average Time-Series Models
moving average
It lags behind the actual data and smooths it (e.g., smoothing out seasonal fluctuations).
Because of this lag, it has little forecasting value.
MA(1) model
MA(q):A qth order moving-average model
Its first q autocorrelations are nonzero, while autocorrelations beyond the first q are zero.
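The sample autocorrelation used to identify the MA order can be sketched like this (illustrative; the alternating series is hypothetical):

```python
# Sample autocorrelation at a given lag. For an MA(q) series, the first q
# autocorrelations are nonzero and those beyond lag q are near zero.

def autocorr(series, lag):
    n = len(series)
    m = sum(series) / n
    den = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, n))
    return num / den

data = [1, 2, 1, 2, 1, 2, 1, 2]
print(autocorr(data, 1))  # strongly negative: the values alternate every period
```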
ARMA models
autoregressive moving average models
the parameters in ARMA models can be very unstable;
determining the AR and MA order of the model can be difficult;
ARMA models may not forecast well
ARCH
Autoregressive conditional heteroskedasticity model
If the coefficient on the squared residual is statistically significant, the time-series model has ARCH(1) errors
if a time-series model has ARCH(1) errors
Multivariate time series problem
If neither time series has a unit root, we can safely use linear regression.
If only one of the two time series has a unit root, we should not use linear regression.
If both time series have unit roots and the series are cointegrated, we may safely use linear regression.
If both time series have unit roots but the series are not cointegrated, we should not use linear regression.
(Engle–Granger) Dickey–Fuller test for cointegration
The (Engle–Granger) Dickey–Fuller test can be used to determine if time series are cointegrated
Some issues with time series
A covariance-stationary series is mean reverting.
Compare the accuracy of different regression models
The root mean squared error (RMSE): the square root of the mean of the squared forecast errors.
The smaller the RMSE, the better the forecasting model.
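The RMSE definition above, as code (illustrative sketch with hypothetical actual and forecast values):

```python
# RMSE = sqrt(mean of squared forecast errors); used to compare the
# out-of-sample accuracy of different models -- smaller is better.

import math

def rmse(actual, forecast):
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # sqrt(4/3) ~ 1.1547
```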
Time-series model parameters can be unstable; before relying on a time-series model for estimation, check whether the series is stationary.
The steps of time series forecasting
Understand your investment problem and choose an initial time series model
regression model
Use one variable to predict another variable
time-series model
Predict the same variable using previous data on the same variable
If you use a time series model, first draw a graph to see if the covariance is stationary.
A covariance-stationary series should not contain:
a linear trend
an exponential trend
seasonality
a structural shift within the sample period (a significant shift in the mean or variance)
step
Draw a graph to check whether a linear trend or an exponential trend makes the most sense
Estimate trend parameters
Calculate the residuals
Use the Durbin–Watson statistic to detect serial correlation
if no serial correlation exists
the model can be used
if serial correlation exists
use an autoregressive (AR) model
autoregressive model
Treatment of violations of covariance stationarity
a linear trend,
first-difference the time series.
exponential trend
take the natural log of the time series and then first-difference it
shifts significantly during the sample period
estimate different time-series models before and after the shift
significant seasonality
include seasonal lags
Construction of autoregressive model
Estimate an AR(1) model
Test whether the residuals from this model have significant serial correlation. If not, AR(1) can be used.
If serial correlation remains, estimate an AR(2) model and repeat the previous steps until the residuals show no serial correlation.
Check for seasonal issues
Method 1: Draw and observe
Method 2: Examine the data to see whether the seasonal autocorrelations of the residuals from an AR model are significant (for example, the fourth autocorrelation for quarterly data)
To correct for seasonality, add seasonal lags to your AR model. For example, if you are using quarterly data, you might add the fourth lag of a time series as an additional variable in an AR(1) or an AR(2) model .
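Building the regressors for an AR(1) model with a seasonal fourth lag can be sketched like this (illustrative; the quarterly data values are hypothetical):

```python
# Build (target, lag-1, seasonal-lag) rows for quarterly data:
# x_t regressed on x_{t-1} and x_{t-4}.

def ar_with_seasonal_lag(series, seasonal_lag=4):
    rows = []
    for t in range(seasonal_lag, len(series)):
        rows.append((series[t], series[t - 1], series[t - seasonal_lag]))
    return rows

data = [10, 12, 11, 13, 10.5, 12.4, 11.2, 13.6]  # hypothetical quarterly series
for y, lag1, lag4 in ar_with_seasonal_lag(data):
    print(y, lag1, lag4)
```

Each row pairs the current observation with its immediately preceding value and its value four quarters earlier, which is exactly the design matrix the seasonal AR regression needs.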
Detecting conditional heteroskedasticity
ARCH(1)
Regress the squared residuals from your time-series model on a lagged value of the squared residuals.
Test whether the coefficient on the squared lagged residual differs significantly from 0
If the coefficient on the squared lagged residual does not differ significantly from 0, the residuals do not display ARCH and you can rely on the standard errors from your time-series estimates.
If it does differ significantly from 0, use generalized least squares or other methods to correct for ARCH.
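The ARCH(1) check boils down to a slope estimate in the auxiliary regression of squared residuals on their own first lag (a hedged sketch; the squared-residual series is hypothetical, and a real test would also need the slope's standard error):

```python
# Slope of the regression of e_t^2 on e_{t-1}^2; a slope significantly
# different from 0 suggests ARCH(1) errors.

def arch1_slope(sq_resid):
    y = sq_resid[1:]       # e_t^2
    x = sq_resid[:-1]      # e_{t-1}^2
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

sq_resid = [0.4, 0.5, 0.45, 0.6, 0.55, 0.7]   # hypothetical squared residuals
print(arch1_slope(sq_resid))
```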
out-of-sample forecasting performance
FinTech (study session 3)
Machine Learning
Big Data Projects
Probabilistic Approaches
scenario analysis
Decision Trees
Simulation