MindMap Gallery CTT, reliability
Classical test theory (CTT) and measurement reliability are two core concepts that together form the basic framework of psychometrics. Below is a detailed explanation of both.
Edited at 2024-11-13 11:42:33
Classical test theory (CTT), measurement reliability
Classical Test Theory (CTT)
Psychological traits and their measurability hypothesis
psychological traits
Meaning: the distinctive and relatively stable behavioral characteristics that a person exhibits, such as intelligence, interests, attitudes, and personality.
Properties: relative stability, abstraction, implicitness, predictability
Measurability of psychological traits:
Psychological traits exist objectively
Thorndike: Whatever exists at all exists in some amount
McCall: Anything that exists in amount can be measured
Measurement errors and their sources
The meaning of psychometric error
Inaccuracy or inconsistency in measurement results caused by variable factors that are unrelated to the purpose of the measurement.
Types of measurement errors
Random error: error caused by accidental factors that are unrelated to the purpose of the measurement and are hard to control. Its direction and magnitude vary completely at random.
Systematic error: a constant, regular effect caused by factors that are unrelated to the purpose of the measurement. Its magnitude and direction stay the same across measurements.
Sources of Psychometric Error
1. Measurement tools
The psychometric scale is unstable (low reliability)
The scale does not actually measure what it is intended to measure (low validity)
2. Measurement object
The subject's true level is not properly demonstrated
3. Testing process
Physical environment: temperature, light, sound, etc. at the measurement site
Testing time
unexpected interference
Examiner factors
4. Rater
Experimenter effect, participant effect
Experimenter effect: also known as the Rosenthal effect or the expectancy effect. The experimenter may, intentionally or unintentionally, influence subjects during the experiment (for example through facial expressions, gestures, or tone of voice) so that their responses conform to the experimenter's expectations.
Participant effect: also known as the Hawthorne effect. It refers to the bias that arises when subjects are aware of being studied: they change their behavior because of the extra attention they receive, typically showing improved performance or increased effort.
Methods to reduce measurement errors
Measurement tools: improve the reliability and validity of the measurement tools
Measurement object: ensure that subjects perform at their normal level
Testing process: standardize the procedure
Raters: unify the scoring standards
True scores and related assumptions
The meaning of true score
The value that reflects the true level of a subject's psychological trait is called the true score of the trait; the score actually obtained from measurement is called the observed score.
Mathematical model of classical test theory and its assumptions
X = T + E (observed score = true score + random error score)
① ε(X) = T, or equivalently ε(E) = 0: the expected value of the observed score is the true score, and the expected value of the random error score is 0. Operational definition of the true score: the mean of the results obtained from a large number of repeated measurements.
② ρ(T, E) = 0: true scores and random errors are uncorrelated.
③ ρ(E1, E2) = 0: the random errors on parallel tests are uncorrelated with each other.
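As a rough illustration of the model above, the following Python sketch simulates X = T + E with normally distributed scores. The distributions, means, standard deviations, sample sizes, and function names are all assumptions made for the example, not part of the theory. It checks two consequences of the assumptions: the mean of many repeated measurements approaches the true score, and when errors are drawn independently of true scores the observed-score variance is close to the sum of true-score and error variance.

```python
import random
import statistics


def repeated_measurement_mean(true_score, error_sd, n_measurements, seed=0):
    """Average many simulated measurements of one subject (assumption 1)."""
    rng = random.Random(seed)
    observed = [true_score + rng.gauss(0.0, error_sd) for _ in range(n_measurements)]
    return statistics.fmean(observed)


def variance_decomposition(n_subjects=5000, true_sd=10.0, error_sd=5.0, seed=1):
    """Return (var_T, var_E, var_X) for simulated subjects with X = T + E."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n_subjects)]
    # Errors have mean 0 and are drawn independently of the true scores.
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_subjects)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return (
        statistics.pvariance(true_scores),
        statistics.pvariance(errors),
        statistics.pvariance(observed),
    )


if __name__ == "__main__":
    print(repeated_measurement_mean(100.0, 5.0, 10000))
    var_t, var_e, var_x = variance_decomposition()
    print(var_t + var_e, var_x)
```

The two printed variances agree only approximately, because a finite sample always shows a small chance covariance between T and E.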
Parallel tests
Two tests are parallel if they use different items to measure the same trait and are consistent in item format, number of items, difficulty, discrimination, and score distribution; in particular, their scores have the same mean and standard deviation. Strictly parallel tests are difficult to construct.
measurement reliability
Reliability overview
Definition of reliability: reliability is the degree of stability and consistency of measurement results, that is, the consistency of the results obtained when the same subjects are measured repeatedly at different times with the same test (or with an equivalent form of the test).
The role of reliability
One of the important indicators for evaluating test quality
Reflects the magnitude of the random error present in the measurement process
Used to interpret individual test scores through the standard error of measurement
Compare the differences in scores on different tests
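The standard error of measurement mentioned above is commonly computed as SEM = S_x · sqrt(1 − r_xx), where S_x is the standard deviation of observed scores and r_xx is the reliability coefficient. The sketch below applies that formula; the function names, the z value of 1.96, and the example numbers are illustrative assumptions.

```python
import math


def standard_error_of_measurement(sd_observed, reliability):
    """SEM = S_x * sqrt(1 - r_xx)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd_observed * math.sqrt(1.0 - reliability)


def score_band(observed, sd_observed, reliability, z=1.96):
    """Band of +/- z SEM around an observed score (z = 1.96 is an assumed choice)."""
    sem = standard_error_of_measurement(sd_observed, reliability)
    return observed - z * sem, observed + z * sem


if __name__ == "__main__":
    # Hypothetical test with S_x = 15 and r_xx = 0.91.
    print(standard_error_of_measurement(15.0, 0.91))
    print(score_band(110.0, 15.0, 0.91))
```

The lower the reliability, the wider the band, so an individual score from a low-reliability test should be interpreted more cautiously.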
How to estimate reliability
test-retest reliability
meaning
Test-retest reliability, also called retest reliability, refers to the degree of consistency between the results obtained when the same measurement tool is administered twice to the same group of subjects under the same conditions. It reflects how stable the tool's results are over time and is affected by the length of the time interval. The most appropriate interval depends on the purpose and nature of the test and on the characteristics of the subjects. Two to four weeks is generally suitable, and the interval should preferably not exceed six months.
Assessment method
Test-retest reliability is expressed by the test-retest coefficient, also called the stability coefficient, of the measurement tool. Specifically, it is the Pearson product-moment correlation coefficient between the scores of the same group of subjects on the two administrations.
Application conditions
1) The psychological trait measured should be relatively stable over time (e.g., traits measured by personality tests).
2) There should be no marked practice or forgetting effects on the measured trait, or the two effects should roughly cancel each other out (e.g., an intelligence test readministered after 6 months).
3) No special teaching or training should take place between the two administrations, so that the test-retest reliability reflects only the influence of random factors.
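The stability coefficient described above is a Pearson product-moment correlation between two administrations. A minimal sketch, using made-up scores for six hypothetical subjects (the data and the function name `pearson_r` are assumptions for illustration):

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2 subjects)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of the same six subjects on two administrations.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 10]
stability_coefficient = pearson_r(first_administration, second_administration)

if __name__ == "__main__":
    print(round(stability_coefficient, 3))
```

The same function applies to alternate-form reliability if the second list holds scores on the parallel form instead of a second administration.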
Alternate-form reliability
meaning
Alternate-form reliability refers to the degree of consistency between the results obtained when two alternate forms (parallel tests) are administered to the same group of subjects. It is estimated by the Pearson product-moment correlation coefficient between the scores the same subjects obtain on the two forms. Alternate-form reliability reflects the measurement error caused by differences in items and by the time interval.
Assessment method
Equivalence coefficient: the two alternate forms are administered in immediate succession;
Stability and equivalence coefficient (test-retest alternate-form reliability): the two alternate forms are administered on two occasions separated by a period of time.
order effect
The effect of the presentation order of independent variables on the dependent variable. That is, when the same subjects receive different experimental treatments, the possible impact of the first experimental treatment on the second experimental treatment. The impact may be huge or slight, short-lived or long-lasting.
Counterbalanced design
An experimental design technique that controls the order of experimental treatments so that order effects cancel out.
Application conditions
1) Two or more truly parallel tests must be constructed (e.g., Forms A and B). Alternate forms, or parallel tests, are two tests that use different items to measure the same content and whose scores have the same mean and standard deviation.
2) Subjects must be able to take both tests (in terms of time, cost, etc.).
3) The test report should describe in as much detail as possible the time interval between the two administrations, the order in which the forms were given, and what the subjects experienced during testing.
internal consistency reliability
meaning
Internal consistency reliability, also called homogeneity reliability, evaluates whether the items within a test measure the same psychological trait. It reflects the consistency of item content sampling.
Estimation method
split-half reliability
meaning
Split-half reliability refers to the consistency between the scores that all subjects obtain on the two halves of a test after it has been divided into two equivalent halves. It reflects whether the items in the two halves measure the same psychological trait.
Splitting method
Odd-even split
Assessment method
Spearman-Brown formula, Flanagan formula, Rulon formula
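The split-half procedure and the three formulas named above (Spearman-Brown, Flanagan, and Rulon) can be sketched as follows. The forms used here are the commonly cited ones: Spearman-Brown r_xx = 2·r_hh / (1 + r_hh), Flanagan r_xx = 2·(1 − (S_a² + S_b²) / S_x²), and Rulon r_xx = 1 − S_d² / S_x², where a and b are the half scores, d = a − b, and x = a + b. The response matrix and function names are assumptions for illustration.

```python
import math
import statistics


def odd_even_halves(item_scores):
    """Per-subject totals on odd-numbered and even-numbered items."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd, even


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step the half-test correlation up to full-test length."""
    return 2.0 * r_half / (1.0 + r_half)


def flanagan(half_a, half_b):
    total = [a + b for a, b in zip(half_a, half_b)]
    half_var = statistics.pvariance(half_a) + statistics.pvariance(half_b)
    return 2.0 * (1.0 - half_var / statistics.pvariance(total))


def rulon(half_a, half_b):
    diff = [a - b for a, b in zip(half_a, half_b)]
    total = [a + b for a, b in zip(half_a, half_b)]
    return 1.0 - statistics.pvariance(diff) / statistics.pvariance(total)


# Hypothetical 0/1 responses: 6 subjects (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    odd, even = odd_even_halves(responses)
    print(spearman_brown(pearson_r(odd, even)))
    print(flanagan(odd, even), rulon(odd, even))
```

The Flanagan and Rulon formulas are algebraically equivalent, so they print the same value. The Spearman-Brown estimate matches them only when the two halves have equal variances.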
Kuder-Richardson reliability
Cronbach's α coefficient
Hoyt reliability
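As a sketch of the item-level estimates listed above, the code below computes Cronbach's α as k/(k − 1) · (1 − Σ S_i² / S_x²) and the Kuder-Richardson formula 20 as k/(k − 1) · (1 − Σ p_i·q_i / S_x²), where k is the number of items, S_i² the variance of item i, S_x² the variance of total scores, and p_i the proportion of subjects passing item i. The response matrix is made up, and population variances are used throughout as an assumed convention.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a subjects-by-items score matrix."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [statistics.pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = statistics.pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores must vary")
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 0 or 1."""
    n = len(item_scores)
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n  # proportion passing item j
        sum_pq += p * (1.0 - p)
    total_var = statistics.pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores must vary")
    return k / (k - 1) * (1.0 - sum_pq / total_var)


# Hypothetical 0/1 responses: 6 subjects (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(cronbach_alpha(responses), kr20(responses))
```

For 0/1 items the variance of an item equals p·q, so the two functions return the same value on this data; α additionally handles items scored on more than two levels.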
inter-rater reliability
meaning
Interrater reliability refers to the degree of consistency with which multiple raters rate responses from the same group of people. Generally, the average consistency between pairs of trained raters is required to be above 0.90 before the rating is considered objective.
Assessment method
Two raters: calculate the correlation coefficient (Pearson product-moment correlation or Spearman rank correlation) between the scores the two raters give to the same set of answer sheets.
More than two raters: estimate agreement with Kendall's coefficient of concordance (W).
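For the multi-rater case, Kendall's coefficient of concordance without ties is W = 12·S / (m²·(n³ − n)), where m is the number of raters, n the number of rated answer sheets, and S the sum of squared deviations of the rank sums from their mean. The sketch below assumes each rater supplies untied ranks 1..n; the tie correction is deliberately left out, and the function name is hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for a raters-by-objects matrix of untied ranks 1..n."""
    m = len(rankings)
    n = len(rankings[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two raters and two rated objects")
    expected = list(range(1, n + 1))
    for row in rankings:
        if sorted(row) != expected:
            raise ValueError("each rater must supply untied ranks 1..n")
    # Sum of the ranks each object received across all raters.
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_rank_sum = m * (n + 1) / 2.0
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Three hypothetical raters ranking four answer sheets.
    print(kendalls_w([[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]))
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings).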
Factors affecting reliability and ways to improve it
Factors affecting measurement reliability
Subject characteristics
Individual subject
Test-taking motivation
Test anxiety
Test-taking experience
practice effect
response tendency
Physiological variables
Heterogeneity of the subject group, average ability level of the subject group
Examiner characteristics
Test administrator
Rater
Testing situation
measuring tools
Test length
Test difficulty
time interval
Ways to improve reliability
Increase the length of the test appropriately
Control the difficulty distribution of test questions
Try to improve the discrimination of each question
Select the appropriate subject group
Standardize the testing process and unify the testing environment
Ensure sufficient time for subjects to answer questions
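The effect of lengthening a test can be estimated with the general Spearman-Brown formula r_kk = k·r / (1 + (k − 1)·r), where r is the current reliability and k is the factor by which the test is lengthened with comparable items. The sketch below also inverts the formula to estimate how much longer a test must be to reach a target reliability. Function names and example values are assumptions for illustration.

```python
def predicted_reliability(reliability, length_factor):
    """Reliability expected after multiplying test length by length_factor."""
    if not 0.0 < reliability < 1.0 or length_factor <= 0:
        raise ValueError("need 0 < reliability < 1 and length_factor > 0")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


def required_length_factor(reliability, target):
    """Length factor needed to move from the current reliability to the target."""
    if not 0.0 < reliability < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1.0 - reliability) / (reliability * (1.0 - target))


if __name__ == "__main__":
    # Hypothetical test with reliability 0.60, doubled in length.
    print(predicted_reliability(0.60, 2))
    print(required_length_factor(0.60, 0.75))
```

The gain shrinks as the test gets longer, which is one reason the length should be increased only "appropriately": very long tests also invite fatigue, which the formula does not model.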