MindMap Gallery medical statistics
This is a mind map about medical statistics. In the medical field, it is a set of concepts, principles and methods for collecting data, analyzing data and drawing conclusions from the data.
Edited at 2023-12-23 18:28:41
medical statistics
introduction
What is medical statistics
In medicine, a set of concepts, principles, and methods for collecting data, analyzing data, and drawing conclusions from the data.
Basic content of medical statistics
Basic steps of statistical work
1. Design
2. Collect data
3. Organize data
4. Analyze data
Basic concepts in medical statistics
Homogeneity and variation
Homogeneity
The observation units share the same or a similar nature; in practice, the factors that influence the main research indicator should be the same or nearly the same.
Variation
The differences in the same measurement among observation units or individuals within a homogeneous population.
Variables and data types
Variable
Short for random variable; it represents a characteristic, quantity, or degree of the observed object. The observed values of a variable are called data, also called variable values.
Types of data
Quantitative data (measurement data)
Qualitative data (count data)
Ordinal data (semi-quantitative or ranked data)
Points to note when telling them apart
Quantitative
There is a unit of measurement
For example: height, weight, blood pressure, temperature, etc.; number of family members, pulse, white blood cell count, etc.
Qualitative
No unit of measurement
For example: gender (male/female), blood type (A/B/AB/O), etc.
Ordinal
The categories differ in degree or order
For example: laboratory results (−/+/++/+++), treatment effect (markedly effective/effective/improved/ineffective), etc.
Population and sample
Population
The entire set of research objects, usually consisting of all homogeneous observation units or individuals.
Sample
A representative part of the observation units or individuals selected from the population, usually obtained by random selection.
Parameter
A statistical indicator that describes a characteristic of the population.
Statistic
A characteristic indicator calculated from a sample.
Probability and probability distribution
Probability
A quantitative measure that describes the likelihood of a random event occurring.
Random events
Also called uncertain events: events that may or may not occur. Contrast with a certain (inevitable) event.
Small probability event
By convention, an event with P ≤ 0.05 is called a small probability event, meaning that it is very unlikely to occur in a single random sampling.
In practice, such an event is treated as not occurring in a single trial.
Statistical description
Quantitative data
Frequency table
Steps for creating a frequency table
1. Determine the number of classes
2. Determine the class width
3. Determine the class limits
4. Count the frequency in each class
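The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `frequency_table`, the choice of equal-width left-closed classes, and the height values are all assumptions made for the example.

```python
def frequency_table(values, n_classes):
    """Build a frequency table with equal-width classes [lower, upper).

    Assumes the values are not all identical (otherwise the width is 0).
    """
    lo, hi = min(values), max(values)
    # Step 2: class width = range / number of classes (left unrounded here).
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for v in values:
        # Step 4: find the class of each value; the maximum goes in the last class.
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    # Step 3: class limits are lo + i * width.
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


heights = [150, 152, 155, 158, 160, 161, 163, 165, 168, 170]  # made-up data (cm)
table = frequency_table(heights, n_classes=4)  # Step 1: number of classes chosen by hand
for lower, upper, freq in table:
    print(f"{lower:.1f} to {upper:.1f}: {freq}")
```

In practice the class width is usually rounded to a convenient number, which this sketch does not do.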
Uses of frequency distribution tables and histograms
1. As a way of presenting data, they can stand in for the raw data and facilitate further analysis.
2. They make it easy to observe the distribution type of the data.
3. They make it easy to spot outliers, i.e. extremely large or small values that lie far from the rest of the data.
4. When the sample size is relatively large, the relative frequency of each class can be used as an estimate of the probability.
Version from the lecture slides
① Reveal the frequency distribution type (e.g. whether it is a normal distribution)
Symmetric and skewed distributions
② Reveal frequency distribution characteristics (average level, degree of variation)
Statistical indicator that describes central tendency
average
It is a statistical indicator that describes the central tendency or average level of a set of observations. Including arithmetic mean, geometric mean and median, etc.
Classification
Arithmetic mean (X̄)
Suitable for quantitative variables that are normally or approximately normally distributed
Population mean μ, sample mean X̄
Geometric mean (G)
Suitable for data whose values are related by multiples (ratio-scaled series)
Calculation formula G=lg⁻¹(∑lgX/n)
Such as antibody titer, serum agglutination titer, bacterial count, concentration of certain substances, etc.
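As a small worked example of the formula G = lg⁻¹(∑lgX/n), the sketch below computes the geometric mean of a set of titers. The function name and the titer values are illustrative assumptions, not data from the source.

```python
import math


def geometric_mean(values):
    """G = 10 ** (sum(log10(X)) / n); every value must be positive."""
    logs = [math.log10(x) for x in values]
    return 10 ** (sum(logs) / len(logs))


titers = [10, 20, 40, 80, 160]  # made-up reciprocal antibody titers
g = geometric_mean(titers)
arithmetic = sum(titers) / len(titers)
print(f"geometric mean = {g:.1f}, arithmetic mean = {arithmetic:.1f}")
```

For data that double from one value to the next, the geometric mean sits at the middle titer, while the arithmetic mean is pulled toward the largest values.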
Median and percentile
Median (M)
percentile
Quartile (Q)
P₂₅, P₇₅
percentile
Pₓ
P₅₀ = M; when the data are normally distributed, the mean and the median are approximately equal (μ ≈ M)
The median is suitable when: 1. there are extremely large or extremely small values at either end; 2. the values at the ends of the distribution are not known exactly; 3. the population distribution type is unknown
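The following sketch illustrates why the median is preferred when there are extreme values. It uses Python's standard `statistics` module; the data are made up, and quartile values depend on the interpolation convention, so other software may report slightly different P₂₅ and P₇₅.

```python
from statistics import mean, median, quantiles

stay_days = [3, 4, 4, 5, 6, 7, 60]  # made-up lengths of stay with one extreme value

m = median(stay_days)                     # P50
q1, q2, q3 = quantiles(stay_days, n=4)    # P25, P50, P75 (default "exclusive" method)
avg = mean(stay_days)

print(f"median = {m}, P25 = {q1}, P75 = {q3}, mean = {avg:.2f}")
```

The single extreme value shifts the mean well above most observations, while the median and quartiles stay close to the bulk of the data.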
Statistical indicators that describe the degree of variation
degree of variation
The degree of difference or change (or variation) between a set of observed values
Classification
Range (R)
Suitable for skewed distributions or when the distribution type is unknown
Interquartile range (QR)
Variance (Var)
Suitable for normal distribution
Population variance σ², sample variance s²
Sum of squared deviations from the mean (SS)
Describes the dispersion of the observations around the mean X̄
SS = ∑(X − X̄)²
Degrees of freedom
ν = n − 1
Because the deviations from the sample mean X̄ must sum to zero, only n − 1 of the n deviations are independent.
standard deviation
Population standard deviation σ, sample standard deviation s
Coefficient of variation (CV)
Used to compare the degree of variation of two samples directly, without being affected by the average level or the unit of measurement
It is a statistical indicator that describes the relative degree of dispersion.
CV = s/X̄ × 100%
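The indicators of variation listed above can be computed step by step as follows. This is a minimal sketch with made-up data; the function name `describe_variation` and the dictionary keys are assumptions for illustration.

```python
import math


def describe_variation(values):
    """Return range, SS, sample variance, sample SD and CV (%) of the values."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)  # sum of squared deviations from the mean
    variance = ss / (n - 1)                    # sample variance s^2, degrees of freedom n - 1
    sd = math.sqrt(variance)                   # sample standard deviation s
    cv = sd / mean * 100                       # coefficient of variation in percent
    return {
        "range": max(values) - min(values),
        "SS": ss,
        "variance": variance,
        "sd": sd,
        "cv_percent": cv,
    }


x = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
result = describe_variation(x)
print(result)
```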
Qualitative data
Relative numbers
Rate
The ratio of the number of times a phenomenon actually occurs to the total number of times it could occur, within a given space or time range.
It indicates the intensity or frequency of the phenomenon within a given period and is an intensity indicator.
Proportion (constituent ratio)
Indicates the share of each component in the whole, often expressed as a percentage.
Describes the make-up of the whole; it is a composition indicator.
Relative ratio
The ratio of two related indicator values A and B, used to describe how one compares with the other.
The two values can be absolute numbers, relative numbers, or averages, and can be of the same or of different nature.
Commonly used relative indicators
Mortality rate
Total number of deaths in a place in a given year / average annual population of the same place in the same year × 1000‰
Case fatality rate
Number of deaths from a disease during a period / number of patients with that disease during the same period × 100%
Incidence
Number of new cases of a disease in a period / average population in the same period × K (base multiplier such as 100% or 1000‰)
Prevalence
Number of existing cases of a disease in a place during a period / average population of that place in the same period × K
Things to note when using relative indicators
1. Don’t confuse composition ratio with rate
2. When using relative numbers, the denominator should not be too small.
3. Calculate the total rate correctly
Add up the numerators and the denominators separately and then divide; the individual rates can be averaged directly only when their denominators are equal
4. Pay attention to the comparability of data
Use the standardization method to convert different compositions into standard compositions and then compare them.
5. Sampling error exists in sample rate or composition ratio
Perform hypothesis testing and statistical inference
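Point 3 above (calculating the total rate correctly) is easy to get wrong, so here is a brief sketch. The ward counts are invented for illustration, and `pooled_rate` is a hypothetical helper name.

```python
def pooled_rate(events, totals):
    """Total rate = sum of numerators / sum of denominators."""
    return sum(events) / sum(totals)


events = [10, 30]     # made-up number of cases in two wards
totals = [100, 1000]  # made-up number of people observed in each ward

correct = pooled_rate(events, totals)
# Averaging the two rates ignores the very different denominators.
naive = sum(e / t for e, t in zip(events, totals)) / len(events)

print(f"pooled rate = {correct:.4f}, naive average of rates = {naive:.4f}")
```

The naive average gives the small ward the same weight as the large one, which overstates the total rate in this example.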
hypothesis testing method
t-test
One-sample t-test (Single-sample mean t-test)
Applicable conditions: 1. The indicator is quantitative and normally distributed; 2. Small sample
Used to test whether the population mean μ represented by the sample mean X̄ differs from a known population mean μ₀
Paired-samples t-test (paired t-test)
Applicable conditions: 1. The indicator is a quantitative variable; 2. The paired differences d are normally distributed; 3. Small sample
In essence, it is a one-sample t-test comparing the mean of the differences d̄ with the known population mean μ_d = 0
Two independent samples t-test (grouped t-test)
Applicable conditions: 1. The indicator is a quantitative variable; 2. There are two groups of samples, and the two groups are independent; 3. The two populations from which the samples come are each normally distributed; 4. The variances of the two populations are equal (homogeneity of variance); 5. Small sample
The two sample sizes n₁ and n₂ may differ, but should be as close to equal as possible.
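The three t statistics described above can be sketched with the standard library alone. The function names and the sample values are assumptions for illustration; the sketch returns only t and the degrees of freedom ν, and the P value would then be looked up in a t table or computed with a statistics package.

```python
import math
from statistics import mean, stdev


def one_sample_t(x, mu0):
    """t = (mean - mu0) / (s / sqrt(n)), with nu = n - 1."""
    n = len(x)
    return (mean(x) - mu0) / (stdev(x) / math.sqrt(n)), n - 1


def paired_t(before, after):
    """One-sample t-test on the paired differences d against mu_d = 0."""
    d = [a - b for a, b in zip(after, before)]
    return one_sample_t(d, 0.0)


def two_sample_t(x1, x2):
    """Pooled-variance t for two independent samples, with nu = n1 + n2 - 2."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    ss1 = sum((v - m1) ** 2 for v in x1)
    ss2 = sum((v - m2) ** 2 for v in x2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)  # assumes equal population variances
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2


# Made-up data for illustration only.
t1, df1 = one_sample_t([1, 2, 3, 4, 5], mu0=2)
t2, df2 = paired_t(before=[1, 2, 3, 4], after=[2, 4, 5, 7])
t3, df3 = two_sample_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t1, df1, t2, df2, t3, df3)
```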
Analysis of variance (F test)
The basic idea is to partition the total variation of all observed values into components attributable to the influencing factors, and on that basis to calculate the test statistic F and infer whether the population means differ.
In the F test for homogeneity of variance of two samples: if F ≥ F(α/2), then P ≤ α; reject H₀ and accept H₁, and conclude that the variances of the two populations are not equal. Otherwise, the variances of the two populations are considered homogeneous.
Completely randomized design (one-way ANOVA)
Basic steps
1. State the hypotheses: H₀: μᴀ = μʙ = μᴄ; H₁: the population means are not all equal
2. Calculate and list the analysis of variance table
3. Determine the P value and draw a conclusion
Difficulty: partitioning the variation and the degrees of freedom
Separate the between-group and within-group components
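The partition of variation for a completely randomized design can be sketched as below. The function name, the dictionary keys, and the three small groups are assumptions for illustration; the sketch stops at the F value, which would then be compared with an F table for (ν between, ν within).

```python
def one_way_anova(groups):
    """Partition total variation into between-group and within-group parts."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total

    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = ss_total - ss_between  # SS total = SS between + SS within

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # the error mean square used in later pairwise tests
    return {
        "SS_total": ss_total,
        "SS_between": ss_between,
        "SS_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
    }


# Made-up data: three treatment groups A, B, C.
anova = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(anova)
```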
Pairwise comparison (multiple comparison)
q test (SNK method)
The error mean square (MS error) from the completely randomized design ANOVA is required before pairwise comparisons can be made
Parametric test
The distribution type is known and population parameters are tested (more sensitive, but with stricter requirements)
Chi-square test (χ² test)
Used to test whether two or more population rates or constituent ratios differ. The data are categorical, that is, qualitative data
Fourfold table (2×2) χ² test
2×2 (2 groups of observation objects, 2 mutually exclusive outcomes)
Degrees of freedom ν = (number of rows R − 1) × (number of columns C − 1)
The χ² value reflects the degree of agreement between the actual frequencies and the theoretical frequencies.
Applicable conditions (n is the total sample size, T a theoretical frequency): 1. When n ≥ 40 and all T ≥ 5, use the basic formula of the χ² test or the special formula for fourfold tables; 2. When n ≥ 40 and any 1 ≤ T < 5, use the corrected formula of the fourfold table χ² test; 3. When n < 40 or any T < 1, use Fisher's exact probability method for fourfold tables.
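The selection rule above can be expressed as a short function. This is a sketch under stated assumptions: the cell layout a, b / c, d, the function name, and the counts are illustrative, the special fourfold formula and its continuity-corrected form are the commonly taught ones, and Fisher's exact test itself is not implemented here.

```python
def fourfold_chi_square(a, b, c, d):
    """Choose and apply the chi-square formula for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    # Smallest theoretical frequency T = row total * column total / n.
    t_min = min(r * col / n for r in rows for col in cols)
    denom = rows[0] * rows[1] * cols[0] * cols[1]

    if n < 40 or t_min < 1:
        return {"method": "Fisher's exact test", "chi2": None, "T_min": t_min}
    if t_min < 5:
        chi2 = (abs(a * d - b * c) - n / 2) ** 2 * n / denom  # corrected formula
        return {"method": "corrected", "chi2": chi2, "T_min": t_min}
    chi2 = (a * d - b * c) ** 2 * n / denom                   # special fourfold formula
    return {"method": "basic", "chi2": chi2, "T_min": t_min}


# Made-up tables for illustration.
basic = fourfold_chi_square(20, 30, 30, 20)
corrected = fourfold_chi_square(2, 18, 8, 22)
exact = fourfold_chi_square(1, 2, 3, 4)
print(basic, corrected, exact, sep="\n")
```

For a fourfold table ν = (2 − 1) × (2 − 1) = 1, so the resulting χ² is compared with the critical value for one degree of freedom.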
Paired χ² test
Suitable for data whose sample size is not very large
1. b + c ≥ 40: basic formula; 2. b + c < 40: corrected formula
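A minimal sketch of the paired χ² rule follows. It assumes the usual paired fourfold layout in which b and c are the two discordant cells, and it uses the commonly taught formulas χ² = (b − c)²/(b + c) and, with correction, χ² = (|b − c| − 1)²/(b + c); the function name and counts are illustrative.

```python
def paired_chi_square(b, c):
    """Paired chi-square from the two discordant cells b and c (nu = 1)."""
    if b + c >= 40:
        return (b - c) ** 2 / (b + c), "basic"
    return (abs(b - c) - 1) ** 2 / (b + c), "corrected"


chi2_large, method_large = paired_chi_square(30, 20)  # made-up counts, b + c = 50
chi2_small, method_small = paired_chi_square(10, 3)   # made-up counts, b + c = 13
print(chi2_large, method_large, chi2_small, method_small)
```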
Nonparametric rank sum test
Scope of application: 1. Unknown or non-normal distribution; 2. Ranked (ordinal) data; 3. Data with no exact values at one or both ends
Rank sum test (Wilcoxon)
Basic steps: 1. State the test hypotheses and set the test level; 2. Rank the data, sum the ranks of each group, and obtain the rank sum statistic; 3. Determine the P value and draw an inference
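Step 2 (ranking and summing the ranks) is the mechanical part of the rank sum test, sketched below for two independent groups. Tied values receive the average of the ranks they occupy. The function names and data are assumptions for illustration; the P value would then be read from a rank sum table or obtained from a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(group1, group2):
    """Pool both groups, rank them together, and sum the ranks per group."""
    ranks = average_ranks(list(group1) + list(group2))
    return sum(ranks[: len(group1)]), sum(ranks[len(group1):])


t1, t2 = rank_sums([1, 3, 5], [2, 3, 8, 9])  # made-up data with one tie
print(t1, t2)
```

A quick check is that the two rank sums add up to N(N + 1)/2, where N is the pooled sample size.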
statistical inference
Parameter Estimation
sampling error
The difference between a sample statistic and a population parameter caused by sampling
(1) Individual variation exists, so the sample means X̄ differ from one another; (2) Random sampling error, so X̄ differs from μ
Standard error of the mean (absolute sampling error)
The standard error is the standard deviation of the sample means; it reflects the variation among sample means
σ_X̄ = σ/√n, s_X̄ = s/√n
The smaller the standard error, the more reliable the estimate.
The sample mean X̄ also follows a normal distribution: the population mean of X̄ is still μ, and the standard deviation of the sample mean is σ/√n
Parameter Estimation
refers to estimating population parameters from sample statistics
Estimation method
point estimate
Uses a single sample value directly as the estimate of the population parameter
The influence of sampling error is not considered, so the accuracy of the estimate cannot be evaluated.
Interval estimate
Calculates, for a pre-specified probability, an interval that is expected to contain the unknown population parameter.
The pre-specified probability 1 − α is called the confidence level (usually 0.95 or 0.99); the calculated interval is called the confidence interval
Two elements of a confidence interval
1. Confidence level 1 − α
Reflects the accuracy (reliability) of the estimate
2. Precision
The width of the interval reflects the precision: the narrower the interval, the more precise the estimate.
Sampling error distribution rules (interval estimate of the population mean)
(1) Z distribution
Applicable conditions: 1. Large sample, n ≥ 50; 2. σ is known
Function: describes the sampling distribution of the sample mean for large samples
z = (X̄ − μ)/(σ/√n)
(2) t distribution (relative sampling error)
Applicable conditions: 1. Small sample, n<50 2. Unknown σ (in quantitative variables)
The greater the degree of freedom ν, the closer the t distribution curve is to the standard normal distribution curve.
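95% confidence interval of the population mean
(X̄ − 1.96σ_X̄, X̄ + 1.96σ_X̄); for a small sample with σ unknown, replace 1.96 with the critical t value for ν = n − 1 and σ_X̄ with s_X̄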
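The interval X̄ ± 1.96 × standard error can be sketched with the standard library's `NormalDist`. This sketch covers the large-sample (z) case only; the function name and the summary figures are assumptions for illustration, and for small samples the critical value would come from the t distribution instead.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Large-sample interval: mean +/- z * sd / sqrt(n)."""
    se = sd / math.sqrt(n)                              # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return sample_mean - z * se, sample_mean + z * se


# Made-up summary statistics: mean 120, SD 10.
ci95 = mean_confidence_interval(120.0, 10.0, n=100)
ci99 = mean_confidence_interval(120.0, 10.0, n=100, confidence=0.99)
ci95_large_n = mean_confidence_interval(120.0, 10.0, n=400)
print(ci95, ci99, ci95_large_n)
```

Comparing the three intervals shows the two effects on width: a higher confidence level widens the interval, and a larger sample size narrows it.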
Hypothesis testing
Also known as significance testing, it is another important part of statistical inference. Its purpose is to judge qualitatively whether population parameters differ or whether population distributions are the same.
The basic steps
(1) Establish hypotheses and determine test levels
Null hypothesis [H₀]
"Negative" result, stated as an equality
Alternative hypothesis [H₁]
"Positive" result, stated as an inequality
(2) Select test methods and calculate test statistics
Calculate the P value from the value of the test statistic
The smaller P is, the stronger the grounds for rejecting H₀
(3) Make statistical inferences based on the P value
If H₀ is not rejected, the observed difference X̄ ≠ μ₀ is attributed to sampling error
If H₀ is rejected, H₁ is accepted: the difference between X̄ and μ₀ reflects an essential difference
Notes
1. Hypotheses are statements about populations, not samples
2. H₀ is the focus of the test, but both H₀ and H₁ are required.
3. H₀ usually states one specific condition
4. Decide between a one-sided and a two-sided test
Test level
Also known as the significance level and denoted α, it is the pre-specified probability that defines the rejection region. In practice, α = 0.05 or α = 0.01 is generally used.
Three elements
①According to the information provided by the sample (i.e., the statistical descriptive indicators of the sample)
②Based on specific sampling error distribution rules
③With a certain probability (usually 95%)
Normal distribution and medical reference value range
normal distribution
Determined by two parameters
μ is the location parameter; it describes the mean level of the normal distribution
It determines where the normal curve lies on the X-axis
σ is the shape parameter; it describes the degree of variation of the normal distribution.
It determines the shape of the normal curve
Area rules
① The area under the curve is the probability
② The total area under the curve is 1, or 100%
③ For every normal curve, the area within the same number of standard deviations around μ is the same
standard normal distribution
μ=0,σ=1
Standardized Transformation of Random Variables
z = (X − μ)/σ
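The standardizing transformation and the area rules can be checked numerically with `statistics.NormalDist`. The helper names and the example values (μ = 120, σ = 10) are assumptions for illustration.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """z = (X - mu) / sigma."""
    return (x - mu) / sigma


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    std = NormalDist()  # mu = 0, sigma = 1
    return std.cdf(k) - std.cdf(-k)


z = standardize(130, mu=120, sigma=10)
area_1sd = area_within(1)
area_196 = area_within(1.96)

# The same area is obtained on any normal curve within mu +/- 1 sigma.
other = NormalDist(mu=120, sigma=10)
area_other = other.cdf(130) - other.cdf(110)
print(z, round(area_1sd, 4), round(area_196, 4), round(area_other, 4))
```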
Medical reference value range
Percentile limits are established by statistical methods from the individual observations of a selected reference population, giving the range within which individual values fluctuate. The 95% reference range is typically used.
Uses
1. As a reference for clinically judging normal versus abnormal
2. Can be used to evaluate children’s developmental level
Precautions
1. Determine a homogeneous reference population
2. Select a sufficient number of reference samples
3. Control detection errors
4. Choose one-sided or two-sided limits
Some indicators are abnormal only when too high, or only when too low
5. Choose an appropriate percentage range
6. Select the method for calculating the reference value range
Be proficient in formulas and calculation processes
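For an indicator that is approximately normally distributed, the reference range is commonly computed as X̄ ± 1.96s (two-sided) or with 1.645s on one side only. The sketch below shows that calculation; the function name, the `sides` argument, and the summary figures are assumptions for illustration, and the percentile method would be used instead for skewed data.

```python
from statistics import NormalDist


def reference_range(mean, sd, coverage=0.95, sides="two"):
    """Normal-distribution method; returns (lower, upper), None for an open side."""
    std = NormalDist()
    if sides == "two":
        z = std.inv_cdf(1 - (1 - coverage) / 2)  # about 1.96 for 95%
        return mean - z * sd, mean + z * sd
    z = std.inv_cdf(coverage)                    # about 1.645 for 95%
    if sides == "upper":                         # abnormal only when too high
        return None, mean + z * sd
    if sides == "lower":                         # abnormal only when too low
        return mean - z * sd, None
    raise ValueError("sides must be 'two', 'upper' or 'lower'")


# Made-up reference sample summary: mean 100, SD 10.
two_sided = reference_range(100.0, 10.0)
upper_only = reference_range(100.0, 10.0, sides="upper")
print(two_sided, upper_only)
```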
Supplement
Standardized rate
Systematic error
Random measurement error
Trade-off
Confidence level ↑: the confidence interval becomes wider
Sample size ↑: the confidence interval becomes narrower
Dynamic series
1. Concept: a series of statistical indicators (absolute numbers, relative numbers, or averages) arranged in time order so that changes in a phenomenon can be observed and compared.
2. Function: ① calculate three indicators to describe the data statistically; ② use the average development speed to predict future values (premise: the future speed equals the current speed)
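As an illustration of prediction with the average development speed, the sketch below assumes the usual definition: the n-th root of the ratio of the last value to the first value over n intervals. The function names and the yearly case counts are hypothetical, and the forecast relies on the stated premise that the future speed equals the current one.

```python
def chain_ratios(series):
    """Development speed of each period relative to the previous period."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


def average_development_speed(series):
    """n-th root of (last / first), where n is the number of intervals."""
    n_intervals = len(series) - 1
    return (series[-1] / series[0]) ** (1 / n_intervals)


cases = [100.0, 110.0, 121.0, 133.1]     # made-up yearly counts
speed = average_development_speed(cases)
average_growth = speed - 1               # average growth speed
forecast_next = cases[-1] * speed        # assumes the speed stays the same

print(chain_ratios(cases), speed, average_growth, forecast_next)
```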
Representation of data
Consistent with normal distribution
X̄ ± s
(mean ± standard deviation)
Does not conform to normal distribution
M (P₂₅, P₇₅)