AP Statistics Cheat Sheet
This page is a high-yield AP Statistics reference for last-minute review. It pulls together vocabulary, conditions, and formulas across Units 1–9 in one scan-friendly layout. On exam day you also have the official formula sheet and tables—use this page to remember when to apply each idea and how to phrase conclusions in context.
VERY Important Terms
- Parameter (population): \(\mu\), \(p\), \(\sigma\), \(\beta\): fixed, usually unknown. Parameters are always GREEK letters!
- Statistic (sample): \(\bar{x}\), \(\hat{p}\), \(s\), \(b\): computed from data. Parameters are always ROMAN (basically English) letters!
Unit 1: Exploring Univariate Data
Graphs and shape
- Histogram / stemplot: shape (skew left/right, symmetric), gaps, unusual values.
- Skew: tail direction = skew direction (long tail right → skewed right).
- Boxplot: median, IQR, outliers (often \(Q_1 - 1.5\,\text{IQR}\) and \(Q_3 + 1.5\,\text{IQR}\) rule).
Center and spread
- Center: Mean \(\bar{x}\): pulled by outliers (average of all data points); median resistant (the “middle” data point).
- Spread: Standard deviation \(s\): typical distance from mean (same units as data); IQR \(= Q_3 - Q_1\): spread of middle half; resistant to outliers
- When comparing two distributions, alwasy use CUSS: Center, Unusual features, Shape, and Spread in context.
Transforming Distributions
- Mean: Suppose you have \(y=f(x)\). Then \(\mu_y = \mu_f(x)\). Do any operation to combine two variables, the mean will be the same operation done on the two respective means.
- Standard Deviation: Suppose you have \(y=f(x) + a\), where \(f(x) does not contain any addition of constants\). Then, \(\sigma_y=\sigma_f(x)\). Also, if you add/subtract two variables, the new standard deviation becomes: \(\sigma_a,b = \sqrt{\sigma_a^2 + \sigma_b^2}\)
Unit 2: Exploring Bivariate Data
Scatterplots
- Describe form (linear/curved), direction (positive/negative), strength (weak/moderate/strong), and unusual points (outlier). Always describe in context!
Correlation \(r\)
- Measures linear association for quantitative variables; \(-1 \le r \le 1\).
- Not resistant to outliers; NEVER cite causation from correlation alone.
Least-squares regression line
\[\hat{y} = a + bx\]- Slope \(b\): predicted change in \(\hat{y}\) when \(x\) increases by 1 (in \(y\) units per \(x\) unit).
- Intercept \(a\): predicted \(\hat{y}\) when \(x=0\) (only meaningful if \(x=0\) is plausible).
- Residual \(e = y - \hat{y}\); residual plot checks linearity and equal spread. If residuals are reasonable and distributed randomly, then a linear trend is a good fit.
Influential vs outlier
- Outlier in \(y\): large residual.
- Influential point: pulling the regression line; often extreme in \(x\).
Unit 3: Collecting Data
Scope of inference
- Random sample → generalize to population that was sampled from
- Random assignment → supports cause (for differences in treatment)
Study types
- Observational: treatments not imposed; confounding possible, and thus causation CANNOT be established.
- Experiment: researcher assigns treatments; can show causation if well designed.
- Tip: Always say “This study is an [experiment/observation] because the researchers [did/did not] impose treatments.
Sampling designs
- Simple Random Sample: Take a population, assign each a number, and use a random number generator to select your desired amount of samples.
- Stratified Random Sample: Take a population, split into groups based on a characteristic, and employ a simple random sampling method to randomly choose the desired amount of samples from EACH group.
- Clustered Random Sample: Take a population, split into groups based on a characteristic, and employ a simple random sampling method to randomly choose the desired amount of groups and sample ALL members of the selected groups.
- Bias sources: wording, nonresponse, voluntary response, convenience samples. Usually, random sampling reduces bias one way or another
Experiments
- Blocking: separate similar subjects first, then randomize within blocks.
- Control: The control group is the group that receives no treatment or a placebo (basically something that looks like the treatment but does nothing)
- Replication: Replication occurs if you can rerun the experiment again. It does NOT include cases where you run a treatment many times in one experiment.
- Random assignment: To get the best results, always try to use some sort of random assignment to assign patients to treatments
- Blinding: Blinding is when a party (either the patient or the deliverer of the treatment) does not know which treatment a patient is assigned to. Single blinding is when the patient does not know which treatment is given (usually done through placebos), and double blinding is when both parties don’t know, usually done where the treatments are assigned and then given to the unknowing deliverer.
Unit 4: Probability and Random Variables
Probability rules
- \[0 \le P(A) \le 1\]
- \[P(A^c) = 1 - P(A)\]
- \[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]
- Independent: \(P(A \mid B) = P(A)\) (if \(P(B)>0\)); useful form \(P(A \cap B) = P(A)P(B)\).
- Conditional: \(P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}\)
Discrete random variables
- Mean (expected value): \(\mu_X = \sum x\, P(X=x)\)
- Variance: \(\sigma_X^2 = \sum (x-\mu_X)^2 P(X=x)\) ; SD \(\sigma_X\).
Combining random variables
- \[\mu_{X \pm Y} = \mu_X \pm \mu_Y\]
- If independent: \(\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y\) (note plus for both sum and difference).
Binomial
- Fixed \(n\) independent trials, same \(p\), success/failure outcomes.
- \(\mu_X = np\), \(\sigma_X = \sqrt{np(1-p)}\)
Calculator Trick: Use binompdf(n, p, x) for \(P(X=x)\) and binomcdf(n, p, x) for \(P(X≤x)\)
Geometric \(G(n,p)\)
- Number of trials \(k\) to get one success.
- \(\mu_X = \frac{1}{p}\), \(\sigma_X = \frac{\sqrt{1-p}}{p}\)
Calculator Trick: Use geometpdf(n, p, x) for \(P(X=x)\) and geometcdf(n, p, x) for \(P(X≤x)\)
Normal distribution
- Standardize: \(z = \dfrac{x-\mu}{\sigma}\)
- Use normalcdf / invNorm with proper \(\mu\), \(\sigma\) and boundaries.
Unit 5: Sampling Distributions
Core ideas
- A statistic varies from sample to sample; its distribution is the sampling distribution.
- Central Limit Theorem: for large \(n\), sampling distribution of \(\bar{x}\) is approximately Normal even if the population isn’t (Independence / Random + large \(n\)).
Means
- \[\mu_{\bar{x}} = \mu\]
- \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\) (if \(\sigma\) known); use \(\dfrac{s}{\sqrt{n}}\) as standard error when estimating.
Proportions
- \[\mu_{\hat{p}} = p\]
- \[\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}\]
Conditions
- \(np \ge 10\) and \(n(1-p) \ge 10\) for Normal approximation to \(\hat{p}\).
- \(n \ge 30\) often cited for \(\bar{x}\), but prefer context + shape of population when \(n\) is small.
Inference: Shared Structure
Confidence interval
\[\text{statistic} \pm (\text{critical value}) \times (\text{standard error})\]- Interpret: “We are C% confident that the interval (interval) captures the true population parameter [in context].”
- Not: “C% chance the parameter is in this interval” for one computed interval.
Hypothesis test
- Parameter in words + symbols (define \(p\); \(\mu\); \(p_1-p_2\); \(\beta\)).
- Hypotheses: \(H_a\) matches the question; equality always in \(H_0\).
- Conditions / PLANTS-style checks (see each procedure).
- Test statistic + \(P\)-value (probability of a statistic at least as extreme as observed if \(H_0\) true).
- Conclusion linked to \(P\)-value + context.
Errors
- Type I: reject true \(H_0\) (false alarm); controlled by \(\alpha\).
- Type II: fail to reject false \(H_0\) (miss); related to power (which is \(1-\beta\)).
Unit 6: Inference for Proportions
One-sample \(z\) interval and test for \(p\)
- Standard error (estimate): \(\text{SE}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\)
- \(z\) statistic: \(z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}\) under \(H_0\).
Two-sample difference \(\hat{p}_1 - \hat{p}_2\)
- CI: \((\hat{p}_1-\hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)
- Hypothesis test \(p_1-p_2\): pooled proportion \(\hat{p}_c = \dfrac{x_1+x_2}{n_1+n_2}\) for standard error under \(H_0: p_1=p_2\):
Conditions (short checklist)
- Random assignment or random sampling (as appropriate).
- 10%: \(n \le 0.10\,N\) when sampling without replacement (each group if two-sample).
- Large counts: successes & failures \(\ge 10\) each (for each group or for one-sample use \(np_0\) and \(n(1-p_0)\) when checking Normal approximation).
Unit 7: Inference for Means
One-sample \(t\) for \(\mu\)
\[t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \quad \text{df} = n-1\]Paired data
- Difference \(d_i = \text{before}_i - \text{after}_i\) (be consistent); analyze one sample of differences.
Two-sample \(t\) for \(\mu_1 - \mu_2\)
- Typically unpooled: \(\text{SE} = \sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}\) ; df use calculator/wizard or conservative approximate methods per course.
Conditions
- Random + 10% if sampling without replacement.
- Normality / \(n\): if \(n\) small, need roughly symmetric data without strong skew/outliers; CLT helps when \(n\) is large.
Unit 8: Chi-Square Tests
Expected counts (tables)
\[\text{expected} = \frac{(\text{row total})(\text{column total})}{\text{table total}}\]Test statistic
\[\chi^2 = \sum \frac{(\text{observed}-\text{expected})^2}{\text{expected}}\]Types
- Goodness-of-fit: one categorical variable; \(\text{df} = (\#\text{cells}) - 1\) (minus estimated params if any).
- Homogeneity: same distribution across populations / treatment groups.
- Independence / association: one sample classified two ways.
Conditions
- Random counts / random assignment as appropriate.
- Large expected counts: commonly all expected \(\ge 5\) (some contexts use a stricter rule).
Unit 9: Inference for Slopes (Linear Regression)
Model (thinking)
\[y = \beta_0 + \beta_1 x + \varepsilon\]- Slope test: \(H_0: \beta_1 = 0\) vs \(H_a: \beta_1 \ne 0\) (or one-sided).
\(t\) statistic for slope
\[t = \frac{b - \beta_1}{\text{SE}_b}, \quad \text{df} = n-2\]Conditions (LINE / PLUS-style memory hook)
- Linear relationship (check scatterplot + residual plot).
- Independent observations (random / 10%).
- Normal residuals roughly (especially for small \(n\)).
- Equal spread of residuals across \(x\).
Fast “Which Procedure?” Map
| Question type | Usually |
|---|---|
| One proportion | One-sample \(z\) |
| Compare two proportions | Two-sample \(z\) |
| One mean | One-sample \(t\) |
| Matched pairs | Paired \(t\) on differences |
| Two independent means | Two-sample \(t\) |
| Distribution fits a model | \(\chi^2\) goodness-of-fit |
| Same distribution across groups | \(\chi^2\) homogeneity |
| Association of two categorical vars | \(\chi^2\) independence |
| Linear relationship between \(x\) and quantitative \(y\) | \(t\)-test for slope \(\beta_1\) |
Most Common AP Statistics Mistakes
- Parameter vs statistic: mixing \(\mu\) and \(\bar{x}\), or \(p\) and \(\hat{p}\), in conclusions.
- Correlation ≠ causation; confounding in observational studies.
- Wrong SE: pooled vs unpooled proportions; forgetting \(\sqrt{n}\) in denominator.
- Hypotheses about \(\hat{p}\) or \(\bar{x}\) instead of \(p\) or \(\mu\).
- “Accept \(H_0\)” language — say fail to reject \(H_0\).
- \(P\)-value definition: mis-stating as \(P(H_0 \text{ true})\).
- Conditions skipped or checked as “met” without linking to the study’s randomness and sample size.
- Chi-square: wrong \(\text{df}\) or wrong “expected count” formula.
Exam-Day FRQ Mini-Checklist PLEASE CHANGE
- Name the procedure (e.g., two-sample \(t\) interval for \(\mu_1-\mu_2\)).
- Define parameters in plain language with context.
- Show conditions with numbers where needed (not just “\(n\) large”).
- Give formula with numbers plugged when asked to calculate.
- Conclusion: evidence sentence referencing \(P\)-value or interval in context, and whether it generalizes / suggests causality based on study design.