# Descriptive and inferential statistics
This topic explores the fundamental distinction and applications of descriptive and inferential statistics in data analysis.
## 1. Descriptive and inferential statistics
Descriptive statistics are used to summarize and describe data, while inferential statistics are used to make conclusions or predictions about a population based on a sample [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 1.1 Descriptive statistics
Descriptive statistics provide a concise summary of the main features of a dataset. These methods help in understanding the characteristics of the data without making broader generalizations [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Measures of Central Tendency:** These describe the center of the data.
* **Mean:** The average of all data points. It is sensitive to outliers [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Median:** The middle value in a sorted dataset. It is less affected by outliers than the mean [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Mode:** The most frequent value in the dataset.
* **Measures of Dispersion:** These describe the spread or variability of the data.
* **Range:** The difference between the maximum and minimum values.
* **Standard Deviation (SD):** A measure of the amount of variation or dispersion of a set of values.
* **Graphical Representations:** Various plots and charts are used to visualize data.
* Graphs, plots, and charts can visually represent data distributions and relationships [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
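A minimal sketch in R of these descriptive summaries, using a small made-up vector of measurements (the values are illustrative only and not from the notes):

```r
# Hypothetical sample of 10 measurements (illustrative values only)
x <- c(12, 15, 9, 22, 14, 13, 18, 11, 16, 40)

mean(x)      # arithmetic mean -- pulled upward by the outlier 40
median(x)    # middle value -- less affected by the outlier
range(x)     # minimum and maximum; diff(range(x)) gives the range as a single number
sd(x)        # standard deviation, a measure of spread

# R's built-in mode() reports storage mode, not the most frequent value; one common workaround:
names(sort(table(x), decreasing = TRUE))[1]

hist(x)      # simple graphical representation of the distribution
```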
### 1.2 Inferential statistics
Inferential statistics involve using data from a sample to draw conclusions or make predictions about a larger population. This allows researchers to generalize findings beyond the immediate data collected [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Key Techniques:**
* **T-tests:** Used to compare the means of two groups [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Chi-squared tests:** Used for categorical data.
* **Confidence Intervals (CI):** Provide a range of values within which the true population parameter is likely to lie [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Regression Analysis:** Used to model the relationship between variables and make predictions [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Hypothesis Testing:** A formal procedure to test a claim about a population parameter using sample data [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 1.2.1 Estimating population parameters
With a sample mean, standard error, a sample size greater than 30, and assuming normal distribution, one can estimate the population mean [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **68% certainty:** The population mean falls within 1 standard error (SE) of the sample mean [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **95% certainty:** The population mean falls within 2 SE of the sample mean [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **99.7% certainty:** The population mean falls within 3 SE of the sample mean [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
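A short R sketch of this estimate, using hypothetical values for the sample mean, SD, and sample size (none of these numbers come from the notes):

```r
# Hypothetical sample summary (n > 30, data assumed roughly normal)
sample_mean <- 50
sample_sd   <- 8
n           <- 36

se <- sample_sd / sqrt(n)          # standard error of the mean

sample_mean + c(-1, 1) * 1 * se    # ~68% interval for the population mean
sample_mean + c(-1, 1) * 2 * se    # ~95% interval
sample_mean + c(-1, 1) * 3 * se    # ~99.7% interval
```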
#### 1.2.2 Correlation
Correlation measures the strength and direction of the linear relationship between two factors [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Correlation Coefficient (r):** Ranges from -1 to +1.
* $r = +1$: Perfect positive linear association [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* $r = -1$: Perfect negative linear association [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Values closer to $\pm 1$ indicate a stronger relationship, while values closer to 0 indicate a weaker relationship [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **$R^2$:** The correlation coefficient squared, indicating the proportion of variation in the dependent variable explained by the independent variable [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Correlation does not imply causation** [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 1.2.3 Regression
Regression analysis uses the line of best fit to estimate variables within a linear relationship [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Linear Regression:** A statistical method modeling the relationship between a dependent and one or more independent variables by fitting a linear equation.
* **Goodness of Fit:** Assessed using metrics like $R^2$ and by examining residuals.
* **Residuals:** The difference between an actual data point and the value predicted by the model. Patterns in residual plots indicate that a linear equation may not be appropriate [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 1.2.4 Probability and Risk
Uncertainty is inherent in living systems and statistics [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Probability (P):** The proportion of times a specific outcome occurs in a large number of independent trials, scaled from 0 to 1 [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Risk:** The probability of an undesirable outcome occurring.
* **Mutually Exclusive Outcomes:** Outcomes that cannot occur at the same time; the probability of one or another occurring is found by adding their probabilities, and the probabilities of all possible outcomes sum to 1 [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Independent Events:** The outcome of one event does not affect the outcome of another; probabilities are multiplied [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
##### 1.2.4.1 Probability distributions
A probability distribution describes the probability of each possible outcome and is often shown graphically [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Binomial Distribution:** Applies to situations with two possible outcomes (success/failure) in a fixed number of independent trials with a constant probability of success [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* The probability of observing exactly $q$ successes is given by `dbinom(q, size, prob)` [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* The cumulative probability of $q$ or fewer successes is calculated using `pbinom(q, size, prob)` [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Discrete vs. Continuous Data:**
* **Discrete data:** Often follows a binomial distribution (e.g., counts).
* **Continuous data:** Typically follows a normal distribution (e.g., measurements).
### 1.3 Hypothesis testing
Hypothesis testing is a statistical method used to evaluate claims about a population parameter [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Null Hypothesis ($H_0$):** A statement of no effect or no difference, assumed to be true [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Alternative Hypothesis ($H_A$):** A statement that contradicts the null hypothesis, representing the trend or effect being investigated [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Significance Level ($\alpha$):** The probability of rejecting the null hypothesis when it is actually true (Type I error). A common $\alpha$ is 0.05 [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **P-value:** The probability of observing data as extreme as, or more extreme than, the observed data, assuming the null hypothesis is true [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* If $p < \alpha$, reject $H_0$ [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Critical Region:** The range of values for which the null hypothesis is rejected [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Type I Error (False Positive):** Rejecting $H_0$ when it is true [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Type II Error (False Negative):** Failing to reject $H_0$ when it is false [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Power:** The probability of correctly rejecting $H_0$ when it is false ($1 - \beta$) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 1.3.1 Types of error bars
The interpretation of error bars depends on whether they represent Standard Deviation (SD), Standard Error (SE), or 95% Confidence Intervals (CI) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
| Error Bar Type | If Bars Overlap | If Bars Do Not Overlap |
| :---------------------- | :-------------------------------------------- | :---------------------------------- |
| Standard Deviation (SD) | No conclusion about significance | No conclusion about significance |
| Standard Error (SE) | Difference is likely not significant (P > 0.05) | No definite conclusion |
| 95% Confidence Interval | No conclusion about significance | Difference is likely significant (P < 0.05) |
---
# Correlation and regression analysis
This section explores the relationship between two factors, including how to interpret the strength and direction of correlation coefficients and the application of regression analysis to model linear relationships [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 2.1 Correlation
Correlation quantifies the strength and direction of the linear relationship between two variables [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.1.1 Correlation coefficient (r)
* The correlation coefficient, denoted by '$r$', ranges from -1 to +1 [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A positive '$r$' indicates a positive linear association: as one variable increases, the other tends to increase [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A negative '$r$' indicates a negative linear association: as one variable increases, the other tends to decrease [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Values closer to +1 or -1 indicate a stronger relationship [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Values closer to 0 indicate a weaker relationship [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.1.2 Interpretation of correlation strength
| Correlation Coefficient Range | Descriptive Term |
| :-------------------------- | :--------------- |
| 0.0 - 0.2 | Very weak, negligible |
| 0.2 - 0.4 | Weak, low |
| 0.4 - 0.7 | Moderate |
| 0.7 - 0.9 | Strong, high, marked |
| 0.9 - 1.0 | Very strong, very high |
#### 2.1.3 Coefficient of Determination ($R^2$)
* $R^2$ is the square of the correlation coefficient [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* It represents the proportion of the variance in the dependent variable (y) that is explained by the independent variable (x) [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A larger $R^2$ value indicates that the linear model is a good fit for the data [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.1.4 Key Principle
**Correlation does not imply causation.** [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 2.2 Regression analysis
Regression analysis is a statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the data [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.2.1 Linear Regression Equation (Line of Best Fit)
* The primary goal of linear regression is to find the line that best represents the linear relationship between variables [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* This line can be used to estimate the value of the dependent variable for a given value of the independent variable [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.2.2 Assessing the Goodness of Fit
Before using a regression line, it is crucial to assess how well it fits the data [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **$R^2$ value**: As mentioned above, this indicates the proportion of variance explained by the model [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Residuals**: These are the differences between the actual data points and the values predicted by the regression model [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* If there is a discernible pattern in the residual plots, it suggests that a linear model may not be appropriate for the data [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Aspects to check in residual plots**:
1. **Even distribution**: Points should be evenly distributed vertically and horizontally around the zero line. If points in the top half are further from zero than those below, the model may not accurately predict values [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
2. **Outliers**: Identify any data points that are not well predicted by the model, as these can indicate that the model might not be trustworthy [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
3. **Clear shape**: A linear model is not appropriate if there is a clear, non-linear pattern (e.g., points at one end are close, and points at the other end spread further away) [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 2.2.3 Prediction and Confidence Intervals
* Regression models can be used to predict values and calculate 95% confidence intervals for those predictions [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A stronger correlation generally leads to more confident predictions and a more accurate estimate due to less variability around the line of best fit [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
> **Tip:** A higher correlation coefficient leads to greater confidence in predictions made using the regression line [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
> **Example:** In R, a linear model can be built using `lm(dependent_variable ~ independent_variable, data=your_data_frame)` and predictions can be made using `predict()` [1](#page=1) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
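Expanding on that example, here is a minimal, self-contained sketch using simulated data; the data frame and variable names (`your_data_frame`, `x`, `y`) are placeholders, not part of the notes:

```r
# Illustrative sketch with simulated data (not data from the notes)
set.seed(1)
your_data_frame <- data.frame(x = 1:30)
your_data_frame$y <- 2 + 0.5 * your_data_frame$x + rnorm(30, sd = 1)

model <- lm(y ~ x, data = your_data_frame)
summary(model)                      # includes the R-squared value (goodness of fit)

plot(fitted(model), resid(model))   # residual plot: look for even scatter, outliers, or a clear shape
abline(h = 0, lty = 2)

# Predict y for new x values, with a 95% confidence interval
predict(model, newdata = data.frame(x = c(10, 20)), interval = "confidence", level = 0.95)
```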
---
# Probability and statistical distributions
This topic explores the fundamental concepts of probability, the quantification of risk, and the characteristics and applications of various probability distributions.
### 3.1 Probability and risk
Probability is defined as the proportion of times a specific outcome occurs in a large number of independent trials, with a scale from 0 (impossible) to 1 (certain). Risk, on the other hand, is the probability of undesirable events happening [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.1.1 Rules of probability
* **Mutually Exclusive Outcomes:** If outcomes cannot happen at the same time, the probability of one or another occurring is found by adding their probabilities, and the probabilities of all possible outcomes sum to 1. For example, the probability of rolling a 1 or a 4 on a single die roll is $1/6 + 1/6 = 2/6$ [1](#page=1) [2](#page=2) [3](#page=3).
* **Independent Events:** If the outcome of one event does not affect the outcome of another, their probabilities are multiplied. This applies to calculating the probability of one outcome AND another occurring [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.1.2 Probability distributions
A probability distribution graphically represents the probability of different outcomes. It is a theoretical representation of the likelihood of each possible outcome, which can be compared to an observed frequency distribution [1](#page=1) [2](#page=2) [3](#page=3).
### 3.2 Binomial distribution
The binomial distribution is used for situations with a fixed number of independent trials, each having only two possible outcomes: "success" and "failure" [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.2.1 Conditions for binomial distribution
For a binomial distribution to apply, the following conditions must be met [1](#page=1) [2](#page=2) [3](#page=3):
* There are exactly two outcomes for each trial: success and failure.
* The number of trials is fixed.
* Each trial is independent of the others.
* The probability of success ($p$) is the same for every trial.
#### 3.2.2 Calculating binomial probabilities
* **Probability of a specific outcome (`dbinom`):** This function calculates the probability of observing exactly $k$ successes in $n$ trials, with a probability of success $p$.
* Example: The probability of getting exactly 2 heads from 4 coin tosses (where $p=0.5$) is calculated as `dbinom(2, 4, 0.5)` = 0.375 [1](#page=1) [2](#page=2) [3](#page=3).
* **Cumulative Probability (`pbinom`):** This calculates the probability of observing up to a certain number of successes. The function `pbinom(q, size, prob)` calculates the probability of `q` or fewer successes in `size` trials with a probability of success `prob` [1](#page=1) [2](#page=2) [3](#page=3).
* Example: The probability of getting up to and including 2 heads from 4 coin tosses is `pbinom(2, 4, 0.5)` = 0.6875 [1](#page=1) [2](#page=2) [3](#page=3).
* To find the probability of 3 or more heads, you would subtract the cumulative probability of 2 or fewer heads from 1: `1 - pbinom(2, 4, 0.5)`.
* **Finding the number of items within a range:** Once the probability ($p$) of an event occurring within a specific range is calculated, the number of occurrences can be estimated by multiplying the probability by the total number of trials: Number of items = $p \times \text{total number of trials}$ [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
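These calculations can be reproduced directly in R; the final lines assume a hypothetical 200 repeats of the four-toss experiment to illustrate the "number of items" estimate:

```r
dbinom(2, size = 4, prob = 0.5)      # P(exactly 2 heads in 4 tosses) = 0.375
pbinom(2, size = 4, prob = 0.5)      # P(2 or fewer heads) = 0.6875
1 - pbinom(2, size = 4, prob = 0.5)  # P(3 or more heads) = 0.3125

# Expected number of occurrences in a hypothetical 200 repeats of the 4-toss experiment
p <- 1 - pbinom(2, size = 4, prob = 0.5)
p * 200
```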
#### 3.2.3 Discrete vs. Continuous Data
* **Discrete Data:** Binomial distribution is suitable for discrete data, and functions like `dbinom` and `pbinom` are used [1](#page=1) [2](#page=2) [3](#page=3).
* **Continuous Data:** For continuous data that is normally distributed, functions like `pnorm` are used [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.2.4 The Birthday Paradox
This is a classic probability problem illustrating how quickly the probability of shared birthdays increases in a group. The calculation is often simplified by finding the probability that NO two people share a birthday and subtracting that from 1 [1](#page=1) [2](#page=2) [3](#page=3).
### 3.3 Normal distribution
The normal distribution, also known as the Gaussian distribution or bell curve, is a continuous probability distribution that is symmetric about its mean. Many natural phenomena approximate this distribution [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.3.1 Properties of the normal distribution
* **Empirical Rule (68-95-99.7 rule):** For a normal distribution:
* Approximately 68% of the data falls within one standard deviation (SD) of the mean [1](#page=1) [2](#page=2) [3](#page=3).
* Approximately 95% of the data falls within two standard deviations of the mean [1](#page=1) [2](#page=2) [3](#page=3).
* Approximately 99.7% of the data falls within three standard deviations of the mean [1](#page=1) [2](#page=2) [3](#page=3).
#### 3.3.2 Estimating population parameters
If you have a sample mean, standard error, and a sample size of more than 30, and the data is normally distributed, you can estimate the population mean with a certain degree of certainty [1](#page=1) [2](#page=2) [3](#page=3).
### 3.4 Using R for probability calculations
* `dbinom()`: Calculates the probability mass function for the binomial distribution.
* `pbinom()`: Calculates the cumulative distribution function for the binomial distribution.
* `pnorm()`: Calculates the cumulative distribution function for the normal distribution.
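A brief sketch of `pnorm()` in action; the mean of 50 and SD of 8 in the last line are made-up values used only for illustration:

```r
# Proportion of a normal distribution within 1, 2, and 3 SD of the mean
pnorm(1) - pnorm(-1)    # ~0.68
pnorm(2) - pnorm(-2)    # ~0.95
pnorm(3) - pnorm(-3)    # ~0.997

# Example with a hypothetical mean and SD: P(X <= 60) when X ~ N(mean = 50, sd = 8)
pnorm(60, mean = 50, sd = 8)
```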
### 3.5 Descriptive vs. Inferential Statistics
* **Descriptive Statistics:** Summarize and describe data using measures like mean, median, mode, range, standard deviation, and graphical representations [1](#page=1) [2](#page=2) [3](#page=3).
* **Inferential Statistics:** Make conclusions or predictions about a population based on a sample, using methods like t-tests, chi-squared tests, confidence intervals, regression analysis, and hypothesis testing [1](#page=1) [2](#page=2) [3](#page=3).
### 3.6 Key statistical concepts
* **Correlation Coefficient ($r$):** Measures the strength and direction of a linear relationship between two variables, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation). A value close to 0 indicates a weak relationship [1](#page=1) [2](#page=2) [3](#page=3).
* **Coefficient of Determination ($R^2$):** The square of the correlation coefficient, indicating the proportion of variance in the dependent variable explained by the independent variable [1](#page=1) [2](#page=2) [3](#page=3).
* **Outliers:** Data points that fall significantly outside the general pattern of the data. When outliers are present, the median is often a more robust measure of central tendency than the mean [1](#page=1) [2](#page=2) [3](#page=3).
* **Skewness:**
* **Left Skew:** The mean is typically less than the median [1](#page=1) [2](#page=2) [3](#page=3).
* **Right Skew:** The mean is typically more than the median [1](#page=1) [2](#page=2) [3](#page=3).
* The greater the difference between the mean and median, the more skewed the distribution and the more pronounced the effect of outliers [1](#page=1) [2](#page=2) [3](#page=3).
### 3.7 Important Equations and Formulas
* **Standard Error of the Mean (SEM):**
$$ \text{SEM} = \frac{\text{Standard deviation}}{\sqrt{n}} $$
where $n$ is the sample size [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Outlier Calculation:** Values outside the range of $Q1 - (1.5 \times \text{IQR})$ and $Q3 + (1.5 \times \text{IQR})$ are considered outliers, where IQR is the Interquartile Range ($Q3 - Q1$) [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Cohen's d (Effect Size):**
$$ \text{Cohen's } d = \frac{X_1 - X_2}{\text{Pooled SD}} $$
where $X_1$ and $X_2$ are the means of two groups, and Pooled SD is calculated as:
$$ \text{Pooled SD} = \sqrt{\frac{(\text{SD}_1^2 + \text{SD}_2^2)}{2}} $$
For more than two samples, the denominator involves the total sample size [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Probability of False Positive (Type I Error) in multiple tests:**
$$ P(\text{at least one false positive in } m \text{ tests}) = 1 - (1 - \alpha)^m $$
where $\alpha$ is the significance level of a single test [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
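A short R sketch of these formulas, using made-up numbers throughout (the sample vector, group means and SDs, and the number of tests are all hypothetical):

```r
# Standard error of the mean for a hypothetical sample x
x <- c(12, 15, 9, 22, 14, 13, 18, 11, 16, 40)
sem <- sd(x) / sqrt(length(x))

# Outlier fences from the interquartile range
q   <- quantile(x, c(0.25, 0.75))
iqr <- q[2] - q[1]                        # same as IQR(x)
fences <- c(q[1] - 1.5 * iqr, q[2] + 1.5 * iqr)
x[x < fences[1] | x > fences[2]]          # values flagged as outliers

# Cohen's d for two hypothetical groups of equal size
m1 <- 50; m2 <- 46; sd1 <- 8; sd2 <- 7
pooled_sd <- sqrt((sd1^2 + sd2^2) / 2)
d <- (m1 - m2) / pooled_sd

# Family-wise false-positive probability for m = 20 tests at alpha = 0.05
alpha <- 0.05; m <- 20
1 - (1 - alpha)^m                         # ~0.64
```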
---
# Hypothesis testing and error types
Hypothesis testing is a statistical framework used to make decisions about a population based on sample data, involving the formulation of competing hypotheses and the evaluation of evidence against them, while acknowledging the possibility of errors [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 4.1 The principles of hypothesis testing
Hypothesis testing involves setting up two mutually exclusive statements about a population: the null hypothesis and the alternative hypothesis [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 4.1.1 Null and alternative hypotheses
* **Null Hypothesis ($H_0$)**: This is the default assumption that there is no effect, no difference, or no relationship in the population. It is assumed to be true until sufficient evidence suggests otherwise. For example, $H_0$: a coin is not biased to heads [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Alternative Hypothesis ($H_A$ or $H_1$)**: This is the statement that contradicts the null hypothesis, suggesting there is an effect, difference, or relationship. It represents what the researcher is trying to find evidence for. For example, $H_A$: the coin is biased to heads [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
The process involves assuming the null hypothesis is true and then assessing the likelihood of observing the sample data if it were true [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
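As an illustrative sketch (not part of the notes), the coin example could be tested in R with an exact binomial test, assuming a hypothetical 60 heads observed in 100 tosses:

```r
# Hypothetical data: 60 heads observed in 100 tosses
# H0: the coin is fair (p = 0.5); HA: the coin is biased towards heads (p > 0.5)
binom.test(60, 100, p = 0.5, alternative = "greater")
# The reported p-value is compared with alpha (e.g. 0.05) to decide whether to reject H0
```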
#### 4.1.2 Significance level ($\alpha$)
The significance level, denoted by $\alpha$, is a pre-determined threshold for rejecting the null hypothesis. It represents the probability of making a Type I error (false positive). A common significance level is $0.05$ (or 5%) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* If the calculated p-value is less than $\alpha$, the null hypothesis is rejected [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* If the p-value is greater than or equal to $\alpha$, the null hypothesis is not rejected [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 4.1.3 Critical region
The critical region is the area of the probability distribution that leads to the rejection of the null hypothesis. The threshold value, determined by the significance level ($\alpha$), separates the critical region from the region where the null hypothesis is not rejected. For a two-tailed test with $\alpha = 0.05$ and 100 trials, the critical values might be 40 and 60, meaning that if the observed number of successes falls outside this range (e.g., less than 40 or more than 60), the null hypothesis would be rejected [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
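The 40/60 thresholds in this example can be approximated in R with the binomial quantile function; this is only a sketch, and the exact cut-offs depend on how the tail probabilities are allocated:

```r
# Approximate two-tailed critical values for 100 fair-coin tosses at alpha = 0.05
qbinom(0.025, size = 100, prob = 0.5)   # lower cut-off, around 40
qbinom(0.975, size = 100, prob = 0.5)   # upper cut-off, around 60
# Observed counts far enough outside this range fall in the critical region
```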
#### 4.1.4 P-value
The p-value is the probability of obtaining observed results (or more extreme results) assuming that the null hypothesis is true. It is compared to the significance level ($\alpha$) to decide whether to reject the null hypothesis. The p-value does not measure the effect size or confirm the truth of the hypothesis, but rather its compatibility with the null hypothesis [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 4.2 Types of errors in hypothesis testing
When conducting hypothesis tests, there are two primary types of errors that can occur, reflecting incorrect conclusions about the null hypothesis [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 4.2.1 Type I error (false positive)
A Type I error occurs when the null hypothesis ($H_0$) is rejected when it is, in fact, true. This is also known as a false positive. The probability of making a Type I error is equal to the significance level ($\alpha$). Sample size does not affect the probability of a Type I error [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Example**: Concluding that a new drug is effective (rejecting $H_0$ of no effect) when it actually has no effect [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 4.2.2 Type II error (false negative)
A Type II error occurs when the null hypothesis ($H_0$) is not rejected when it is, in fact, false. This is also known as a false negative. The probability of making a Type II error is denoted by $\beta$. An increased sample size can reduce the likelihood of a Type II error [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Example**: Concluding that a new drug is not effective (failing to reject $H_0$ of no effect) when it actually is effective [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
| Scenario | Null Hypothesis True | Null Hypothesis False |
| :-------------------------- | :------------------- | :-------------------- |
| **Reject $H_0$** | Type I Error ($\alpha$) | Correct Decision |
| **Do not reject $H_0$** | Correct Decision | Type II Error ($\beta$) |
### 4.3 Statistical power
Statistical power is the probability that a test will correctly reject a false null hypothesis, thereby avoiding a Type II error. It is calculated as $Power = 1 - \beta$ [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A power of 80% means that if a true effect exists, there is an 80% chance the study will detect it [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Higher power increases the reliability of the study and the certainty of its findings [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Low power means there is a significant risk of missing a real effect (Type II error) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 4.3.1 Factors that increase power
* **Larger sample size**: A bigger sample size generally leads to higher power [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Larger effect size**: A more pronounced true effect is easier to detect, thus increasing power [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Higher significance level ($\alpha$)**: While this increases power, it also increases the risk of a Type I error [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **One-tailed test**: When justified, a one-tailed test can increase power compared to a two-tailed test [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Lower variance**: Less variability in the data makes it easier to detect a true effect, leading to higher power [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
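A minimal sketch of a power calculation in R with `power.t.test()`; the difference, SD, power, and sample size used here are hypothetical:

```r
# Sample size needed per group to detect a hypothetical difference of 5 units
# (SD = 8) with 80% power at alpha = 0.05, two-sided two-sample t-test
power.t.test(delta = 5, sd = 8, sig.level = 0.05, power = 0.80)

# Power achieved with n = 20 per group under the same assumptions
power.t.test(n = 20, delta = 5, sd = 8, sig.level = 0.05)
```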
#### 4.3.2 Effect size
Effect size quantifies the magnitude or practical importance of a statistical difference or relationship. While a p-value indicates statistical significance, effect size tells us if the observed effect is practically meaningful in the real world [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* A study can yield a small p-value with a large sample size even if the effect size is small, indicating statistical significance but little practical importance [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Common measures of effect size include Cohen's d, correlation coefficient (r), and $R^2$ [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Factors that increase effect size include a larger true difference, lower variability, less measurement error, and well-controlled experimental designs [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
---
# Statistical tests and assumptions
This section covers fundamental statistical tests, their underlying assumptions, and how to interpret their outputs, including post-hoc analyses [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 5.1 Overview of statistical concepts
Inferential statistics are used to draw conclusions or make predictions about a population based on a sample. Hypothesis testing is a core component of inferential statistics [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Null Hypothesis ($H_0$)**: Asserts no bias or effect; everything is fair [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Alternative Hypothesis ($H_A$)**: Asserts a specific trend or bias exists; it is mutually exclusive with $H_0$ [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Significance Level ($\alpha$)**: The probability of making a Type I error (false positive), typically set at 0.05 [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **P-value**: The probability of observing results as extreme as, or more extreme than, the current data, assuming the null hypothesis is true. If $P < \alpha$, $H_0$ is rejected [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Critical Region**: The range of outcomes considered unlikely under the null hypothesis, leading to its rejection [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Effect Size**: Measures the practical importance or magnitude of a statistical difference or relationship. It indicates how meaningful a finding is in the real world [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 5.2 Common statistical tests: t-tests and ANOVA
#### 5.2.1 T-tests
T-tests are statistical tools used to evaluate whether there is a significant difference between the means of two samples. They use the mean, standard deviation (SD), and number of independent observations in each sample to assess how well the sample represents the population [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Two-sample t-test**: Compares the means of two independent groups [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **One-sample t-test**: Compares the mean of a single group to a known or hypothesized population mean [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Paired t-test**: Applied when samples are related or measured in pairs, such as before and after an intervention [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
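A short sketch of these three variants in R, using made-up running times for two hypothetical diet groups (the numbers are illustrative only):

```r
# Hypothetical running times (minutes) for two diet groups in a mouse study
normal_diet  <- c(31, 28, 35, 30, 33, 29, 32, 34)
western_diet <- c(25, 27, 24, 29, 26, 28, 23, 27)

t.test(normal_diet, western_diet)     # two-sample t-test (Welch's test by default;
                                      # add var.equal = TRUE for the classic equal-variance t-test)
t.test(normal_diet, mu = 30)          # one-sample t-test against a hypothesized mean of 30
t.test(normal_diet, western_diet,
       paired = TRUE)                 # paired t-test, if the observations were actually paired
```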
##### 5.2.1.1 Assumptions of the t-test
For t-tests to yield accurate probability calculations, several assumptions about the data must be met. Violating these assumptions can increase the likelihood of Type I (false positive) or Type II (false negative) errors [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
1. **Variable types**: The dependent variable must be continuous, and the independent variable must be binary (having only two categories). For example, testing how diet (normal vs. western, a binary independent variable) affects running time (a continuous dependent variable) in mice [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
2. **Normality**: The population from which the samples are drawn should have a normal distribution. This can be assessed using a normal quantile-quantile (Q-Q) plot, where data points forming a straight line indicate a good fit to a normal distribution [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
3. **Equal variances (homoscedasticity)**: The spread (variance) of data in the two populations being compared should be similar. This can be estimated by checking the ratio of the larger variance to the smaller variance; if this ratio is less than 4, variances are often considered equal. This is an estimate and may not be reliable for small samples [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
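These checks can be sketched in R as follows, reusing the hypothetical diet data from the t-test sketch above:

```r
# Hypothetical diet data (same illustrative values as the t-test sketch)
normal_diet  <- c(31, 28, 35, 30, 33, 29, 32, 34)
western_diet <- c(25, 27, 24, 29, 26, 28, 23, 27)

# Normality: points lying close to the reference line suggest approximate normality
qqnorm(normal_diet);  qqline(normal_diet)
qqnorm(western_diet); qqline(western_diet)

# Equal variances: ratio of the larger to the smaller sample variance, ideally below ~4
max(var(normal_diet), var(western_diet)) / min(var(normal_diet), var(western_diet))
```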
#### 5.2.2 Analysis of Variance (ANOVA)
ANOVA (Analysis of Variance) is an F-test used to compare means across three or more groups. It works by comparing the variance within different samples to the variance between samples. The outcome of ANOVA indicates whether the observed samples are likely to originate from the same population [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
##### 5.2.2.1 Performing ANOVA in R
To perform an ANOVA test in R, the following assumptions should be met [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7):
* Data should be normally distributed [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Observations should be independent within and between groups [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Groups must have equal variances (homoscedasticity) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
The process involves using the `aov()` function to run the ANOVA test and `summary()` to view the output [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
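A minimal sketch of this workflow using simulated data; the names `your_data_frame`, `group`, and `response` are placeholders, not from the notes:

```r
# Hypothetical data: a continuous response measured in three groups
set.seed(1)
your_data_frame <- data.frame(
  group    = rep(c("A", "B", "C"), each = 10),
  response = c(rnorm(10, mean = 10), rnorm(10, mean = 12), rnorm(10, mean = 15))
)

anova_model <- aov(response ~ group, data = your_data_frame)
summary(anova_model)   # gives Df, Sum Sq, Mean Sq, F value and Pr(>F) for the group effect
```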
##### 5.2.2.2 Interpreting ANOVA Output
The ANOVA output includes:
* **Sum of Squares**: The sum of squared differences between data points and the relevant mean; the total sum of squares (SST) uses the overall mean and is partitioned into between-group and within-group components [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Mean Squares**: Calculated by dividing the sum of squares by the degrees of freedom (DF) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **F-statistic**: The ratio of mean squares between groups to mean squares within groups. A high F-statistic suggests that variations between groups are statistically significant, leading to the rejection of $H_0$ [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **P-value**: Similar to t-tests, this indicates the probability of observing the results if $H_0$ were true.
* **Degrees of Freedom (DF)**:
* Between groups: $K - 1$, where $K$ is the number of groups [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* Within groups: $N - K$, where $N$ is the total number of observations across all groups [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
ANOVA results are typically reported as $F(\text{DF between}, \text{DF within}) = F\text{-value, } P = \text{p-value}$ [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 5.2.3 Post-hoc tests
When an ANOVA yields a significant result ($P < \alpha$), it indicates that at least one group mean differs from the others, but it doesn't specify which groups differ. Post-hoc tests are performed to determine these specific differences [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Tukey's Honestly Significant Difference (HSD) test**: A common post-hoc test that compares all possible pairs of group means. It provides the differences between conditions and an adjusted p-value for each comparison. If the adjusted p-value is below the significance level, the means of those two groups are considered significantly different [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
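Continuing the hypothetical ANOVA sketch in section 5.2.2.1, Tukey's HSD can be run directly on the fitted model object:

```r
# Post-hoc pairwise comparisons after a significant ANOVA (anova_model from the sketch above)
TukeyHSD(anova_model)
# The output lists the difference between each pair of group means,
# a 95% confidence interval for that difference, and an adjusted p-value ("p adj")
```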
### 5.3 Handling multiple comparisons
When performing multiple statistical tests, the probability of encountering a false positive (Type I error) increases.
* **Family-wise Error Rate (FWER)**: The probability of making at least one Type I error across a series of tests. As the number of tests ($m$) increases, the FWER also increases: $P(\text{at least one false positive}) = 1 - (1-\alpha)^m$ [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Bonferroni Correction**: A conservative method that divides the significance level ($\alpha$) by the number of tests ($m$) to control the FWER. It reduces the chance of false positives but increases the risk of false negatives (Type II errors), thus decreasing statistical power [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **False Discovery Rate (FDR)**: Controls the expected proportion of "discoveries" (rejected null hypotheses) that are actually false positives. The Benjamini-Hochberg (BH) method is commonly used for FDR control and is less conservative than Bonferroni, offering increased power, especially for a large number of tests. Adjusted p-values below 0.05 under FDR control imply that approximately 5% of these discoveries are expected to be false positives [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
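A short sketch of these corrections in R using `p.adjust()` on a hypothetical set of raw p-values:

```r
# Hypothetical raw p-values from m = 5 independent tests
p_raw <- c(0.001, 0.012, 0.030, 0.045, 0.200)

p.adjust(p_raw, method = "bonferroni")   # conservative family-wise error control
p.adjust(p_raw, method = "BH")           # Benjamini-Hochberg false discovery rate control

# Family-wise error rate if each test is run at alpha = 0.05 without correction
1 - (1 - 0.05)^length(p_raw)
```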
### 5.4 Error and bias in experimental design
Understanding and minimizing error and bias is crucial for reliable research [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Sampling Error**: Arises when a sample does not perfectly represent the population. It can be estimated and is ideally normally distributed. Techniques to control sampling error include [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7):
* **Replication**: Repeating measurements or experiments under identical conditions [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Balance**: Using groups of equal size, as unequal sizes can affect power and variance [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Blocking**: Grouping similar experimental units before random assignment to treatments to reduce variability [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Bias**: A systematic error that distorts results, stemming from study design, data collection, data analysis, or publication practices. Techniques to control bias include [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7):
* **Simultaneous Control Groups**: Using negative (no effect expected) and positive (effect expected) controls, as well as potentially a "best available therapy" control [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Blinding**: Preventing researchers and/or participants from knowing group assignments to avoid expectation bias [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Randomization**: Randomly assigning subjects to groups to minimize systematic differences [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 5.5 Interpreting statistical outputs
#### 5.5.1 Error bars
Different types of error bars convey different information:
* **Standard Deviation (SD) Error Bars**: Represent the spread of data within a sample. Overlapping SD bars suggest no conclusion about significance can be drawn [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Standard Error (SE) Error Bars**: Indicate how accurately the sample mean represents the population mean. SE bars are generally smaller than SD bars and are often used to calculate confidence intervals. If SE bars overlap, the difference is likely not significant (P > 0.05); if they do not overlap, no definite conclusion about significance can be drawn [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **95% Confidence Interval (CI) Error Bars**: Describe the range within which the true population mean is likely to fall (with 95% confidence). If the 95% CI for the difference between two means does not include zero, the difference is statistically significant at the 0.05 level. CI bars on the individual means can overlap by up to about half their length and still correspond to a significant difference, and bars that only just touch suggest a highly significant difference [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 5.5.2 T-test and ANOVA output interpretation
* **T-test Output**: The 95% CI often pertains to the difference between the means. If this interval does not contain 0, it supports a significant result, implying the true difference between the population means is unlikely to be 0 and an effect exists [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **ANOVA Output**: A significant p-value below the chosen alpha level indicates a difference between at least two group means. Post-hoc tests are then required to identify which specific groups differ [2](#page=2) [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
> **Tip:** Always check the assumptions of any statistical test before interpreting its results. Violating assumptions can lead to misleading conclusions.
---
# Experimental design and error control
Robust experimental design is crucial for obtaining reliable and valid scientific results, focusing on minimizing both sampling error and bias [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 6.1 Principles of experimental design
An experimental study involves an intervention to test a hypothesis, in contrast to an observational study which involves making analyses without interventions. In experimental studies, all variables except the one being tested are controlled [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Independent/Explanatory Variable:** The variable that is manipulated or changed by the researcher, which is hypothesized to cause an effect [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Dependent/Response Variable:** The variable that is measured and is expected to be affected by the independent variable [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Confounding Variable:** A variable that can influence both the independent and dependent variables, potentially distorting the observed relationship. Awareness of confounding variables is important for interpreting experimental results [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 6.2 Types of replicates
* **Technical replicates:** Multiple measurements taken from the exact same sample. These are used to assess the precision and reliability of the experimental technique itself [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Biological replicates:** Different samples that are biologically distinct but are subjected to the same experimental conditions. These account for natural biological variability and help ensure that observed effects are consistent across different biological entities [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 6.3 Types of controls
* **Negative control:** A condition where no effect is expected. It serves as a baseline for comparison [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Positive control:** A condition where an effect is known to occur. It is used to confirm that the experimental system is capable of detecting an effect [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
### 6.4 Error control techniques
Reducing error increases the reliability of experimental results. Errors in experiments can be categorized as sampling error and bias [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 6.4.1 Sampling error
Sampling error arises when the sample used in an experiment is not perfectly representative of the population. This error is typically normally distributed and can be estimated. Techniques to control sampling error include [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7):
* **Replication:** Increasing the number of independent subjects or measurements provides more data, leading to more accurate measurements and reducing the impact of random variation [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Balance:** Comparing groups of similar sizes is ideal. Unequal sample sizes can affect statistical power and increase variance, potentially reducing the reliability of detecting true effects [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Blocking:** Grouping experimental units with similar characteristics (e.g., age, sex, health status) and then randomly assigning treatments within each block. This helps to remove variation associated with these characteristics from the experimental error [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
#### 6.4.2 Bias
Bias is a systematic error that leads to distorted or consistently inaccurate results. It can be introduced through various factors [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7):
* **Study design:** For example, only measuring the largest cells because they are easiest to access, which may not be representative of the entire cell population [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Data collection:** Equipment that consistently reads values higher or lower than the true value [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Data analysis:** Using an analysis model that systematically underestimates or overestimates measured values [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Publication bias:** The tendency to publish findings that align with expected outcomes, leading to an overrepresentation of positive or significant results in the literature [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
Techniques to control bias include:
* **Simultaneous control groups:** Using negative, positive, and best available therapy controls that are run concurrently with the experimental groups to provide valid comparisons [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Blinding:** Preventing participants, researchers, or analysts from knowing which treatment group subjects belong to. This minimizes the influence of expectations (like the placebo effect) on the results [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
> **Tip:** The placebo effect can be influenced by the administration method, dosage, appearance, and doctor's communication, underscoring the importance of blinding in clinical research [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
* **Randomization:** Assigning subjects to experimental groups randomly. This should ideally be performed by a computer to avoid conscious or unconscious human bias. If an experiment does not specify its randomization method, it may indicate a poorly designed study [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
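A minimal sketch of computer-based randomization in R; the subject labels and group sizes are hypothetical:

```r
# Hypothetical randomization of 20 subjects into two equal groups by computer
set.seed(42)                               # only so the example is reproducible
subjects   <- paste0("subject_", 1:20)
assignment <- sample(rep(c("control", "treatment"), each = 10))  # random permutation of group labels
data.frame(subjects, assignment)
```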
#### 6.4.3 Questionable Research Practices (QRPs)
QRPs are not outright scientific misconduct but can lead to misleading conclusions and include practices like cherry-picking data, p-hacking, and HARKing (hypothesizing after results are known). These practices can arise from a lack of statistical understanding or the pressure to publish significant findings [3](#page=3) [4](#page=4) [5](#page=5) [6](#page=6) [7](#page=7).
---
## Common mistakes to avoid
- Skipping topics instead of reviewing them all thoroughly before the exam
- Overlooking key formulas and definitions
- Not practicing with the examples provided in each section
- Memorizing material without understanding the underlying concepts
## Glossary
| Term | Definition |
|---|---|
| Observational study | A study that makes observations that can be analyzed without any interventions being made. |
| Experimental study | A study where an intervention is made to test a hypothesis, and all other variables are controlled. |
| Independent variable | The variable that is changed or manipulated in an experiment to observe its effect. |
| Dependent variable | The variable that is measured or observed in response to changes in the independent variable. |
| Confounding variable | A variable that can impact the measurements of both the independent and dependent variables, potentially distorting the results. |
| Technical replicates | Multiple measurements taken from the same sample to assess the precision and reliability of an experimental technique. |
| Biological replicates | Using different samples that are biologically distinct but treated identically to account for natural biological variation. |
| Negative control | A condition in an experiment where no effect is expected, used as a baseline for comparison. |
| Positive control | A condition in an experiment where an effect is known to occur, used to compare against the experimental treatment. |
| Descriptive statistics | Statistical methods used to summarize and describe the main features of a dataset, such as mean, median, mode, and range. |
| Inferential statistics | Statistical methods used to make conclusions or predictions about a population based on a sample of data. |
| Correlation | A statistical measure that indicates the strength and direction of a linear relationship between two variables. |
| Correlation coefficient (r) | A value between -1 and +1 that quantifies the strength and direction of a linear relationship. |
| Regression analysis | A statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation. |
| R-squared ($R^2$) | A statistical measure that represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s). |
| Residuals | The difference between an observed data point and the value predicted by a statistical model. |
| Probability | The measure of the likelihood that an event will occur, expressed as a number between 0 and 1. |
| Probability distribution | A graphical representation that shows the probability of each possible outcome of a random variable. |
| Binomial distribution | A probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials, each with the same probability of success. |
| Cumulative probability | The probability of a range of outcomes occurring, from the minimum possible value up to a specified value. |
| Discrete data | Data that can only take on a finite number of values or a countable number of values, often represented by whole numbers. |
| Continuous data | Data that can take on any value within a given range, with an infinite number of possibilities between any two values. |
| Null hypothesis ($H_0$) | A statement that there is no significant difference or relationship between variables, or that an observed effect is due to chance. |
| Alternative hypothesis ($H_A$) | A statement that contradicts the null hypothesis, proposing that there is a significant difference or relationship. |
| Significance level (alpha, $\alpha$) | The probability threshold used to determine whether to reject the null hypothesis. Commonly set at 0.05. |
| P-value | The probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. |
| Critical region | The range of values in a hypothesis test that leads to the rejection of the null hypothesis. |
| Type I error (False positive) | The error of rejecting the null hypothesis when it is actually true. |
| Type II error (False negative) | The error of failing to reject the null hypothesis when it is actually false. |
| Power | The probability of correctly rejecting the null hypothesis when it is false, indicating the test's ability to detect a true effect. |
| Effect size | A measure of the magnitude of a statistical relationship or difference between groups, indicating the practical significance of the findings. |
| Standard Deviation (SD) | A measure of the amount of variation or dispersion of a set of data values from their mean. |
| Standard Error of the Mean (SEM) | A measure of the variability of sample means around the population mean, calculated as $SD / \sqrt{n}$. |
| Confidence Interval (CI) | A range of values, derived from sample statistics, that is likely to contain the value of an unknown population parameter. |
| T-test | A statistical test used to compare the means of two groups to determine if they are statistically significantly different. |
| ANOVA (Analysis of Variance) | A statistical test used to compare the means of three or more groups to determine if there is a statistically significant difference between them. |
| Post-hoc test | Additional statistical tests performed after an ANOVA to determine which specific group pairs have significantly different means. |
| Tukey's honest significance test | A common post-hoc test used after ANOVA to perform all pairwise comparisons between group means. |
| Bonferroni correction | A method used to control the family-wise error rate when performing multiple statistical tests by adjusting the significance level. |
| False Discovery Rate (FDR) | The expected proportion of false positives among the rejected null hypotheses. |
| Sampling error | The error that arises from using a sample to represent a population, due to the inherent variability between samples. |
| Bias | A systematic error that leads to distorted results, consistently pushing measurements in a particular direction. |
| Replication | Repeating an experiment multiple times to increase the reliability and precision of the results. |
| Balance (in experimental design) | Designing experiments to have equal sample sizes in each group to minimize variance effects. |
| Blocking | Grouping experimental units with similar characteristics before random assignment to treatments to reduce variability. |
| Randomization | The process of assigning subjects to experimental groups by chance to minimize bias. |
| Blinding | A technique where participants or researchers are unaware of which treatment groups subjects are assigned to, to prevent expectation bias. |
| Placebo effect | A beneficial effect produced by a placebo drug or treatment that cannot be attributed to the properties of the placebo itself, and must therefore be due to the patient's belief in that treatment. |
| Questionable Research Practices (QRPs) | Statistical or analytical methods used to produce desired results, which may not be outright fraud but can lead to misleading conclusions. |
| Fabrication | Making up data or results and recording or reporting them. |
| Falsification | Manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record. |
| Biological replicate | A repeat measurement or experiment on a different biological sample under the same conditions. |
| Standard error bars | Error bars that represent the standard error of the mean (SEM), indicating the accuracy of the sample mean as an estimate of the population mean. |
| 95% confidence interval error bars | Error bars representing the 95% confidence interval for the mean, indicating a range within which the true population mean is likely to lie. |
| Paired t-test | A statistical test used to compare the means of two related samples, such as measurements taken from the same subjects before and after an intervention. |
| One-tailed test | A hypothesis test that rejects the null hypothesis only when the test statistic is extreme in one specified direction. |
| Two-tailed test | A hypothesis test that rejects the null hypothesis if the test statistic is too large or too small in either direction. |
| Family-wise error rate (FWER) | The probability of making at least one Type I error among a series of hypothesis tests. |
| Sum of Squares (SS) | A measure of the total variation in a dataset, calculated as the sum of the squared differences between each data point and the mean. |
| Degrees of Freedom (DF) | The number of independent values that can vary in the computation of a statistic. |
| F-statistic | The statistic used in ANOVA and other F-tests, calculated as the ratio of two variances. |