Effect size

In statistics, an effect size is a quantitative measure of the magnitude of a phenomenon. It can refer to the value of a statistic calculated from a sample of data, the value of one parameter for a hypothetical population, or the equation that operationalizes how statistics or parameters lead to the effect size value. Examples of effect sizes include the correlation between two variables, the regression coefficient in a regression, the mean difference, and the risk of a particular event (such as a heart attack). Effect sizes are a complementary tool for statistical hypothesis testing, and play an important role in statistical power analyses to assess the sample size required for new experiments. Effect size calculations are fundamental to meta-analysis, which aims to provide the combined effect size based on data from multiple studies. The group of data-analysis methods concerning effect sizes is referred to as estimation statistics.

Effect size is an essential component in the evaluation of the strength of a statistical claim, and it is the first item (magnitude) in the MAGIC criteria. The standard deviation of the effect size is of critical importance, as it indicates how much uncertainty is included in the observed measurement. A standard deviation that is too large will make the measurement nearly meaningless. In meta-analysis, which aims to summarize multiple effect sizes into a single estimate, the uncertainty in studies' effect sizes is used to weight the contribution of each study, so larger studies are considered more important than smaller ones. The uncertainty in the effect size is calculated differently for each type of effect size, but generally only requires knowing the study's sample size (N), or the number of observations (n) in each group.

Reporting effect sizes or estimates thereof (effect estimate [EE], estimate of effect) is considered good practice when presenting empirical research findings in many fields. The reporting of effect sizes facilitates the interpretation of the importance of a research result, in contrast to its statistical significance. Effect sizes are particularly significant in social science and medical research, with the latter emphasizing the importance of the magnitude of the average treatment effect.

Effect sizes may be measured in relative or absolute terms. In relative effect sizes, two groups are directly compared with each other, as in odds ratios and relative risks. A larger absolute value always indicates a stronger effect for absolute effect sizes. Many types of measurements can be expressed as either absolute or relative, and these can be used together because they convey different information. A prominent task force in the psychology research community made the following recommendation:

Always present effect sizes for primary outcomes...If the units of measurement are meaningful on a practical level (e.g., number of cigarettes smoked per day), then we usually prefer an unstandardized measure (regression coefficient or mean difference) to a standardized measure (r or d).