📊 Correlation Analysis
A Comprehensive Guide to Understanding Relationships Between Variables
Table of Contents
- Introduction: Understanding Correlation
- Types of Correlation
- Methods of Correlation Analysis
- Interpreting Correlation Coefficients
- Critical Assumptions for Valid Correlation
- Step-by-Step Calculation Example
- Correlation Matrix Calculator
- Frequently Asked Questions (FAQ)
- Advantages and Limitations
- Conclusion
- References
1. Introduction: Understanding Correlation
🎯 What You'll Master in This Guide
Correlation analysis is one of the most fundamental statistical techniques for examining relationships between variables. Whether you're analyzing agricultural yields, student performance, medical data, or business metrics, understanding correlation is essential for making informed decisions based on empirical evidence.
In this comprehensive guide, you'll learn:
- The conceptual foundation of correlation and its role in statistical analysis
- Different types of correlation coefficients and when to use each
- Step-by-step calculation procedures with worked examples
- Critical assumptions and potential pitfalls in correlation analysis
- Proper interpretation and academic reporting standards in APA format
What is Correlation?
Correlation quantifies the strength and direction of association between two variables. When Karl Pearson developed the product-moment correlation coefficient in the 1890s, he provided researchers with a powerful tool to measure linear relationships numerically (Field, 2018). Unlike mere observation, correlation provides an objective, standardized measure that ranges from -1 to +1.
The correlation coefficient addresses fundamental research questions such as:
- Do students who study more hours achieve higher examination scores?
- Is there a relationship between fertilizer application and crop productivity in agricultural research?
- Does employee training duration correlate with job performance ratings?
- Are temperature changes associated with plant growth rates?
Bivariate vs. Multivariate Correlation
Bivariate correlation examines the relationship between two variables, providing the foundation for understanding how variables co-vary. This is the most common form of correlation analysis in research (Tabachnick & Fidell, 2019).
Multivariate correlation involves relationships among three or more variables. Techniques like partial correlation and canonical correlation allow researchers to examine complex relationships while controlling for confounding variables (Tabachnick & Fidell, 2019).
2. Types of Correlation
1. Positive Correlation
A positive correlation exists when both variables tend to increase together. As one variable increases, the other variable also tends to increase. The strength of this relationship is indicated by how closely the correlation coefficient approaches +1.00 (Gravetter & Wallnau, 2017).
- Education: Study time and examination scores typically show positive correlation
- Agriculture: Irrigation frequency and crop yield often correlate positively (within optimal ranges)
- Health: Physical activity duration and cardiovascular fitness demonstrate positive association
- Economics: Years of education and income levels generally correlate positively
2. Negative Correlation
A negative (or inverse) correlation occurs when variables move in opposite directions. As one variable increases, the other tends to decrease. The strength is indicated by how closely the coefficient approaches -1.00 (Gravetter & Wallnau, 2017).
- Transportation: Vehicle speed and travel time show negative correlation
- Economics: Product price and consumer demand often correlate negatively
- Health: Exercise frequency and resting heart rate typically show an inverse relationship
- Agriculture: Weed density and crop yield demonstrate negative correlation
3. Zero Correlation
Zero correlation indicates no systematic linear relationship between variables. Changes in one variable do not predict changes in the other. However, it's important to note that zero correlation refers specifically to linear relationships; non-linear relationships may still exist (Field, 2018).
- Shoe size and mathematical ability
- Hair color and athletic performance
- Random identification numbers and student grades
- Day of birth and career choice
3. Methods of Correlation Analysis
Correlation can be measured graphically, using scatter plots that show the pattern of association visually, or numerically, using statistics such as Karl Pearson's coefficient (r) for linear relationships and Spearman's rank correlation (ρ) or Kendall's tau (τ) for ranked or non-parametric data. Other methods include the coefficient of concurrent deviations for quick estimation, and partial or multiple correlation for measuring relationships while controlling for other variables. Together, these methods show how strongly, and in what direction, variables are related.
A. Scatter Plots (Visual Understanding)
Scatter plots provide intuitive visual representation of correlation patterns. Each point represents one observation, with the overall pattern revealing the relationship's nature and strength (Agresti & Finlay, 2009).
[Figure: typical scatter plot patterns, from perfect positive to perfect negative correlation]
- Perfect positive: all points fall exactly on an upward-sloping line
- Strong positive: points cluster tightly around an upward trend
- Moderate positive: clear upward trend with moderate scatter
- Zero: random scatter with no discernible pattern
- Moderate negative: clear downward trend with moderate scatter
- Strong negative: points cluster tightly around a downward trend
- Perfect negative: all points fall exactly on a downward-sloping line
- Curvilinear: Pearson's r may be near zero despite a clear relationship
Pearson's correlation coefficient only measures linear relationships. When variables have curvilinear (U-shaped or inverted U-shaped) relationships, Pearson's r may be near zero despite a strong systematic relationship. Always examine scatter plots before interpreting correlation coefficients (Howell, 2013).
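As a quick check, here is a minimal Python sketch (assuming NumPy and SciPy are installed; the data are simulated purely for illustration) in which a strong U-shaped relationship produces a near-zero Pearson's r:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 100)
y = x**2 + rng.normal(0, 0.5, size=x.size)  # strong U-shaped (curvilinear) pattern

r, p = stats.pearsonr(x, y)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")  # r lands near zero despite the clear pattern
```

Plotting x against y would reveal the parabola immediately, which is exactly why the scatter plot should come before the coefficient.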
B. Correlation Methods and Formulas
1. Pearson Product-Moment Correlation (r)
The Pearson correlation coefficient, developed by Karl Pearson, is the most widely used measure of linear association between two continuous variables. It quantifies the degree to which the relationship between variables approximates a straight line (Rodgers & Nicewander, 1988).
When to Use Pearson's r:
- Continuous variables: Both variables should be measured on interval or ratio scales
- Linear relationship: The association should be approximately linear
- Bivariate normality: For significance testing, variables should follow normal distribution
- Homoscedasticity: Variance should be similar across the range of values
- No extreme outliers: Pearson's r is sensitive to extreme values
\[ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \]
- \( r \) = Pearson correlation coefficient
- \( X_i, Y_i \) = Individual data points for variables X and Y
- \( \bar{X}, \bar{Y} \) = Sample means of X and Y
- \( n \) = Number of paired observations
Interpretation: This formula calculates how much X and Y vary together (covariance) relative to their individual variations (standard deviations).
Computational (raw-score) form, convenient for hand calculation: \[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}} \]
Covariance form: \[ r = \frac{Cov(X,Y)}{s_X \cdot s_Y} \]
- \( Cov(X,Y) \) = Covariance between X and Y
- \( s_X, s_Y \) = Standard deviations of X and Y
Interpretation: Correlation is standardized covariance, making it scale-independent and bounded between -1 and +1 (Cohen et al., 2003).
Properties of Pearson's r:
- Range: -1 ≤ r ≤ +1
- r = +1 indicates perfect positive correlation
- r = -1 indicates perfect negative correlation
- r = 0 indicates no linear correlation
- Correlation is symmetric: \( r_{XY} = r_{YX} \)
- Unitless measure (independent of measurement scales)
- Independent of change of origin and scale
- Correlation does not imply causation
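To make the formulas concrete, the following sketch computes r from the definitional form and checks it against scipy.stats.pearsonr. It assumes NumPy and SciPy are available, and the data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements
x = np.array([1.2, 2.4, 3.1, 4.8, 5.0, 6.7, 7.3, 8.9])
y = np.array([2.0, 3.1, 4.5, 5.2, 6.8, 7.0, 8.4, 9.1])

# Definitional form: co-deviation of X and Y over the product of their spreads
r_manual = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2)
)

r_scipy, p = stats.pearsonr(x, y)  # also returns a two-tailed p-value
print(f"manual r = {r_manual:.4f}, scipy r = {r_scipy:.4f}, p = {p:.4f}")
```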
2. Spearman Rank Correlation (ρ or rs)
Charles Spearman introduced this non-parametric alternative to Pearson's r. It measures the strength of monotonic relationships using ranked data rather than raw scores, making it robust to outliers and suitable for ordinal data (Zar, 2010).
When to Use Spearman's ρ:
- Ordinal data: Variables measured on ordinal scales or rankings
- Non-normal distributions: When normality assumption is violated
- Presence of outliers: Ranking reduces outlier influence
- Monotonic relationships: Variables consistently increase or decrease together, but not necessarily linearly
- Small sample sizes: More robust with limited data
\[ \rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} \]
- \( \rho \) (rho) = Spearman's rank correlation coefficient
- \( d_i \) = Difference between paired ranks (\( R_X - R_Y \))
- \( n \) = Number of observations
Process: Rank each variable separately, calculate rank differences, square them, and apply the formula (Zar, 2010).
\[ \rho = 1 - \frac{6(\sum d^2 + T_X + T_Y)}{n(n^2 - 1)} \]
- \( T_X = \frac{\sum (t^3 - t)}{12} \) = Correction factor for ties in X
- \( T_Y = \frac{\sum (t^3 - t)}{12} \) = Correction factor for ties in Y
- \( t \) = Number of observations sharing the same rank
Note: When tied ranks exist, average ranks are assigned and correction factors adjust the calculation (Daniel, 1990).
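In practice the tie handling is usually delegated to software. A minimal sketch, assuming SciPy is available (scipy.stats.spearmanr assigns average ranks to tied values, so the correction above is applied internally; the ratings are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical ordinal ratings from two judges, including tied values
judge_1 = np.array([1, 2, 2, 4, 5, 6, 7, 7])
judge_2 = np.array([2, 1, 3, 4, 6, 5, 8, 7])

rho, p = stats.spearmanr(judge_1, judge_2)  # ties get average ranks automatically
print(f"Spearman rho = {rho:.3f}, p = {p:.4f}")
```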
3. Partial Correlation
Partial correlation measures the relationship between two variables while statistically controlling for the influence of one or more additional variables. This technique is essential when confounding variables may influence the observed relationship (Tabachnick & Fidell, 2019).
Suppose you're examining the relationship between study hours and exam scores. However, both might be influenced by student motivation. Partial correlation allows you to assess the study hours-exam scores relationship while removing the effect of motivation, revealing the "pure" association between study and performance.
\[ r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ} \cdot r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}} \]
- \( r_{XY \cdot Z} \) = Partial correlation between X and Y, controlling for Z
- \( r_{XY} \) = Zero-order correlation between X and Y
- \( r_{XZ} \) = Zero-order correlation between X and Z
- \( r_{YZ} \) = Zero-order correlation between Y and Z
Interpretation: This represents the correlation between X and Y after removing the linear influence of Z from both variables (Cohen et al., 2003).
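The formula translates directly into a few lines of code. A minimal sketch, with hypothetical zero-order correlations chosen only for illustration:

```python
import math

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between X and Y controlling for Z,
    computed from the three zero-order correlations (formula above)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical zero-order correlations: study hours with exam scores (r_xy),
# and motivation with each of them (r_xz, r_yz)
print(f"partial r = {partial_corr(0.60, 0.50, 0.55):.3f}")  # ≈ 0.449
```

Note how the partial correlation (≈ 0.45) is smaller than the zero-order correlation (0.60): part of the apparent study-score relationship was carried by motivation.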
4. Intraclass Correlation (ICC)
The intraclass correlation coefficient assesses agreement or consistency among measurements, commonly used in reliability studies and repeated measures designs. Unlike Pearson's r which compares two different variables, ICC evaluates consistency within groups or across raters (Shrout & Fleiss, 1979).
\[ ICC = \frac{MS_B - MS_W}{MS_B + (k-1)MS_W} \]
- \( MS_B \) = Mean square between groups
- \( MS_W \) = Mean square within groups
- \( k \) = Number of measurements per subject
Applications: Assessing inter-rater reliability, test-retest reliability, and consistency of measurements across time or conditions (Koo & Li, 2016).
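The mean-squares form above maps onto the one-way ANOVA decomposition. A minimal sketch, assuming NumPy and a subjects-by-measurements matrix of hypothetical test-retest scores:

```python
import numpy as np

def icc_oneway(ratings: np.ndarray) -> float:
    """ICC from an (n subjects x k measurements) matrix using the
    mean-squares form above: (MS_B - MS_W) / (MS_B + (k - 1) * MS_W)."""
    n, k = ratings.shape
    grand_mean = ratings.mean()
    subject_means = ratings.mean(axis=1)
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical test-retest data: 5 subjects measured twice
scores = np.array([[8, 7], [5, 6], [9, 9], [3, 4], [6, 6]], dtype=float)
print(f"ICC = {icc_oneway(scores):.3f}")  # high value -> consistent measurements
```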
4. Interpreting Correlation Coefficients
Strength Classification Guidelines
While interpretation depends on research context, Cohen (1988) provided widely-accepted guidelines for behavioral sciences. Different fields may use varying standards, but these serve as useful benchmarks.
| Coefficient Range | Strength | Interpretation | Example Scenarios |
|---|---|---|---|
| 0.90 to 1.00 (or -0.90 to -1.00) | Very Strong | Nearly perfect relationship | Height in cm vs. inches; Temperature in Celsius vs. Fahrenheit |
| 0.70 to 0.89 (or -0.70 to -0.89) | Strong | Substantial relationship | IQ scores and academic achievement; Training hours and skill proficiency |
| 0.50 to 0.69 (or -0.50 to -0.69) | Moderate | Meaningful relationship | Study time and exam scores; Exercise and weight loss |
| 0.30 to 0.49 (or -0.30 to -0.49) | Weak | Modest relationship | Age and reaction time; Rainfall and crop yield |
| 0.10 to 0.29 (or -0.10 to -0.29) | Very Weak | Minimal relationship | Height and intelligence; Birth order and creativity |
| 0.00 to 0.09 (or -0.09 to 0.00) | Negligible | No meaningful relationship | Shoe size and reading ability; Hair color and math skills |
Statistical Significance vs. Practical Significance
A correlation can be statistically significant (p < 0.05) yet practically meaningless, especially with large samples. Conversely, a practically important correlation might not reach statistical significance with small samples (Sullivan & Feinn, 2012).
Understanding p-values in Correlation
The p-value indicates the probability of obtaining the observed correlation (or stronger) if the true population correlation were zero. It does NOT indicate the strength or importance of the relationship (Wasserstein & Lazar, 2016).
- Large sample (n = 500): r = 0.15 might be statistically significant (p < 0.001) but explains only 2.25% of variance—practically trivial
- Small sample (n = 20): r = 0.40 might not reach significance (p = 0.08) but represents a moderate relationship worth investigating
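Both scenarios follow from the t statistic for a correlation, \( t = r\sqrt{(n-2)/(1-r^2)} \), with n - 2 degrees of freedom. A quick check of the two cases above, assuming SciPy is available:

```python
from scipy import stats

def corr_p_value(r: float, n: int) -> float:
    """Two-tailed p-value for Pearson's r, using t = r * sqrt((n-2)/(1-r^2))."""
    t = r * ((n - 2) / (1 - r**2)) ** 0.5
    return 2 * stats.t.sf(abs(t), df=n - 2)

print(f"r = 0.15, n = 500: p = {corr_p_value(0.15, 500):.4f}")  # significant but trivial
print(f"r = 0.40, n = 20:  p = {corr_p_value(0.40, 20):.4f}")   # moderate but n.s. (~0.08)
```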
Coefficient of Determination (r²)
The squared correlation coefficient (r²) indicates the proportion of variance in one variable that can be explained by the other variable. This provides an intuitive measure of practical significance (Hays, 1994).
\[ r^2 = \text{Proportion of explained variance} \]
- r = 0.90 → r² = 0.81 (81% of variance explained)
- r = 0.70 → r² = 0.49 (49% of variance explained)
- r = 0.50 → r² = 0.25 (25% of variance explained)
- r = 0.30 → r² = 0.09 (only 9% of variance explained)
Implication: Even moderate correlations explain relatively modest amounts of variance. This highlights the complexity of real-world phenomena where multiple factors influence outcomes (Cohen et al., 2003).
5. Critical Assumptions for Valid Correlation
Assumptions You Must Verify
Before interpreting correlation results, verify that key assumptions are met. Violating these assumptions can lead to misleading conclusions (Osborne & Waters, 2002).
- Level of Measurement:
- Pearson: Both variables must be continuous (interval or ratio scale)
- Spearman: Variables can be ordinal, interval, or ratio
- Linearity:
The relationship should be linear. Always create a scatter plot to visually inspect this assumption. Non-linear relationships may show weak Pearson correlations despite strong associations (Anscombe, 1973).
⚠️ Anscombe's Quartet: Four datasets with identical correlations (r = 0.816) but completely different patterns. This demonstrates why visual inspection is crucial—never rely on correlation coefficients alone!
- Bivariate Normality (for significance testing):
Each variable should be approximately normally distributed at each level of the other variable. While Pearson's r can be calculated regardless of distribution, significance tests and confidence intervals require normality (Bishara & Hittner, 2017).
- Homoscedasticity:
The variance of one variable should be similar across levels of the other variable. Heteroscedasticity (unequal variances) can distort the correlation coefficient and affect inference (Osborne & Waters, 2002).
- No Extreme Outliers:
Pearson's r is highly sensitive to outliers, which can dramatically inflate or deflate the correlation. Always examine scatter plots for influential points. Consider Spearman's ρ if outliers are present (Wilcox, 2017).
- Independence of Observations:
Each pair of observations should be independent. Repeated measures or clustered data violate this assumption and require specialized techniques like multilevel modeling (Field, 2018).
Common Pitfalls and Warnings
Correlation Does Not Imply Causation: This is the most important principle in correlation analysis. A strong correlation between variables X and Y could be due to:
- X causes Y
- Y causes X
- A third variable Z causes both X and Y (confounding)
- Pure coincidence (spurious correlation)
Classic Example: Ice cream sales and drowning incidents are positively correlated, but ice cream doesn't cause drowning. Both are influenced by a third variable: summer weather.
Restriction of Range: When the range of values is restricted (e.g., studying only high-achieving students), correlation coefficients can be artificially reduced. Always consider whether your sample represents the full range of the population (Goodwin & Leech, 2006).
Simpson's Paradox: A correlation that appears in aggregated data may reverse or disappear when data are separated into subgroups. Always examine whether relationships hold consistently across different groups (Kievit et al., 2013).
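A small simulation makes the paradox concrete. In this hypothetical sketch (assuming NumPy and SciPy), each subgroup shows a clearly negative correlation, yet pooling the two groups yields a positive one:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two hypothetical subgroups; within each, y decreases with x,
# but group B sits higher on both axes than group A
x_a = rng.normal(2, 1, 200)
y_a = 5 - 0.8 * x_a + rng.normal(0, 0.5, 200)
x_b = rng.normal(6, 1, 200)
y_b = 12 - 0.8 * x_b + rng.normal(0, 0.5, 200)

print(f"group A: r = {stats.pearsonr(x_a, y_a)[0]:.2f}")   # negative
print(f"group B: r = {stats.pearsonr(x_b, y_b)[0]:.2f}")   # negative
x_all, y_all = np.concatenate([x_a, x_b]), np.concatenate([y_a, y_b])
print(f"pooled:  r = {stats.pearsonr(x_all, y_all)[0]:.2f}")  # flips positive
```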
6. Step-by-Step Calculation Example
Example: Study Hours and Exam Performance
Let's calculate Pearson's correlation coefficient for the relationship between weekly study hours and exam scores for eight students.
| Student | Study Hours (X) | Exam Score (Y) |
|---|---|---|
| A | 2 | 65 |
| B | 4 | 70 |
| C | 6 | 75 |
| D | 8 | 82 |
| E | 10 | 88 |
| F | 12 | 92 |
| G | 14 | 95 |
| H | 16 | 98 |
Calculation Steps:
- Calculate necessary sums:
- n = 8 (number of pairs)
- ΣX = 2+4+6+8+10+12+14+16 = 72
- ΣY = 65+70+75+82+88+92+95+98 = 665
- ΣX² = 4+16+36+64+100+144+196+256 = 816
- ΣY² = 4225+4900+5625+6724+7744+8464+9025+9604 = 56,311
- ΣXY = 130+280+450+656+880+1104+1330+1568 = 6,398
- Apply the computational formula: \[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}} \]
- Substitute values: \[ r = \frac{8(6398) - (72)(665)}{\sqrt{[8(816) - (72)^2][8(56311) - (665)^2]}} \]
- Calculate numerator:
- 8(6398) = 51,184
- (72)(665) = 47,880
- Numerator = 51,184 - 47,880 = 3,304
- Calculate denominator:
- For X: 8(816) - (72)² = 6,528 - 5,184 = 1,344
- For Y: 8(56,311) - (665)² = 450,488 - 442,225 = 8,263
- Denominator = √(1,344 × 8,263) = √11,105,472 ≈ 3,332.49
- Final calculation: \[ r = \frac{3304}{3332.49} \approx 0.991 \]
📝 Interpretation:
r = 0.991 indicates a very strong positive correlation between study hours and exam scores. This correlation is statistically significant (p < 0.001) and explains approximately 98.2% of the variance (r² = 0.982) in exam scores based on study hours.
Academic Reporting (APA Format):
"A Pearson product-moment correlation was conducted to examine the relationship between weekly study hours and exam scores. There was a very strong positive correlation between the two variables, r(6) = 0.99, p < .001, indicating that increased study time was associated with higher exam performance. Study hours explained 98.2% of the variance in exam scores."
7. Correlation Matrix Calculator
Calculating correlations manually for multiple variables can be tedious and error-prone. Our Correlation Matrix Calculator instantly computes the Pearson correlation coefficient (r) for each variable pair — along with p-values to test significance. It’s perfect for quick, accurate, and publication-ready statistical analysis.
✨ Key Features:
- Paste your tab-separated dataset directly — no coding or pre-processing needed
- Generates a full Pearson correlation matrix with p-values and sample sizes
- Automatically identifies numeric columns from your data
- Highlights statistically significant relationships (p < 0.05)
- Option to copy the entire matrix to clipboard for reports
- Ideal for students, researchers, and data analysts
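The sketch below approximates what such a calculator does under the hood, assuming pandas and SciPy are available; the dataset is randomly generated and purely illustrative:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical dataset with three numeric columns
rng = np.random.default_rng(1)
hours = rng.uniform(1, 16, 50)
df = pd.DataFrame({
    "hours": hours,
    "score": 60 + 2.2 * hours + rng.normal(0, 5, 50),  # related to hours
    "absences": rng.integers(0, 10, 50),               # unrelated noise
})

print(df.corr(method="pearson").round(3))  # full Pearson correlation matrix

# p-value for one pair; repeat for each pair to flag significant cells
r, p = stats.pearsonr(df["hours"], df["score"])
print(f"hours vs score: r = {r:.3f}, p = {p:.4g}")
```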
8. Frequently Asked Questions (FAQ)
What does correlation actually measure?
Correlation measures how two variables are related. It tells us whether changes in one variable tend to occur alongside changes in another variable. For example, more study hours often lead to higher exam scores, demonstrating a positive correlation.
What is considered a good correlation coefficient?
Values above 0.70 (positive or negative) are generally considered strong in behavioral sciences. However, "good" depends on context. In agricultural research where many factors influence outcomes, correlations of 0.40-0.60 can be meaningful and practically important.
Can a correlation be negative?
Yes! A negative correlation means one variable increases as the other decreases. For example, more class absences typically lead to lower grades, showing negative correlation. Negative correlations are just as important and valid as positive ones.
Should I use Pearson or Spearman correlation?
Neither is universally "better"—they serve different purposes:
- Use Pearson for continuous, normally distributed data with linear relationships
- Use Spearman for ordinal data, non-normal distributions, or when outliers are present
If Pearson's assumptions are met, use Pearson as it has greater statistical power. Otherwise, Spearman is the safer choice.
Does correlation prove causation?
No. Correlation only indicates association, not causation. Just because two variables are correlated doesn't mean one causes the other. There could be a third variable influencing both, or the relationship could be coincidental. To establish causation, you need experimental designs with controlled conditions.
How large a sample do I need?
Larger samples provide more reliable estimates. As a general guideline:
- Minimum: n = 30 for basic correlation analysis
- Recommended: n = 50-100 for robust results
- For detecting small effects: n = 200+ may be needed
Remember: larger samples make even trivial correlations statistically significant, so always interpret the effect size (Field, 2018).
What should I do about outliers in my data?
Outliers can dramatically affect Pearson's r. Options include (see the sketch after this list):
- Use Spearman's ρ, which is less sensitive to outliers
- Investigate whether outliers are legitimate data points or errors
- Report correlation both with and without outliers
- Consider robust correlation methods designed for outlier-prone data
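A brief illustration of the first two options, using hypothetical data in which a single extreme pair inflates Pearson's r:

```python
import numpy as np
from scipy import stats

# Hypothetical data: a weak relationship plus one extreme pair
x = np.array([1, 2, 3, 4, 5, 6, 7, 20], dtype=float)
y = np.array([3, 1, 4, 2, 5, 3, 4, 25], dtype=float)

r_with = stats.pearsonr(x, y)[0]               # ≈ 0.96, inflated by the outlier
r_without = stats.pearsonr(x[:-1], y[:-1])[0]  # ≈ 0.46 once it is removed
rho = stats.spearmanr(x, y)[0]                 # ≈ 0.64, far less affected
print(f"Pearson with: {r_with:.2f}, without: {r_without:.2f}, Spearman: {rho:.2f}")
```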
Can I compare correlations from different studies?
Yes, but with caution. Correlation coefficients are standardized measures, making them comparable across studies. However, consider the following (a comparison sketch follows the list):
- Different sample sizes affect reliability
- Context and measurement methods matter
- Use Fisher's Z-transformation for statistical comparison of correlations
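A minimal sketch of that comparison, assuming SciPy is available; the correlations and sample sizes are hypothetical:

```python
import math
from scipy import stats

def compare_correlations(r1: float, n1: int, r2: float, n2: int) -> float:
    """Two-tailed p-value for the difference between two independent
    correlations, using Fisher's z-transformation."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher's z = 0.5*ln((1+r)/(1-r))
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    return 2 * stats.norm.sf(abs(z))

# Hypothetical: r = .72 (n = 50) in one study vs r = .45 (n = 60) in another
print(f"p = {compare_correlations(0.72, 50, 0.45, 60):.4f}")  # ≈ 0.03
```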
What is the difference between correlation and regression?
Correlation measures the strength and direction of association between two variables (symmetric relationship). Regression models one variable as a function of another, allowing prediction (asymmetric relationship). Correlation asks "how related?" while regression asks "how much does Y change when X changes?" Learn more in our Linear Regression Guide.
How do I report correlation results in APA format?
Follow this template:
"A [Pearson/Spearman] correlation was conducted to examine the relationship between [Variable X] and [Variable Y]. There was a [strength] [positive/negative] correlation between the two variables, r(df) = [value], p [< or =] [p-value]."
Example: "A Pearson correlation revealed a strong positive relationship between study hours and exam scores, r(48) = 0.72, p < .001."
9. Advantages and Limitations
✅ Advantages of Correlation Analysis
- Quick identification: Rapidly reveals relationships between variables
- Standardized measure: Coefficients are comparable across different contexts and scales
- Foundation for prediction: Supports predictive modeling and regression analysis
- Versatile applications: Applicable across diverse fields from agriculture to psychology
- Multiple methods: Pearson for continuous data, Spearman for ordinal data
- Visual representation: Scatter plots provide intuitive understanding
- Hypothesis testing: Statistical significance testing available
⚠️ Limitations and Precautions
- No causation: Correlation does not imply cause-and-effect relationships
- Sensitive to outliers: Extreme values can distort Pearson's r dramatically
- Linear relationships only: Pearson's r only measures linear associations
- Range restriction: Limited ranges artificially reduce correlation strength
- Assumes homoscedasticity: Unequal variances affect reliability
- Sample-dependent: Small samples yield unreliable estimates
- Confounding variables: Third variables may drive observed correlations
10. Conclusion
Correlation analysis is a powerful statistical method for exploring relationships between variables. By calculating correlation coefficients and properly interpreting results, you can determine whether variables move together, oppositely, or independently. This fundamental technique serves as the foundation for more advanced statistical analyses and informed decision-making in research.
Key Takeaways:
- Always visualize data with scatter plots before interpreting coefficients
- Choose appropriate methods: Pearson for continuous normal data, Spearman for ordinal or non-normal data
- Remember that correlation does not prove causation
- Consider both statistical and practical significance
- Report results in proper APA format for academic work
- Verify assumptions before drawing conclusions
Whether you're a student, educator, or researcher, mastering correlation is crucial for valid academic reporting and evidence-based conclusions. Use our Correlation Calculator Tool to streamline your analysis and focus on interpretation rather than manual calculations.
Keywords: Correlation Analysis, Pearson Correlation, Spearman Correlation, Partial Correlation, Intraclass Correlation, Statistical Analysis, Bivariate Correlation, Multivariate Correlation, Research Methods, APA Format.
11. References
This guide is based on authoritative statistical literature. All concepts have been explained in our own words while properly citing original sources.
- Agresti, A., & Finlay, B. (2009). Statistical methods for the social sciences (4th ed.). Pearson.
- American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). American Psychological Association.
- Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27(1), 17-21.
- Bishara, A. J., & Hittner, J. B. (2017). Confidence intervals for correlations when data are not normal. Behavior Research Methods, 49(1), 294-309.
- Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
- Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.
- Daniel, W. W. (1990). Applied nonparametric statistics (2nd ed.). PWS-Kent.
- Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). SAGE Publications.
- Goodwin, L. D., & Leech, N. L. (2006). Understanding correlation: Factors that affect the size of r. The Journal of Experimental Education, 74(3), 249-266.
- Gravetter, F. J., & Wallnau, L. B. (2017). Statistics for the behavioral sciences (10th ed.). Cengage Learning.
- Hays, W. L. (1994). Statistics (5th ed.). Harcourt Brace.
- Howell, D. C. (2013). Statistical methods for psychology (8th ed.). Wadsworth Cengage Learning.
- Kievit, R. A., Frankenhuis, W. E., Waldorp, L. J., & Borsboom, D. (2013). Simpson's paradox in psychological science: A practical guide. Frontiers in Psychology, 4, 513.
- Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155-163.
- Osborne, J. W., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always test. Practical Assessment, Research & Evaluation, 8(2), 1-9.
- Rodgers, J. L., & Nicewander, W. A. (1988). Thirteen ways to look at the correlation coefficient. The American Statistician, 42(1), 59-66.
- Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428.
- Sullivan, G. M., & Feinn, R. (2012). Using effect size—or why the p value is not enough. Journal of Graduate Medical Education, 4(3), 279-282.
- Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133.
- Wilcox, R. R. (2017). Introduction to robust estimation and hypothesis testing (4th ed.). Academic Press.
- Zar, J. H. (2010). Biostatistical analysis (5th ed.). Pearson.