Unit 7 - Inferential Statistics and Hypothesis Testing

Part 1: Hypothesis Testing Worksheet

This unit deepened my understanding of hypothesis testing through practical applications using Excel, focusing on both related samples t-tests and independent samples t-tests. The process involved assessing statistical significance between paired conditions (e.g., container design) and independent groups (e.g., weight loss across diets).

In the related samples test (Example 7.1), a paired t-test compared two container designs. The observed mean difference of 13.2 items sold was statistically significant (p = 0.018), indicating that the design effect is unlikely to be due to chance alone. Similarly, the independent samples test for weight loss (Example 7.2) showed a significant mean difference of 1.63 kg in favour of Diet A (p = 0.0028). Both conclusions rest on the assumptions of the tests, which underlines the importance of checking for approximately normal data and, in the independent samples case, equal variances before drawing conclusions.
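As a cross-check outside Excel, the two test statistics described above can be sketched in Python. This is a minimal illustration rather than the worksheet's own method: the function names and the sample data are hypothetical, and the independent samples version assumes equal variances (the pooled form). Only the t-statistic and degrees of freedom are computed, because the Python standard library has no t-distribution; the p-value would still come from Excel, a t-table, or a statistics library.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a related-samples t-test on paired observations."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test, assuming equal variances."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = (
        (n1 - 1) * statistics.variance(group_a)
        + (n2 - 1) * statistics.variance(group_b)
    ) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, n1 + n2 - 2


# Hypothetical data for illustration only (not the worksheet values).
design_1 = [10, 12, 9, 14]
design_2 = [12, 15, 10, 18]
print(paired_t(design_1, design_2))

diet_a = [1.0, 2.0, 3.0]
diet_b = [2.0, 4.0, 6.0]
print(pooled_t(diet_a, diet_b))
```

The sign of each statistic depends on the order of subtraction, so the direction of the alternative hypothesis needs to be fixed before the result is interpreted.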

Many of the statistical concepts and skills required in this unit, such as hypothesis formulation, working with p-values, and interpreting t-test outputs, had already been introduced in last year’s Numerical Analysis module. That earlier grounding proved invaluable here, although I was challenged to apply it more rigorously and in a broader applied context.

The exercises further strengthened my capabilities in:

  • Formulating null and alternative hypotheses appropriately
  • Using Excel’s Analysis ToolPak to conduct t-tests
  • Interpreting p-values, t-statistics, and confidence intervals in context
  • Recognising when to apply one-tailed vs two-tailed tests based on research aims

These techniques are vital for rigorous data-driven evaluation in my MSc project, particularly in validating predictive model outputs or comparing system performance under different experimental conditions. Going forward, I aim to incorporate more robust checks of distributional assumptions (e.g., normality plots) and to critically assess test selection in applied contexts.
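One way the normality plots mentioned above could be prepared is sketched below. It pairs each sorted observation with a theoretical standard-normal quantile, which gives the coordinates of a normal Q-Q plot. The function name, the data, and the plotting positions (i - 0.5) / n are assumptions made for illustration; other position formulas are also in common use.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (an assumption here).
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]


# Hypothetical measurements for illustration only.
sample = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9]
for theoretical, observed in normal_qq_points(sample):
    print(f"{theoretical:7.3f}  {observed:5.1f}")
```

If the points lie close to a straight line when plotted, the data are roughly consistent with a normal distribution; marked curvature would suggest skew or heavy tails.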

Part 2: Summary Measures Worksheet

This worksheet focused on generating and interpreting descriptive statistics using Excel, with applications across both continuous and categorical data. Through structured exercises, I calculated and compared measures of central tendency and dispersion for weight loss across two diets, as well as frequency distributions for brand preference by demographic area.

For Diet A, I found a sample mean of 5.341 kg and a standard deviation of 2.536 kg. The median (5.642 kg) lies close to the mean, and the interquartile range (3.285 kg) shows that the middle half of participants fell within a span of roughly 3.3 kg, so the weight loss was reasonably consistent. The same calculations for Diet B revealed a smaller average weight loss and higher variability, indicating that Diet B was the less effective of the two in this sample.
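The same summary measures can be reproduced outside Excel with Python's standard library. The sketch below uses a hypothetical function name and made-up data, not the worksheet values. It assumes the "inclusive" quantile method, which I understand to correspond to Excel's QUARTILE.INC; a different quartile convention would give a slightly different interquartile range.

```python
import statistics


def summarise(values):
    """Return the mean, sample SD, median and interquartile range of a sample."""
    # method="inclusive" interpolates between the sample minimum and maximum
    # (assumed here to match the quartile convention used in the spreadsheet).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


# Hypothetical weight-loss figures in kg, for illustration only.
weight_loss = [3.1, 4.8, 5.6, 6.2, 7.4, 2.9, 5.9]
for name, value in summarise(weight_loss).items():
    print(f"{name}: {value:.3f}")
```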

I also used COUNTIF and percentage formulas to analyse brand preferences across two regions. This exercise demonstrated how frequency and relative frequency distributions can make categorical data far more interpretable than raw listings.
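The COUNTIF and percentage steps above amount to building a frequency table. A rough Python equivalent is sketched here, with a hypothetical function name and made-up survey responses standing in for the worksheet data.

```python
from collections import Counter


def frequency_table(responses):
    """Map each category to (count, percentage of all responses)."""
    counts = Counter(responses)  # plays the role of one COUNTIF per category
    total = len(responses)
    return {
        category: (count, 100 * count / total)
        for category, count in counts.items()
    }


# Hypothetical brand preferences for one area, for illustration only.
area_1 = ["A", "B", "A", "Other", "A", "B"]
for brand, (count, percent) in sorted(frequency_table(area_1).items()):
    print(f"{brand:<6} {count:>3} {percent:6.1f}%")
```

Running the function once per area gives two tables whose percentages can be compared directly, even when the areas have different sample sizes.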

The Excel-based approach reinforced practical statistical skills while building on foundational knowledge from last year's Numerical Analysis module. Concepts such as sample variance, interquartile range, and proportion calculations were already familiar, but applying them across varied data types and contexts strengthened my confidence and flexibility in data interpretation.