PGA - Scoring Average Confidence Intervals

Single Mean Confidence Intervals
Single Mean Hypothesis Testing
Exploring Single Mean Confidence Intervals with Golf Data
Author
Affiliation

Jonathan Lieb

Baylor University

Published

January 25, 2025

  • This module would be suitable for an in-class lab or take-home assignment in an introductory statistics course.

  • It assumes a basic familiarity with the RStudio Environment and R programming language.

  • Students should be provided with the following data file (.csv) and Quarto document (.qmd) to produce visualizations and write up their answers to each exercise. Their final deliverable is to turn in an .html document produced by “Rendering” the .qmd.

  • The data for this module is derived largely from the ESPN website. However, the tours were manually added to the data based on the what they played on in 2023 at the time of the Masters.

Welcome video

Introduction

In this module, you will be exploring the concepts of single mean confidence intervals and single mean hypothesis tests using data from the 2023 Masters tournament. The Masters is considered to be one of the greatest and most selective tournaments in the world of golf. Only the best current players or previous winners are allowed in. The winner of the Masters gets to wear the famed “Green Jacket” and return to play any year they would like at the Masters. The course the Masters is played at, Augusta National, is one of the most beautiful and challenging courses in the world. The course is known for its fast greens and tight fairways. The Masters is the first major of the year and is played in early April. The tournament is played over four days. After 2 days, the top 50 players and ties make the cut and play the final two days.

Master’s Leaderboard

Image Source: Ryan Schreiber, CC BY 2.0, via Wikimedia Commons

View the course at Augusta National here

As mentioned before, only the best players and previous winners can compete in the tournament, With that being said, the Masters is one of the few tournaments that players from the newly created LIV Golf tour can play in, although they are mostly qualifying because of their past performances at the Masters before they joined LIV Golf. This leads to a mixture of regular PGA Tour professionals, LIV golfers, Amateurs, and Seniors playing in the Masters in 2023.

The focus for this module is confidence intervals and hypothesis testing for the true mean scores for different groups of players at Augusta National.

Learning Objectives

By the end of this lesson, you will be able to:

  • Read (import) a dataset into your RStudio Environment

  • Make single mean confidence intervals

  • Perform a hypothesis test for a single mean





NOTE: R is the name of the programming language you are using and RStudio is a IDE (Integrated Development Environment). You may be accessing RStudio through a web-based version called Posit Cloud, but R is still the language.

During this module, you’ll investigate the following research questions:

  • What are the confidence intervals for true scoring for different groups of players at the 2023 Masters Tournament?
  • Is the true mean for different groups of players at The 2023 Masters different than par?
  • Is the true mean of amateur scoring greater than par at Augusta National?
  • How do sample size and confidence level affect the width of a confidence interval?

Getting started: 2023 Masters Data

The first step to any analysis in R is to load necessary packages and data.

You can think of packages like apps on your phone; they extend the functionality and give you access to many more features beyond what comes in the “base package”.

Running the following code will load the tidyverse and BSDA packages and the masters_2023 data we will be using in this module.

TIP: As you follow along in the module, you should run each corresponding code chunk in your .qmd document. To “Run” a code chunk, you can press the green “Play” button in the top right corner of the code chunk in your .qmd. You can also place your cursor anywhere in the line(s) of code you want to run and press “command + return” (Mac) or “Ctrl + Enter” (Windows).

library(tidyverse)
library(BSDA)

masters_2023 <- read_csv("masters_2023.csv")

We can use the glimpse() function to get a quick look at our masters_2023 data. The glimpse code provides the number of observations (Rows) and the number of variables (Columns) in the dataset. The “Rows” and “Columns” are referred to as the dimensions of the dataset. It also shows us the names of the variables and the first few observations for each variable.

glimpse(masters_2023)
Rows: 277
Columns: 4
$ player <chr> "Jon Rahm", "Jon Rahm", "Jon Rahm", "Jon Rahm", "Phil Mickelson…
$ round  <dbl> 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, …
$ score  <dbl> 65, 69, 73, 69, 71, 69, 75, 65, 65, 67, 73, 75, 69, 70, 76, 66,…
$ tour   <chr> "PGA", "PGA", "PGA", "PGA", "LIV", "LIV", "LIV", "LIV", "LIV", …

Another useful function to get a quick look at the data is the head() function. This function shows the first few rows of the dataset.

head(masters_2023)
# A tibble: 6 × 4
  player         round score tour 
  <chr>          <dbl> <dbl> <chr>
1 Jon Rahm           1    65 PGA  
2 Jon Rahm           2    69 PGA  
3 Jon Rahm           3    73 PGA  
4 Jon Rahm           4    69 PGA  
5 Phil Mickelson     1    71 LIV  
6 Phil Mickelson     2    69 LIV  
Exercise 1: Data Structure
  1. How many observations (rows) are in this data?

  2. How many variables (columns) are in the data?

  3. Which variable will be used for computing a single mean confidence interval?


TIP: Type your answers to each exercise in the .qmd document.

Terms to know

Before proceeding with the analysis, let’s make sure we know some important golf terminology that will help us master this module.

Golf Terminology

  • Par in golf is the amount of strokes that a good golfer is expected to take to get the ball in the hole.
    • Each hole in golf has its own par. There are par 3 holes, par 4 holes, and par 5 holes.
    • There are 18 holes on a golf course and the pars of each of these holes sums to par for the course, also known the course par.
  • A round in golf is when a golfer plays the full set of 18 holes on the course.
    • In most professional golf tournaments, all golfers play 2 rounds, the best golfers are selected and those golfers play 2 more rounds for a total of 4 rounds.
Types of Golf Tours
  • In golf there are a few tours, better thought of as leagues, that golfers regularly compete in

    • The PGA Tour, or “Professional Golf Association Tour”, has long been considered the preeminant golfing tour, hosting most tournaments and containing the most skilled members.
    • The LIV Golf tour is a Saudi-backed alternative to the PGA Tour that began playing tournaments in 2022. LIV is the Roman numeral for 54 and is related to the fact that LIV tournaments only allow 54 players and only play 54 holes, compared to the normal PGA Tour 72 holes.
    • The PGA Tour Champions is a branch off of the PGA tour for players 50 or older. It used to be called the “Senior PGA Tour” until 2003, when it began being called the Champions tour
    • An amateur is a golfer who is not yet a professional. They are not allowed to win money in professional golf tournaments. Most amateurs are college golfers.

PGA vs. LIV

Click here to read about LIV golf’s founding and its continued impact on the PGA tour.

Variable descriptions

The masters_2023 data you’ll be analyzing in this module provides scores for each round by each golfer in the 2023 Masters. The data includes the names of golfers, the round, their scores, and their tour.

Variable Descriptions
Variable Description
player Golfer’s name
round Round of the tournament
score Score for the 18-hole course
tour The tour the player generally competes on

Single Mean Confidence Intervals

Single mean confidence intervals give a range of numbers that we can feel confident that the true population mean falls between. In the most accurate sense, a single mean confidence interval is a range of values that, if we were to take many samples and calculate the confidence interval for each sample, a certain percentage of those intervals would contain the true population mean. Because of what a single mean confidence interval is, it is innaccurate to say that there is a 95% chance (if we were constructing a 95% confidence interval) that the true mean falls within the the confidence interval. The true mean is a fixed value, so it either falls within the interval or it does not. The 95% confidence level refers to the long-run proportion of confidence intervals that will contain the true mean if we were to take many samples.This can lead to difficulties interpreting these ranges in practice though. An accurate interpretation of a single mean confidence interval will often follow the structure below:

We are (insert confidence level) confident that the true (insert variable) mean for (insert population group) is between (insert lower bound) and (insert upper bound).

For example:

We are 95% confident that the true scoring average for LIV golfers at Augusta National is between 70.1 and 73.3.

There are two different ways of calculating the confidence interval for a single mean. How we determine which of the two methods to use is based on the sample size and whether or not the population standard deviation is known.

t-interval for single means

A t-distribution is used for calculating a single mean confidence interval if the sample size is small (rule of thumb: less than 30) and the population standard deviation is unknown.

The formula for calculating a CI using this method is shown below: \[CI = \bar{X} \pm t_{\alpha/2, df} \times \frac{S}{\sqrt{n}}\]

Click here to learn more about the t-distribution and play around with t-distribution graphs.

Where \(\bar{X}\) is the sample mean,

\(t_{\alpha/2, df}\) is the critical value for the t-distribution with \(df = n-1\),

\(S\) is the sample standard deviation,

and \(n\) is the sample size.

NOTE: The t-distribution changes based on the degrees of freedom, approaching the normal distribution as the degrees of freedom increase.

The critical value for the t-distribution is determined by the confidence level and the degrees of freedom. This is calculated by finding the value of \(t_{\alpha/2, df}\) such that the area under the t-distribution curve (with that specific degrees of freedom) between \(-t_{\alpha/2, df}\) and \(t_{\alpha/2, df}\) is equal to the confidence level.

NOTE: Using R to calculate the critical value for the t-distribution saves time and adds accuracy compared to using a t-table.

R can easily calculate the critical value for the t-distribution using the qt() function.

The qt() function takes two arguments: the first is the confidence level, and the second is the degrees of freedom. The example below shows how to calculate the t-value for a 95% confidence level with 10 degrees of freedom.

Note that the 95% confidence interval has \(\alpha = .05\) and \(\alpha / 2 = .025\) so we need to use qt(.975, df= 10) or qt(.025, df = 10) to find the critical values.

qt(0.975, df = 10)
[1] 2.228139

We would like to make a 90% confidence interval for the true mean of scoring for Jon Rahm at the Masters. We can pull his scores and put them in a vector with following code. We will also set the sample size (n) to the amount of observations in the sample.

jon_rahm <- masters_2023 |> 
  filter(player == "Jon Rahm") |> 
  pull(score)

n <- length(jon_rahm)

We start by calculating the the sample mean (\(\bar{x}\)).

x_bar <- mean(jon_rahm)

Next we calculate the sample standard deviation (\(s\)).

s <- sd(jon_rahm)

Our next step is to find the critical value for a 90% confidence interval with 3 degrees of freedom (\(t_{\alpha/2, df}\)). We’ll use the qt function from R to do this. Note that we use .95 as the first argument because alpha is .1 and half of our error should be in each end.

cv <- qt(.95, 3)

Lastly we will use the confidence interval formula to find the bounds of our 90% confidence interval.

lower_confidence_boundary <- x_bar - cv * (s/n)
upper_confidence_boundary <- x_bar + cv * (s/n)
paste0("90% confidence interval: (", lower_confidence_boundary, ", ", upper_confidence_boundary, ")")
[1] "90% confidence interval: (67.078486801804, 70.921513198196)"

Finally we can interpret the confidence interval and say that we are 95% confident that the true mean of Jon Rahm’s scores at Augusta National is between 67.1 and 70.9.

Exercise 2: t-distribution confidence intervals

Run the code below to create the proper subset of the data for this exercise and calculate the sample mean and sample standard deviation.

amateurs_round1 <- masters_2023 |> 
  filter(round == 1, tour == "Amateur")
x_bar <- amateurs_round1 |> 
  summarise(mean(score)) |> 
  pull()
std_dev <- amateurs_round1 |> 
  summarize(sd(score)) |> 
  pull()
n <- nrow(amateurs_round1)
t_cv <- qt(0.95, df = n - 1)

Use the formula for creating a confidence interval using the t-distribution to calculate the upper and lower limits of a confidence interval for the true mean of amateur scoring using the amateurs in the first round as our sample. Use a 90% confidence interval (\(\alpha = .1\)).

  1. What is the lower bound of the confidence interval?

  2. What is the upper bound of the confidence interval?

  3. What is the interpretation of this confidence interval?

Repeat this process with a 99% confidence interval (\(\alpha = .01\)).

  1. What is the new lower bound of the confidence interval?

  2. What is the new upper bound of the confidence interval?

  3. Is this 99% confidence interval larger, smaller, or the same as 90% confidence interval?

  4. When thinking about your anwer to f, what do you think could explain this?

TIP: Values and objects can be stored in variables in R. For example, x <- 5 stores the value 5 in the variable x.

TIP: The pipe operator is a powerful tool in R that allows you to chain functions together. It is denoted by |> and is used to pass the output of one function to the input of another function.

TIP: The pull() function is used to extract a single column from a data frame as a vector.

TIP: The nrow() function is used to calculate the number of rows in a data frame.

TIP: When interpreting a confidence interval do not say “there is a 90% chance that the true mean is between the lower and upper bounds”. Instead, say “we are 90% confident that the true mean is between the lower and upper bounds”.

TIP: Only the t_cv variable needs to be changed to recompute with the new confidence interval. Use qt(0.995, df = n - 1) to calculate the critical value for the new 99% confidence interval.

z-interval for single means

A standard normal distribution (also known as a z-distribution) is used to calculate the confidence interval for a single mean if the sample size is large enough (greater than 30) or the population standard deviation is known. The first case is common as oftentimes samples are greater than 30. The second case is rare because it is uncommon to know the population standard deviation but not the population mean.

The formula for the confidence interval for a single mean using the z-distribution is very similar to that of the t-distribution

Click here for more information about the standard normal distribution.

\(CI = \bar{X} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\)

Where \(\bar{X}\) is once again the sample mean, \(Z_{\alpha/2}\) is the critical value for the standard normal distribution at the specified confidence level, \({\sigma}\) is the population standard deviation, and \(n\) is the sample size.

The reasoning behind why we can use the standard normal distribution when the sample is greater than 30 even if the population standard deviation is unknown is found in the Central Limit Theorem, which says that as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution. This means that when the sample size is greater than 30 we can use the sample standard deviation to estimate the population standard deviation and create a confidence interval as seen below

\(CI = \bar{X} \pm Z_{\alpha/2} \times \frac{S}{\sqrt{n}}\)

Once again the critical value for the z-distribution is the value of \(Z_{\alpha/2}\) such that the area under the standard normal distribution curve between \(-Z_{\alpha/2}\) and \(Z_{\alpha/2}\) is equal to the confidence level.

R can compute the z-value for you using the qnorm() function. The qnorm() function takes in the probability and returns the z-value that corresponds to that probability. For example, qnorm(.975) will return the z-value that corresponds to the 97.5th percentile of the standard normal distribution. No degrees of freedom are needed.

The code below calculates the critical values for the z-distribution for a 95% confidence interval.
(Note that .975 is used for a 95% confidence interval because \(\alpha = .05\) and since the confidence interval is two sided we need half of the error each side so \(\alpha / 2 = .025\), which means we want to use qnorm(.975) or qnorm(.025))

qnorm(.975)
[1] 1.959964
qnorm(.025)
[1] -1.959964

We would like to make a 98% confidence interval for the true mean of scoring for PGA Tour golfers based off of a sample from the third round. This sample can be created with the following code.

pga_round3_scores <- masters_2023 |> 
  filter(tour == "PGA", round == 3) |> 
  pull(score)

We can check the size of our sample with the following code.

pga_round3_n <- length(pga_round3_scores)
pga_round3_n
[1] 39

That’s 39 observations we have in our sample, which means we can use the z-distribution.

We start by calculating the the sample mean (\(\bar{x}\)). R can do this very easily as seen below

pga_round3_mean <- mean(pga_round3_scores)

Next we calculate the sample standard deviation (\(s\)), which is our estimate for \(\sigma\). Once again, R can do this quickly.

pga_round3_sd <- sd(pga_round3_scores)

Our next step is to find the critical value for a 98% confidence interval for a z-distribution (\(Z_{\alpha/2}\)). We can use qnorm to do this.

z_cv <- qnorm(.99)

Lastly we will use the confidence interval formula to find the bounds for our 98% confidence interval. \[ CI = \bar{X} \pm Z_{\alpha/2} \times \frac{s}{\sqrt{n}} \]

lower_confidence_boundary <- pga_round3_mean - z_cv * pga_round3_sd /
  sqrt(pga_round3_n)
upper_confidence_boundary <- pga_round3_mean + z_cv * pga_round3_sd /
  sqrt(pga_round3_n)
paste0("98% confidence interval: (", lower_confidence_boundary,
       ", ", upper_confidence_boundary, ")")
[1] "98% confidence interval: (72.1152269671053, 73.9360550841767)"

Finally we can interpret the confidence interval and say that we are 98% confident that the true mean of PGA Tour golfers’ scores at the 2023 masters is between 72.1 and 73.9.

Exercise 3: z-distribution confidence intervals

Run the code below to create the proper subset of the data for this exercise and calculate the sample mean, sample standard deviation, and critical value for the z-distribution.

pga_round1 <- masters_2023 |> 
  filter(round == 1, tour == "PGA")
x_bar <- pga_round1 |> 
  summarise(mean(score)) |> 
  pull()
sigma <- pga_round1 |> 
  summarise(sd(score)) |> 
  pull()
z_cv <- qnorm(.975)

Use the formula for creating a confidence interval using the standard normal distribution to calculate the upper and lower limits of a confidence interval for the true mean of PGA professional scoring at Augusta using the PGA pros in the first round as our sample. Use a 95% confidence interval (\(\alpha = .05\)).

  1. Why can we use the standard normal distribution to calculate this confidence interval?

  2. What is the confidence interval for the true mean of scoring for PGA professionals at Augusta National?

  3. What is the interpretation of this confidence interval?

R for Single Mean Confidence Intervals

R functions can help to speed up the process of finding these confidence intervals and will also help us with testing hypotheses later. The t.test function from the stats package makes a confidence interval for the t-distribution and the z.test function from the BSDA package does the same for the standard normal distribution.

Run the code below to view what type of arguments these functions take, what they output, and the see some example uses

?t.test
?z.test

These functions return more than just a confidence interval. If we want to see just the confidence interval $conf.int should be used. Run the example below to see how this is done.

# t-test example
seniors_round1 <- masters_2023 |> 
  filter(round == 1, tour == "Senior")

t.test(seniors_round1$score, conf.level = .95)$conf.int
[1] 72.39194 79.03663
attr(,"conf.level")
[1] 0.95
# z-test example
pga_round2 <- masters_2023 |> 
  filter(round == 2, tour == "PGA")

## Note that the Z test require the standard deviation to be passed in as an argument `sigma.x`
z.test(pga_round2$score,
       sigma.x = sd(pga_round2$score),
       conf.level = .95)$conf.int
[1] 72.24391 73.71973
attr(,"conf.level")
[1] 0.95

TIP: The $ operator is used to access a specific element of a list. In this case, the conf.int element of the list returned by the t.test and z.test functions. It can also be used to access elements of data frames and other objects in R. The line of code seniors_round1$score is used to access the score column of the seniors_round1 data frame.

TIP: Notice that the z.test function requires the standard deviation to be passed in as an argument sigma.x. Use the sd() function to calculate the standard deviation of the sample and use it as an estimate the population standard deviation.

Exercise 4: R for confidence intervals
  1. Use the t.test function to find the confidence interval for amateur scoring using the amateur_round1 sample from earlier in the lesson. What is the confidence interval?
  1. Use the z.test function to find the confidence interval for PGA professional scoring using the pga_round1 sample from earlier in the lesson. What is the confidence interval?

  2. Are these confidence intervals the same as what you calculated manually earlier? If no, why might that be?

TIP If you didn’t create the amateur_round1 sample earlier run the code below

amateur_round1 <- masters_2023 |> 
  filter(round == 1, tour == "Amateur")

TIP If you didn’t create the pga_round1 sample earlier run the code below

pga_round1 <- masters_2023 |> 
  filter(round == 1, tour == "PGA")

Hypothesis Testing with Single Mean Confidence Intervals

Hypothesis testing is a method used to determine if a claim about a population parameter is true or not. In this section, we will perform the most common type of hypothesis test, a single mean test. The null hypothesis (\(H_0\)) is that the population mean is equal to a specific value, and the alternative hypothesis (\(H_a\)) is that the population mean is not equal to that value, greater than that value, or less than that value.

Null Hypothesis: \(H_0: \mu = \mu_0\)

Alternative Hypothesis Options:

\(H_a: \mu \neq \mu_0\) or

\(H_a: \mu > \mu_0\) or

\(H_a: \mu < \mu_0\)

Test Statistics

Like confidence intervals, we have two different tests for hypothesis testing for the population mean. Remember that if the population standard deviation is unknown and the sample size is less than 30, we use the t-distribution. If the population standard deviation is known or the sample size is greater than 30, we use the standard normal distribution.

Each of these distributions have their own tests, the t-test and the z-test. This means that we have different test statistics to calculate depending on the situation.

t-test

The t-test statistic is calculated using the formula:

\[t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}}\]

where \(\bar{x}\) is the sample mean, \(\mu_0\) is the hypothesized population mean, \(s\) is the sample standard deviation, and \(n\) is the sample size.

z-test

The z-test statistic is calculated using the formula:

\[z = \frac{\bar{x} - \mu_0}{\frac{\sigma}{\sqrt{n}}}\]

where \(\bar{x}\) is the sample mean, \(\mu_0\) is the hypothesized population mean, \(\sigma\) is the population standard deviation, and \(n\) is the sample size.

Once again, if the sample size is over 30 and the population standard deviation is unknown, we use the sample standard deviation to approximate the population standard deviation.

To Reject or Fail to Reject

There are two ways to make a decision about the null hypothesis.

Method 1: Critical values, along with test statistics, can be used to determine if the hypothesized population mean is within the confidence interval for the true mean.

A critical value is a value that separates the rejection region from the non-rejection region. The rejection region is the area where the null hypothesis is rejected. The non-rejection region is the area where the null hypothesis is not rejected. The critical value is determined by the significance level (\(\alpha\)) and the degrees of freedom (if it is a t-test). The critical value is compared to the test statistic to determine if the null hypothesis should be rejected. If the test statistic is within the non-rejection region, the null hypothesis is not rejected. If the test statistic is within the rejection region, the null hypothesis is rejected and the alternative hypothesis is accepted.

Below is an example of using critical values and a test-statistic for a z-test with a 95% confidence level (two-sided). The critical value is 1.96. This means that if the test statistic is greater than 1.96 or less than -1.96, the null hypothesis is rejected. The blue represents the non-rejection region and the red the rejection region. Since the test statistic for this hypothetical example is 1.1 (less than 1.96 and greater than -1.96), we fail to reject the null hypothesis.

This method corresponds directly to the related confidence intervals produced for the sample data.

If the hypothesized population mean is within the confidence interval, we fail to reject the null hypothesis. If the hypothesized population mean is not within the confidence interval, the null hypothesis is rejected and the alternative hypothesis is accepted.

Note: We can say that there is significant evidence to accept the alternative hypothesis if the null hypothesis is rejected. However, it should never be said that we accept the null hypothesis. We can only fail to reject it.

Method 2: The second method is to use a p-value. The p-value is the probability of observing a test statistic as extreme as the one calculated from the sample data given that the null hypothesis is true. The p-value is compared to the significance level (\(\alpha\)) to determine if the null hypothesis should be rejected. If the p-value is less than \(\alpha\), the null hypothesis is rejected. If the p-value is greater than \(\alpha\), the null hypothesis is not rejected.

NOTE: Our alternative hypothesis determines whether we are looking for the probability that the test statistic is greater than or less than the observed value.

  • For a two-sided test, the p-value is the probability that the test statistic is greater than the observed value or less than the negative of the observed value. Find the area in one of the tails and double it.

  • For a left-tailed test (\(H_a: \mu < \mu_0\)), the p-value is the probability that the test statistic is less than the observed value.

  • For a right-tailed test (\(H_a: \mu > \mu_0\)), the p-value is the probability that the test statistic is greater than the observed value.

First let’s visualize what a p-value is telling us.

Now let’s see how we can calculate p-value for our example using R. We have a hypothetical z-test statistic of 1.1 and alpha of .05 for a two-tailed test. We want to find the probability of getting a test statistic less than -1.1 or greater than 1.1. In R the pnorm function calculates the probability that a test statistic is less than a given value. Since we know that the z-distribution is symmetrical we can simply multiply the probability that the test statistic is less than -1.1 by 2 to get our p-value.

p_value <- pnorm(-1.1) * 2
p_value
[1] 0.2713321

Since the p-value is greater than our alpha value of .05 we fail to reject the null hypothesis.

Hypothesis testing in R

Thankfully R can help us with this as well. The t.test function and the z.test function can both perform hypothesis tests, without having to do each step manually.

The code below tests if the true mean is not equal to 50 given that the example_vector is our sample. The confidence level is set to 95%.

example_vector <- seq(10, 100, by = 10)
t.test(example_vector, mu = 50, alternative = "two.sided", conf.level = .95)

    One Sample t-test

data:  example_vector
t = 0.52223, df = 9, p-value = 0.6141
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
 33.34149 76.65851
sample estimates:
mean of x 
       55 

As can be seen in the output of the code above, the p-value is 0.61, which is much greater than the significance level of 0.05. The test-statistic is 0.522 which is within the non-rejection region since critical values for this test would be -2.26 and 2.26. Therefore, we fail to reject the null hypothesis.

Hypothesizing Par as the Population Mean

Augusta National is breathtakingly beautiful, but if golfers get distracted by the scenic views, tall pines, bunkers, water, and azaleas may catch their balls.

Augusta Hole 13

Image Source: Your Golf Travel, CC 4.0

In golf par is considered to be the number of strokes a good golfer is expected to take. The par for the course at Augusta National is 72. It is known that Augusta National is a tougher than usual course but we would like to test if that is the case for different groups.

Our null hypothesis will generally be that the mean of the group is equal to 72.

Exercise 5: t-test for single mean hypothesis testing

Amateurs, who are not yet professional golfers, are generally expected to score higher than professionals. We would like to test if the mean of amateur scoring is above par at Augusta National

  1. What is the null hypothesis for this test?

  2. What is the alternative hypothesis for this test?

  3. Use the t.test function to test if the mean of amateur scoring is greater than 72 using the amateur_round1 sample from earlier in the lesson. What is the p-value?

  4. Based on the p-value, is there statistically significant evidence that the mean of amateur scoring is greater than 72?

Amateurs generally struggle in the Masters, but in 2023 Sam Bennett, a Texas A&M student, made the cut and finished 16th. However, due to his amateur status, he was not eligible to win money and missed out on $261,000.

Exercise 6: z-test for single mean hypothesis testing

PGA professionals would generally average somewhere around par. We would like to test if the mean of PGA professional scoring is not equal to 72 at Augusta National.

  1. What is the null hypothesis for this test?

  2. What is the alternative hypothesis for this test?

  1. Use the z.test function to test if the mean of PGA professional scoring is not equal to 72 using the pga_round1 sample from earlier in the lesson. What is the p-value?

  2. Based on the confidence interval, is there statistically significant evidence that the mean of PGA professional scoring is not equal to 72?

  3. Explain your answer to part d.

TIP Use the $score column from the pga_round1 sample as the first argument in the z.test function

TIP Remember that the z.test function requires the population standard deviation as the second argument. Use the sd function to calculate the standard deviation of the pga_round1 sample.

sd(pga_round1$score)

More practice

If you would like more practice with confidence intervals and hypothesis testing, try the following exercises.

Challenge 1: Critical Values Knowledge Check
liv_round3 <- masters_2023 |> 
  filter(round == 3, tour == "LIV")
  1. What is the degrees of freedom for a single mean hypothesis test for the above sample of Augusta National scores?

  2. Which of the following is the correct formula for the test statistic for a single mean hypothesis test for the above sample?

  1. \(t = \frac{\bar{X} - \mu_0}{\frac{S}{\sqrt{n}}}\)
  2. \(z = \frac{\bar{X} - \mu_0}{\frac{\sigma}{\sqrt{n}}}\)
  1. Assume that the alternate hypothesis is that the true mean of scoring for LIV golfers at Augusta is greater than 72. Use the correct formula from part b to calculate the test statistic for the above sample manually. What is the value of the test statistic?

  2. Assuming \(\alpha = 0.05\), what is the critical value for the above sample?

  3. Based on the test statistic and critical value, what is your conclusion?

Fun fact: In the 2023 Masters Tournament, the average score for LIV golfers was just slightly worse (72.91) than the average score for PGA golfers (72.55). However, the average score for amateurs (74.56) and seniors (76.31) was significantly higher.

Challenge 2: P-Value Knowledge check

Answer the following questions based on the sample of Augusta National scores below.

pro_round4 <- masters_2023 |> 
  filter(round == 4, tour == "PGA")
  1. Use the proper function to test if the mean of the sample is equal to 72. What is the p-value?

  2. Based on the p-value, what would be your conclusion if \(\alpha = 0.01\)?

  3. Based on the p-value, what would be your conclusion if \(\alpha = 0.05\)?

  4. Based on the p-value, what would be your conclusion if \(\alpha = 0.10\)?

  5. Based on the p-value, what would be your conclusion if \(\alpha = 0.20\)?

  6. What is wrong with doing multiple hypothesis tests with different \(\alpha\) values that you choose after the fact?

TIP: Check the number of observations in the pro_round4 sample using the nrow function, by looking in your global environment, or by using the glimpse function in order to determine the correct function to use.

Challenge 3: Confidence Intervals Knowledge Check

Answer the following questions based on the 2 samples of Augusta National scores below.

## Sample 1
amateur_round2 <- masters_2023 |> 
  filter(round == 2, tour == "Amateur")

## Sample 2
pga_rounds2_4 <- masters_2023 |> 
  filter(round %in% c(1, 2), tour == "PGA")
  1. How many observations are in sample 1?

  2. How many observations are in sample 2?

  3. Which of the above samples (sample 1 or sample 2) would you expect to have a wider confidence interval for the mean? Why?

  4. How does the sample size affect the width of the confidence interval?

  5. Explain why you answered the way you did in part b.

  6. Using sample 2, if confidence intervals were made for confidence levels of 90%, 95%, and 99%, which confidence interval would be the widest?

  7. What is the relationship between the confidence level and the width of the confidence interval?

Challenge 4: z vs t (large sample size)

You have the following sample of Augusta National scores below. Answer the following questions based on the sample.

pga_rounds1_2 <- masters_2023 |> 
  filter(round %in% c(1, 2), tour == "PGA")
  1. Create a 95% confidence interval for the mean of the sample using the z.test function. What is the confidence interval?

  2. Create a 95% confidence interval for the mean of the sample using the t.test function. What is the confidence interval?

  3. Which of the above confidence intervals is wider?

  4. Based on your answer to the above question, would you say a t-distribution has heavier or lighter tailed than a z-distribution?

  5. Is the difference in the confidence intervals in this problem likely to change a result in a hypothesis test? Why or why not?

Think on it: The average score for the first 2 rounds of the 2023 Masters Tournament was 72.8, while the average score for the last 2 rounds was 73.22. This is interesting because only the top players make it to the last 2 rounds, so you would expect the scores to be lower. What could have changed so that the best players are now scoring worse?

Conclusion

In this module you have learned about single mean confidence intervals and single mean hypothesis tests. You have learned when to use a z-distribution and when to use a t-distribution. How to interpret p-values and confidence intervals was also covered.

With the Masters scoring data, you were able to calculate confidence intervals and test hypotheses about the mean score of different groups of golfers. We saw that in order to form confidence intervals for groups such as amateurs and seniors, we needed to use the t-distribution due to the small sample sizes. With larger sample sizes, such as the PGA golfers, we were able to use the z-distribution. These confidence intervals gave us a range of plausible values for the true mean score of each group. We also tested hypotheses about the mean scoring averages for PGA, LIV, and amateur golfers. We were able to make conclusions about the mean score of each group (whether or not that group’s mean was greater than or not equal to par) based on the p-values of the hypothesis tests.