## Better approaches to making statistical decisions

To establish statistical significance, the p-value criterion is almost universally used: reject the null hypothesis (H0) in favour of the alternative (H1) when the p-value is less than the level of significance (*α*). Conventional values for this decision threshold are 0.10, 0.05, and 0.01.

By definition, the p-value measures how compatible the sample information is with H0: loosely, P(D|H0), the probability of observing data (D) at least as extreme as the sample under H0. However, as made clear in the statement of the American Statistical Association (Wasserstein and Lazar, 2016), the p-value criterion as a decision rule has a number of serious deficiencies. The main deficiencies are that

- the p-value is a decreasing function of sample size;
- the criterion completely ignores P(D|H1), the compatibility of data with H1; and
- the conventional values of *α* (such as 0.05) are arbitrary, with little scientific justification.

One consequence is that the p-value criterion frequently rejects H0 even when H0 is violated only by a practically negligible margin, especially when the sample size is large. This occurs because, while the p-value is a decreasing function of sample size, its threshold (*α*) is fixed and does not decrease with sample size. On this point, Wasserstein and Lazar (2016) strongly recommend that the p-value be supplemented or even replaced with other alternatives.
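This behaviour is easy to see by simulation. The post's demonstrations use R; the following is a quick standard-library Python sketch of the same point, with a made-up data-generating process: a two-sided large-sample z-test of H0: mean = 0 when the true mean is 0.01, a practically negligible departure from H0.

```python
import math
import random

# Illustrative sketch (assumed setup): the true mean is 0.01, so H0: mean = 0
# is violated by a practically negligible margin. As n grows, the p-value
# shrinks and the fixed 0.05 threshold eventually rejects H0.
random.seed(0)
for n in (100, 10_000, 1_000_000):
    x = [random.gauss(0.01, 1.0) for _ in range(n)]
    mean = sum(x) / n
    # standard error of the sample mean
    se = (sum((xi - mean) ** 2 for xi in x) / (n - 1)) ** 0.5 / n ** 0.5
    z = mean / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    print(f"n = {n:>9,}: p-value = {p:.4f}")
```

With a sample in the millions, the p-value is essentially zero and H0 is rejected at any conventional *α*, even though the true departure from H0 is negligible in practical terms.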

In this post, I introduce a range of simple, but more sensible, alternatives to the p-value criterion which can overcome the above-mentioned deficiencies. They can be classified into three categories:

- Balancing P(D|H0) and P(D|H1) (Bayesian method);
- Adjusting the level of significance (*α*); and
- Adjusting the p-value.

These alternatives are simple to compute and can provide more sensible inferential outcomes than those based solely on the p-value criterion, as will be demonstrated in an application with R code.

Consider a linear regression model

Y = β0 + β1 X1 + … + βk Xk + u,

where Y is the dependent variable, X’s are independent variables, and u is a random error term following a normal distribution with zero mean and fixed variance. We consider testing for

H0: β1 = … = βq = 0,

against the alternative H1 that H0 does not hold, where q ≤ k. A simple example is H0: β1 = 0 against H1: β1 ≠ 0, where q = 1.
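To make the testing problem concrete, here is a standard-library Python sketch (the post itself uses R) of the q = 1 case: simulated data from a simple regression, the OLS slope estimate, and the t-statistic and p-value for H0: β1 = 0. The data-generating values (β0 = 1, β1 = 0.5, n = 200) are made up for illustration, and the p-value uses a large-sample normal approximation.

```python
import math
import random

# Assumed data: Y = b0 + b1*X + u with one regressor (q = 1),
# testing H0: b1 = 0 against H1: b1 != 0.
random.seed(1)
n = 200
b0_true, b1_true = 1.0, 0.5
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [b0_true + b1_true * xi + random.gauss(0.0, 1.0) for xi in x]

# OLS estimates of the slope and intercept
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1_hat = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0_hat = my - b1_hat * mx

# t-statistic for H0: b1 = 0, with a normal approximation for the p-value
resid = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]
s2 = sum(e ** 2 for e in resid) / (n - 2)   # residual variance estimate
se_b1 = math.sqrt(s2 / sxx)
t_stat = b1_hat / se_b1
p = math.erfc(abs(t_stat) / math.sqrt(2))
print(f"b1_hat = {b1_hat:.3f}, t = {t_stat:.2f}, p-value = {p:.4g}")
```

Because the true slope here is well away from zero, the p-value is far below any conventional *α* and H0 is rejected, as expected.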

Borrowing from Bayesian statistical inference, we define the following probabilities:

**P(H0|D)**: posterior probability of H0, which is the probability or likelihood of H0 after the researcher observes the data D;

**P(H1|D) ≡ 1 − P(H0|D)**: posterior probability of H1;

**P(D|H0)**: (marginal) likelihood of the data under H0;

**P(D|H1)**: (marginal) likelihood of the data under H1;

**P(H0)**: prior probability of H0, representing the researcher's belief about H0 before she observes the data;

**P(H1) = 1 − P(H0)**: prior probability of H1.
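A minimal numerical sketch of how these quantities combine via Bayes' rule, with made-up likelihood values (P(D|H0) = 0.10, P(D|H1) = 0.40) and an indifferent prior of 0.5 on each hypothesis:

```python
# Sketch with made-up numbers: combining the prior probabilities with the
# marginal likelihoods via Bayes' rule to obtain the posterior P(H0|D).
p_h0 = 0.5               # prior P(H0); 0.5 represents no prior preference
p_h1 = 1.0 - p_h0        # prior P(H1)
lik_h0 = 0.10            # assumed marginal likelihood P(D|H0)
lik_h1 = 0.40            # assumed marginal likelihood P(D|H1)

post_h0 = lik_h0 * p_h0 / (lik_h0 * p_h0 + lik_h1 * p_h1)
post_h1 = 1.0 - post_h0
print(f"P(H0|D) = {post_h0:.2f}, P(H1|D) = {post_h1:.2f}")
# prints "P(H0|D) = 0.20, P(H1|D) = 0.80"
```

Here the data are four times as likely under H1 as under H0, so the posterior shifts from an even prior split to 0.80 in favour of H1.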

These probabilities are related by Bayes' rule as