
Test of Hypothesis, Lesson 1: Basic Concepts (Statistics and Probability)

Lesson Outline: Basic Concepts on Hypothesis Testing; Types of Hypothesis; Tests of Hypothesis; Types of Errors; Steps in Hypothesis Testing; Accepting or Rejecting the Null Hypothesis

Basic Concepts. A statistical hypothesis is a statement about the numerical value of a population parameter; it is a tentative assertion that aims to explain facts about a certain phenomenon. There are two kinds of hypothesis: the null hypothesis and the alternative hypothesis.

The null hypothesis, denoted by H0, is the statement that there is no difference between a parameter and a specific value. The alternative hypothesis, denoted by Ha, is the opposite or negation of the null hypothesis: the statement that there exists a difference between a parameter and a specific value.

Formulating Hypotheses. Example 1. Claim: The average monthly income of Filipino families who belong to the low income bracket is Php 8,000.
H0: The average monthly income of Filipino families who belong to the low income bracket is Php 8,000 (μ = 8000).
Ha: The average monthly income of Filipino families who belong to the low income bracket is not equal to Php 8,000 (μ ≠ 8000).

Example 2. Claim: The average number of hours that it takes a ten-year-old child to learn a certain task in a specific subject is less than 0.52 hour.
H0: The average number of hours that it takes a ten-year-old child to learn a certain task in a specific subject is equal to 0.52 hour (μ = 0.52).
Ha: The average number of hours that it takes a ten-year-old child to learn a certain task in a specific subject is less than 0.52 hour (μ < 0.52).

Example 3. Claim: The average weight loss for a sample of people who exercise 30 minutes per day for 6 weeks is greater than 3.7 kg.
Following the same pattern as the previous examples:
H0: The average weight loss is equal to 3.7 kg (μ = 3.7).
Ha: The average weight loss is greater than 3.7 kg (μ > 3.7).



Testing Statistical Hypotheses

Sheldon M. Ross, in Introductory Statistics (Fourth Edition), 2017

Key Terms

Statistical hypothesis: A statement about the nature of a population. It is often stated in terms of a population parameter.

Null hypothesis: A statistical hypothesis that is to be tested.

Alternative hypothesis: The alternative to the null hypothesis.

Test statistic: A function of the sample data. Depending on its value, the null hypothesis will be either rejected or not rejected.

Critical region: If the value of the test statistic falls in this region, then the null hypothesis is rejected.

Significance level: A small value set in advance of the testing. It represents the maximal probability of rejecting the null hypothesis when it is true.

Z test: A test of the null hypothesis that the mean of a normal population having a known variance is equal to a specified value.

p value: The smallest significance level at which the null hypothesis is rejected.

One-sided tests: Statistical hypothesis tests in which either the null or the alternative hypothesis is that a population parameter is less than or equal to (or greater than or equal to) some specified value.

t test: A test of the null hypothesis that the mean of a normal population having an unknown variance is equal to a specified value.
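The Z test in the key terms above can be sketched in a few lines of Python; a minimal illustration with made-up sample values, not an example from the chapter:

```python
# Two-sided Z test of H0: mu = mu0 for a normal population with known
# standard deviation sigma (illustrative data).
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p value for H0: mu = mu0, known sigma."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)       # test statistic
    # p value: probability, under H0, of a value at least this extreme
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = z_test_p_value([5.1, 4.8, 5.3, 5.0, 4.9], mu0=5.0, sigma=0.2)
# reject H0 at any significance level greater than or equal to p
```

Here the p value comes out large (about 0.82), so H0 would not be rejected at any conventional significance level.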


URL: https://www.sciencedirect.com/science/article/pii/B9780128043172000096

Statistical Methods for Physical Science

John Kitchin, in Methods in Experimental Physics, 1994

6.4.1 Statistical Hypotheses and Decision Making

A statistical hypothesis is a formal claim about a state of nature structured within the framework of a statistical model. For example, one could claim that the median time to failure from (accelerated) electromigration of the chip population described in Section 6.1.4 is at least 60 hrs, perhaps to address Question I of Table II where 60 hrs represents a reliability requirement.

Within the framework of the statistical model for the chip population failure times (again, see Section 6.1.4), the reliability claim would be stated as

(6.39)  H0: μ ≥ 4.1

since log(60) ≈ 4.1 and the log of the median of a lognormal distribution is the mean of the corresponding normal distribution.

The label H0 arises from the term null hypothesis or more generally working hypothesis. Scientifically speaking, H0:μ≥4.1 is posed and allowed to stand until it can be falsified. The statistical decision that H0 is false (and is rejected) must be based on a decision procedure that combines some function of the observable in the statistical model with the stipulations of the hypothesis—data meets theory.

More generally, for a parameter θ, a working hypothesis can be given as H0: θ ∈ Ω0, where Ω0 is a set of real numbers bounded by a θ0, yielding one of three cases: θ = θ0, θ ≤ θ0, or (like our reliability requirement example above) θ ≥ θ0.
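As a quick numerical check of the claim that log(60) ≈ 4.1, and of the lognormal-median relation it relies on, here is a short Python sketch; the simulation parameters are illustrative, not the chip data from Section 6.1.4:

```python
import math
import random

log60 = math.log(60)   # ~4.094, i.e. approximately 4.1

# Simulate a lognormal population: exp of a normal with mean mu.
# Its median should be exp(mu), since log(median) = mu.
random.seed(42)
mu, sigma = 4.1, 0.5
samples = sorted(math.exp(random.gauss(mu, sigma)) for _ in range(20_001))
sample_median = samples[10_000]
# sample_median should sit near exp(4.1), about 60.3
```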


URL: https://www.sciencedirect.com/science/article/pii/S0076695X08602562

Hypothesis Testing and Confidence Intervals

T.R. Konold, X. Fan, in International Encyclopedia of Education (Third Edition), 2010

Directional and Nondirectional Alternative Hypotheses

Hypothesis testing involves two statistical hypotheses. The first is the null hypothesis (H0) as described above. For each H0, there is an alternative hypothesis (Ha) that will be favored if the null hypothesis is found to be statistically not viable. The Ha can be either nondirectional or directional, as dictated by the research hypothesis. For example, if a researcher only believes the new instructional approach will have an impact on student test scores, but is unsure whether the effect will be positive or negative, the null and alternative hypotheses would be

H0: μ = 72

Ha: μ ≠ 72

Here, Ha reflects the researcher's uncertainty regarding the directionality, and it allows for a statistical test that considers both possibilities that the new instructional approach could increase test scores or decrease test scores. This is commonly referred to as a nondirectional alternative hypothesis, and is also referred to as a two-tailed test for reasons that are described below.

A directional alternative hypothesis, on the other hand, is useful to accommodate the researcher's prediction that, for example, the new instructional approach will decrease test scores (Ha: μ < 72) or will increase test scores (Ha: μ > 72). A directional alternative hypothesis is often referred to as a one-tailed test as described below. It is important to note, however, that for every specified H0 there will be a single Ha that may assume one of the three forms

Ha: θ ≠ K

Ha: θ < K

Ha: θ > K
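The difference between the nondirectional and directional alternatives shows up directly in how the p value is computed. A short Python sketch with an illustrative Z statistic:

```python
# p values for the three alternative forms, given the same observed
# standardized statistic z (value chosen for illustration).
from statistics import NormalDist

Phi = NormalDist().cdf
z = 1.80   # suppose the standardized test statistic came out at 1.80

p_two_sided  = 2 * (1 - Phi(abs(z)))   # Ha: mu != mu0 (nondirectional)
p_upper_tail = 1 - Phi(z)              # Ha: mu > mu0  (directional)
p_lower_tail = Phi(z)                  # Ha: mu < mu0  (directional)
# the one-tailed p in the predicted direction is half the two-tailed p
```

Note that with z = 1.80 the upper-tailed test rejects at α = 0.05 while the two-tailed test does not, which is why the choice between directional and nondirectional alternatives must be dictated by the research hypothesis, not by the data.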


URL: https://www.sciencedirect.com/science/article/pii/B9780080448947013373

Inductive Logic

Jan-Willem Romeijn, in Handbook of the History of Logic, 2011

5 Bayesian Statistics

The foregoing introduced Carnapian inductive logic. Now we can start answering the central question of this chapter. Can inductive logic, Carnapian or otherwise, accommodate statistical procedures?

The first statistical procedure under scrutiny is Bayesian statistics. The defining characteristic of this kind of statistics is that probability assignments do not just range over data, but that they can also take statistical hypotheses as arguments. As will be seen in the following, Bayesian inference is naturally represented in terms of a non-ampliative inductive logic. Moreover, it relates very naturally to Carnapian inductive logic.

Let H be the space of statistical hypotheses hθ, and let Q be the sample space as before. The functions P are probability assignments over the entire space H×Q . Since the hypotheses hθ are members of the combined algebra, the conditional functions P(st|hθ) range over the entire algebra Q. We can then define Bayesian statistics as follows.

DEFINITION 1 Bayesian Statistical Inference. Assume the prior probability P(hθ) assigned to hypotheses hθ∈H, with θ ∈ Θ, the space of parameter values. Further assume P(st|hθ), the probability assigned to the data st conditional on the hypotheses, called the likelihoods. Bayes' theorem determines that

(6)  P(hθ|st) = P(hθ) P(st|hθ) / P(st).

Bayesian statistics outputs a posterior probability assignment, P(hθ|st).

I refer to [Barnett, 1999] and [Press, 2003] for a detailed discussion. The further results from a Bayesian inference, such as estimations and measures for the accuracy of the estimations, can all be derived from the posterior distribution over the statistical hypotheses.

In this definition the probability of the data P(st) is not presupposed, because it can be computed from the prior and the likelihoods by the law of total probability,

P(st)=∫ΘP(hθ)P(st|hθ)dθ.

The result of a Bayesian statistical inference is not always a complete posterior probability. Often the interest is only in comparing the ratio of the posteriors of two hypotheses. By Bayes' theorem we have

P(hθ|st) / P(hθ′|st) = [P(hθ) P(st|hθ)] / [P(hθ′) P(st|hθ′)],

and if we assume equal priors P(hθ) = P(hθ′ ), we can use the ratio of the likelihoods of the hypotheses, the so-called Bayes factor, to compare the hypotheses.

Let me give an example of a Bayesian procedure. Say that we are interested in the colour composition of pears from Emma's farm, and that her pears are red, qi0, or green, qi1. Any ratio between these two kinds of pears is possible, so we have a set of so-called multinomial hypotheses hθ for which

(7)  Phθ(qt1 | st−1) = θ,   Phθ(qt0 | st−1) = 1 − θ

where θ is a parameter in the interval [0,1]. The hypothesis hθ fixes the portion of green pears at θ, and therefore, independently of what pears we saw before, the probability that a randomly drawn pear from Emma's farm is green is θ. The type of distribution over Q that is induced by these hypotheses is sometimes called a Bernoulli distribution, or a multinomial distribution.

We now define a Bayesian statistical inference over these hypotheses. Instead of directly choosing among the hypotheses on the basis of the data, as classical statistics advises, we assign a probability distribution over the hypotheses, expressing our epistemic uncertainty. For example, we may choose a so-called Beta distribution,

(8)  P(hθ) = Norm × θ^(λ/2−1) (1−θ)^(λ/2−1)

with θ ∈ Θ = [0,1] and Norm a normalisation factor. For λ = 2, this function is uniform over the domain. Now say that we observe a sequence of pears st = s_{k1…kt}, and that we write t1 for the number of green pears, or 1's, in the sequence st, and t0 for the number of 0's, so t0 + t1 = t. The probability of this sequence st given the hypothesis hθ is

(9)  P(st|hθ) = ∏_{i=1}^{t} Phθ(qiki | si−1) = θ^(t1) (1−θ)^(t0).

Note that the probability of the data only depends on the number of 0's and the number of 1's in the sequence. Applying Bayes' theorem then yields, omitting a normalisation constant,

(10)  P(hθ|st) = Norm′ × θ^(λ/2−1+t1) (1−θ)^(λ/2−1+t0).

This is the posterior distribution over the hypotheses. It is derived from the choice of hypotheses, the prior distribution over them, and the data by means of the axioms of probability theory, specifically by Bayes' theorem.
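The derivation above is the standard Beta–binomial conjugate update, and it can be checked in a few lines of Python (the pear counts are illustrative):

```python
# Beta(lambda/2, lambda/2) prior over theta combined with t1 green and t0
# red pears gives a Beta(lambda/2 + t1, lambda/2 + t0) posterior,
# matching the exponents in Equation (10). Illustrative counts.
lam = 2          # lambda = 2 gives the uniform prior on [0, 1]
t1, t0 = 7, 3    # observed: 7 green pears (1's), 3 red pears (0's)

a_post = lam / 2 + t1    # posterior is proportional to
b_post = lam / 2 + t0    # theta^(a_post - 1) * (1 - theta)^(b_post - 1)

posterior_mean = a_post / (a_post + b_post)   # E[theta | s_t]
```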

Most of the controversy over the Bayesian method concerns the determination and interpretation of the probability assignment over hypotheses. As will become apparent in the following, classical statistics objects to the whole idea of assigning probabilities to hypotheses. The data have a well-defined probability, because they consist of repeatable events, and so we can interpret the probabilities as frequencies, or as some other kind of objective probability. But the probability assigned to a hypothesis cannot be understood in this way, and instead expresses an epistemic state of uncertainty. One of the distinctive features of classical statistics is that it rejects such an epistemic interpretation of the probability assignment, and that it restricts itself to a straightforward interpretation of probability as relative frequency.

Even if we buy into this interpretation of probability as epistemic uncertainty, how do we determine a prior probability? At the outset we do not have any idea of which hypothesis is right, or even which hypothesis is a good candidate. So how are we supposed to assign a prior probability to the hypotheses? The literature proposes several objective criteria for filling in the priors, for instance by maximum entropy or by other versions of the principle of indifference, but something of the subjectivity of the starting point remains. The strength of classical statistical procedures is that they do not need any such subjective prior probability.


URL: https://www.sciencedirect.com/science/article/pii/B978044452936750015X

Nonparametric Hypotheses Tests

Sheldon M. Ross, in Introductory Statistics (Fourth Edition), 2017

Summary

In this chapter we learned how to test a statistical hypothesis without making any assumptions about the form of the underlying probability distributions. Such tests are called nonparametric.

Sign Test The sign test can be used to test hypotheses concerning the median of a distribution. Suppose that for a specified value m we want to test

H0: η = m

against

H1: η ≠ m

where η is the median of the population distribution. To obtain a test, choose a sample of elements of the population, discarding any data values exactly equal to m. Suppose n data values remain. The test statistic of the sign test is the number of remaining values that are less than m. If there are i such values, then the p value of the sign test is given by

p value = 2P{N ≤ i}  if i ≤ n/2
p value = 2P{N ≥ i}  if i ≥ n/2

where N is a binomial random variable with parameters n and p=1/2. The computation of the binomial probability can be done either by running Program 5-1 or by using the normal approximation to the binomial.

The sign test can also be used to test the one-sided hypothesis

H0: η ≤ m   against   H1: η > m

It uses the same test statistic as earlier, namely, the number of data values that are less than m. If the value of the test statistic is i, then the p value is given by

p value=P{N≤i}

where again N is binomial with parameters n and p=1/2.

If the one-sided hypothesis to be tested is

H0: η ≥ m   against   H1: η < m

then the p value, when there are i values less than m, is

p value=P{N≥i}

where N is binomial with parameters n and p=1/2.

As in all hypothesis testing, the null hypothesis is rejected at any significance level greater than or equal to the p value.
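The sign test described above is easy to implement exactly with binomial probabilities; a minimal Python sketch (the data are illustrative):

```python
# Exact sign test for the population median eta. Data values equal to the
# hypothesized median m are discarded, as in the summary above.
from math import comb

def sign_test_p(data, m, alternative="two-sided"):
    vals = [x for x in data if x != m]
    n = len(vals)
    i = sum(1 for x in vals if x < m)        # test statistic
    p_le = sum(comb(n, k) for k in range(0, i + 1)) / 2 ** n   # P{N <= i}
    p_ge = sum(comb(n, k) for k in range(i, n + 1)) / 2 ** n   # P{N >= i}
    if alternative == "greater":   # H0: eta <= m vs H1: eta > m
        return p_le
    if alternative == "less":      # H0: eta >= m vs H1: eta < m
        return p_ge
    return min(1.0, 2 * min(p_le, p_ge))     # two-sided

# e.g. only 2 of 10 remaining values fall below the hypothesized median:
p = sign_test_p([3, 4, 6, 7, 8, 9, 10, 11, 12, 13], m=5)
```

For large n the binomial probabilities can instead be approximated by the normal distribution, as the summary notes.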

Signed-Rank Test The signed-rank test is used to test the hypothesis that a population distribution is symmetric about the value 0. In applications, the population often consists of the differences of paired data. The signed-rank test calls for choosing a random sample from the population, discarding any data values equal to 0. It then ranks the remaining nonzero values, say there are n of them, in increasing order of their absolute values. The test statistic is equal to the sum of the rankings of the negative data values. If the value of the test statistic TS is equal to t, then the p value is

p value=2Min(P{TS≤t},P{TS≥ t})

where the probabilities are to be computed under the assumption that the null hypothesis is true. The p value can be found either by using Program 14-1 or by using the fact that TS will have approximately, when the null hypothesis is true and n is at least of moderate size, a normal distribution with mean and variance, respectively, given by

E[TS] = n(n+1)/4    Var(TS) = n(n+1)(2n+1)/24
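A minimal Python sketch of the signed-rank test using the normal approximation above (illustrative data; tied absolute values receive their average rank):

```python
# Signed-rank test: rank nonzero values by absolute value, sum the ranks
# of the negative values, then apply the normal approximation.
from statistics import NormalDist

def signed_rank(data):
    vals = [x for x in data if x != 0]
    n = len(vals)
    ordered = sorted(vals, key=abs)
    ranks = {}
    k = 0
    while k < n:                 # assign average ranks to tied |values|
        j = k
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[k]):
            j += 1
        avg = (k + 1 + j + 1) / 2
        for idx in range(k, j + 1):
            ranks[idx] = avg
        k = j + 1
    ts = sum(ranks[idx] for idx, x in enumerate(ordered) if x < 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    z = (ts - mean) / var ** 0.5
    Phi = NormalDist().cdf
    return ts, 2 * min(Phi(z), 1 - Phi(z))

ts, p = signed_rank([1.2, -0.5, 0.8, -2.0, 3.1])  # negatives get ranks 1, 4
```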

Rank-Sum Test The rank-sum test can be used to test the null hypothesis that two population distributions are identical, when the data consist of independent samples from these populations. Arbitrarily designate one of the samples as the first sample. Suppose that the size of this sample is n and that of the other sample is m. Now rank the combined samples. The test statistic TS of the rank-sum test is the sum of the ranks of the first sample. The rank-sum test calls for rejecting the null hypothesis when the value of the test statistic is either significantly large or significantly small.

When n and m are both greater than 7, the test statistic TS will, when H0 is true, have an approximately normal distribution with mean and variance given by, respectively,

E[TS] = n(n+m+1)/2    Var(TS) = nm(n+m+1)/12

This enables us to approximate the p value, which when TS =t is given by

p value ≈ 2P{Z ≤ (t + 0.5 − n(n+m+1)/2) / √(nm(n+m+1)/12)}  if t < n(n+m+1)/2
p value ≈ 2P{Z ≥ (t − 0.5 − n(n+m+1)/2) / √(nm(n+m+1)/12)}  if t > n(n+m+1)/2

For values of t near n(n+m+1)/2, the p value is close to 1, and so the null hypothesis would not be rejected (and the preceding probability need not be calculated).

For small values of n and m the exact p value can be obtained by running Program 14-2.
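The normal approximation above can be sketched as follows in Python; the samples are illustrative, and ties are assumed absent for simplicity:

```python
# Rank-sum test: TS is the sum of the ranks of the first sample in the
# combined ranking; normal approximation with continuity correction 0.5.
from statistics import NormalDist

def rank_sum(first, second):
    n, m = len(first), len(second)
    combined = sorted(first + second)
    rank = {v: r + 1 for r, v in enumerate(combined)}  # assumes no ties
    ts = sum(rank[v] for v in first)
    mean = n * (n + m + 1) / 2
    var = n * m * (n + m + 1) / 12
    Phi = NormalDist().cdf
    if ts < mean:
        p = 2 * Phi((ts + 0.5 - mean) / var ** 0.5)
    elif ts > mean:
        p = 2 * (1 - Phi((ts - 0.5 - mean) / var ** 0.5))
    else:
        p = 1.0          # TS at its mean: p value close to 1
    return ts, p

ts, p = rank_sum([1, 2, 3, 4, 10, 11, 12, 13],
                 [5, 6, 7, 8, 9, 14, 15, 16])
```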

Runs Test The runs test can be used to test the null hypothesis that a given sequence of data constitutes a random sample from some population. It supposes that each datum is either a 0 or a 1. Any consecutive sequence of either 0s or 1s is called a run. The test statistic for the runs test is R, the total number of runs. If the observed value of R is r, then the p value of the runs test is given by

p value=2Min(P{R≤r},P{R≥r})

The probabilities here are to be computed under the assumption that the null hypothesis is true.

Program 14-3 can be used to determine this p value. If Program 14-3 is not available, we can approximate the p value by making use of the fact that when the null hypothesis is true, R will have an approximately normal distribution. The mean and variance, respectively, of this distribution are

μ = 2nm/(n+m) + 1    σ² = 2nm(2nm − n − m) / [(n+m)²(n+m−1)]
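The runs test summary above translates directly into code; a minimal Python sketch with an illustrative 0–1 sequence:

```python
# Runs test: R is the total number of runs of consecutive 0s or 1s.
# Here n is the number of 0s and m the number of 1s in the sequence.
from statistics import NormalDist

def runs_test(seq):
    n = seq.count(0)
    m = seq.count(1)
    r = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)  # runs
    mu = 2 * n * m / (n + m) + 1
    var = 2 * n * m * (2 * n * m - n - m) / ((n + m) ** 2 * (n + m - 1))
    Phi = NormalDist().cdf
    z = (r - mu) / var ** 0.5
    return r, 2 * min(Phi(z), 1 - Phi(z))

r, p = runs_test([0, 0, 1, 1, 0, 1])   # 4 runs: 00 | 11 | 0 | 1
```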


URL: https://www.sciencedirect.com/science/article/pii/B978012804317200014X

Hypothesis Testing

R.H. Riffenburgh, in Statistics in Medicine (Third Edition), 2012

Why the Null Hypothesis Is Null

It may seem a bit strange at first that our primary statistical hypothesis in testing for a difference says there is no difference, even when, according to our clinical hypothesis, we believe there is one, and might even prefer to see one. The reason lies in the ability to calculate errors in decision making. When the hypothesis says that our sample is no different from known information, we have available a known probability distribution and therefore can calculate the area under the distribution associated with the erroneous decision: a difference is concluded when in truth there is no difference. This area under the probability curve provides us with the risk for a false-positive result. The alternate hypothesis, on the other hand, says just that our known distribution is not the correct distribution, not what the alternate distribution is. Without sufficient information regarding the distribution associated with the alternate hypothesis, we cannot calculate the area under the distribution associated with the erroneous decision: no difference exists when there is one, that is, the risk for a false-negative result.


URL: https://www.sciencedirect.com/science/article/pii/B9780123848642000081

Hypothesis testing

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

6.6 Chapter summary

In this chapter, we have learned various aspects of hypothesis testing. First, we dealt with hypothesis testing for one sample where we used test procedures for testing hypotheses about true mean, true variance, and true proportion. Then we discussed the comparison of two populations through their true means, true variances, and true proportions. We also introduced the Neyman–Pearson lemma and discussed likelihood ratio tests and chi-square tests for categorical data.

We now list some of the key definitions in this chapter.

Statistical hypotheses

Tests of hypotheses, tests of significance, or rules of decision

Simple hypothesis

Composite hypothesis

Type I error

Type II error

The level of significance

The p value or attained significance level

The Smith–Satterthwaite procedure

Power of the test

Most powerful test

Likelihood ratio

In this chapter, we also learned the following important concepts and procedures:

General method for hypothesis testing

Steps to calculate β

Steps to find the p value

Steps in any hypothesis-testing problem

Summary of hypothesis tests for μ

Summary of large sample hypothesis tests for p

Summary of hypothesis tests for the variance σ2

Summary of hypothesis tests for μ1 − μ2 for large samples (n1 and n2 ≥ 30)

Summary of hypothesis tests for p1 − p2 for large samples

Testing for the equality of variances

Summary of testing for a matched pairs experiment

Procedure for applying the Neyman–Pearson lemma

Procedure for the likelihood ratio test


URL: https://www.sciencedirect.com/science/article/pii/B9780128178157000063

Sample Size

Chirayath M. Suchindran, in Encyclopedia of Social Measurement, 2005

Basic Principles

Sampling techniques are used either to estimate statistical quantities with desired precision or to test statistical hypotheses. The first step in the determination of the sample size is to specify the design of the study (simple random samples of the population, stratified samples, cluster sampling, longitudinal measurement, etc.). If the goal is statistical estimation, the endpoint to be estimated and the desired precision would be specified. The desired precision can be stated in terms of standard error or a specified confidence interval. If the goal is to conduct statistical testing, the determination of sample size will involve specifying (1) the statistical test employed in testing the differences in end point, (2) the difference in the end point to be detected, (3) the anticipated level of variability in the end point (either from previous studies or from theoretical models), and (4) the desired error levels (Type I and Type II errors). The value of increased information in the sample is taken into consideration in the context of the cost of obtaining it. Guidelines are often needed for specifications of effect size and associated variability. One strategy is to take into account as much available prior information as possible. Alternatively, a sample size is selected in advance and the information (say, power or effect size) that is likely to be obtained with that sample size is examined. Large-scale surveys often aim to gather many items of information. If a desired degree of precision is prescribed for each item, calculations may lead to a number of different estimates for the sample size. These are usually compromised within the cost constraint. Sample size determinations under several sampling designs or experimental situations are presented in the following sections.
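As one concrete instance of these principles, the sample size needed to detect a mean difference δ with a two-sided Z test, given the endpoint's variability σ and the desired Type I and Type II error levels, can be sketched as follows. The formula is a standard textbook one used here for illustration, not taken from this article:

```python
# n = ((z_{1-alpha/2} + z_{1-beta}) * sigma / delta)^2, rounded up:
# the classic sample-size formula for a two-sided one-sample Z test.
from math import ceil
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, beta=0.20):
    z = NormalDist().inv_cdf
    z_a = z(1 - alpha / 2)   # controls the Type I error (two-sided)
    z_b = z(1 - beta)        # controls the Type II error (power = 1 - beta)
    return ceil(((z_a + z_b) * sigma / delta) ** 2)

n = sample_size(delta=0.5, sigma=1.0)   # detect a 0.5-SD shift at 80% power
```

Note how the four specifications listed above (test, detectable difference, variability, error levels) each enter the calculation.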


URL: https://www.sciencedirect.com/science/article/pii/B0123693985000578

Answers to Chapter Exercises, Part I

ROBERT H. RIFFENBURGH, in Statistics in Medicine (Second Edition), 2006

CHAPTER 5

5.1

It is a clinical hypothesis, stating what the investigator suspects is happening. A statistical hypothesis would be the following: Protease inhibitors do not change the rate of pulmonary admissions. By stating no difference, the theoretical probability distribution can be used in the test. (If there is a difference, the amount of difference is unknown, and thus the associated distribution is unknown.)

5.2

Ho: μw = μw/o; H1: μw ≠ μw/o.

5.3

(a) The t distribution is associated with small data sample hypotheses about means. (b) Assumptions include: The data are independent one from another. The data samples are drawn from normal populations. The standard deviations at baseline and at 5 days are equal. (c) The Type I error would be concluding that the baseline mean and the 5-day mean are different when, in fact, they are not. The Type II error would be concluding the two means are the same when, in fact, they are different. (d) The risk for a Type I error is designated α. The risk for a Type II error is designated β. The power of the test would be designated 1 − β.

5.4

(a) The χ2 distribution is associated with a hypothesis about the variance. (b) Assumptions include: The data are independent one from another. The data sample is drawn from a normal population. (c) The Type I error would be concluding that the platelet standard deviation is different from 60,000 when, in fact, it is 60,000. The Type II error would be concluding the platelet standard deviation is 60,000 when, in fact, it is not. (d) The risk for a Type I error is designated α. The risk for a Type II error is designated β.

5.5

(a) The F distribution is associated with a hypothesis about the ratio of two variances. (b) Assumptions include: The data are independent one from another. The data samples are drawn from normal populations. (c) The Type I error would be concluding the variance (or standard deviation) of serum silicon before the implant removal is different from the variance (or standard deviation) after when, in fact, they are the same. The Type II error would be concluding the before and after variances (or standard deviations) are the same when, in fact, they are different. (d) The risk for a Type I error is designated α. The risk for a Type II error is designated β.

5.6

The “above versus below” view would interpret this decrease as not significant, end of story. Exercise has no effect on the eNO of healthy subjects. The “level of p” view would say that, although the 2.15-ppb decrease has an 8% chance of being false, it also has a 92% chance of being correct. The power of the test should be evaluated. Perhaps the effect of exercise on eNO in healthy subjects should be investigated further.

5.7.

(a) A categorical variable. (b) A rating that might be any type, depending on circumstance. In this case, treating it as a ranked variable is recommended, because it avoids the weaker methods of categorical variables and a five-choice is rather small for use as a continuous variable. (c) A continuous variable.

5.8.

1, 7, 8, 3, 5, 2, 4, 6.

5.9.

(a)

Exercise-induced bronchoconstriction (EIB) frequency by sex:

EIB       Sex 1 (male)   Sex 2 (female)   Totals
0               23               9            32
1                5               1             6
Totals          28              10            38

(b)

5-minute eNO differences by sex as ranks:

5-minute difference   5-minute ranks   Sex   Ranks for males (1)   Ranks for females (2)
 6.0                       5            1           5
−1.0                       4            1           4
 7.1                       6            1           6
−3.9                       1            2                                 1
−2.0                       2.5          1           2.5
−2.0                       2.5          1           2.5

(c) 5-minute eNO differences by sex as continuous measurements:

5-minute difference   Sex   Difference for males (1)   Difference for females (2)
 6.0                   1           6.0
−1.0                   1          −1.0
 7.1                   1           7.1
−3.9                   2                                      −3.9
−2.0                   1          −2.0
−2.0                   1          −2.0

Males: m = 1.62, s = 4.54.  Females: m = −3.9, no s.

5.10.

(1) Does the drug reduce nausea score following gallbladder removal? (2 and 3) Drug/No Drug against Nausea/No Nausea. (4) H0: nausea score is independent of drug use; H1: nausea score is influenced by drug use. (5) The population of people having laparoscopic gallbladder removals who are treated for nausea with Zofran. The population of people having laparoscopic gallbladder removals who are treated for nausea with a placebo. (6) My samples of treated and untreated patients are randomly selected from patients who present for laparoscopic gallbladder removal. (7) A search of the literature did not indicate any proclivity to nausea by particular subpopulations. (8) These steps seem to be consistent. (9) A chi-square test of the contingency table is appropriate; let us use α = 0.05. (10) (Methodology for step 10 is not given in Part I; Chapter 22 provides information for the student who wishes to pursue this.)

5.11.

(1) Are our clinic's INR readings different from those of the laboratory? (2) Difference between clinic and laboratory readings. (3) Mean of difference. (4) H0: mean difference = 0; H1: mean difference ≠ 0. (5) Population: All patients, past and future, subject to the current INR evaluation methods in our Coumadin Clinic. (6) Sample: the 104 consecutive patients taken in this collection. (7) Biases: The readings in this time period might not be representative. We can examine records to search for any cause of nonrepresentativeness. Also, we could test a small sample from a different time and test it for equivalence. (8) Recycle: These steps seem consistent. (9) Statistical test and α: paired t test of mean difference against zero with α = 0.05.


URL: https://www.sciencedirect.com/science/article/pii/B9780120887705500502

Statistics as Inductive Inference

Jan-Willem Romeijn, in Philosophy of Statistics, 2011

7 Bayesian Statistics

The defining characteristic of Bayesian statistics is that probability assignments do not just range over data, but that they can also take statistical hypotheses as arguments. As will be seen in the following, Bayesian inference is naturally represented in terms of a non-ampliative inductive logic, and it also relates very naturally to Carnapian inductive logic.

Let H be the space of statistical hypotheses hθ, and let Q be the sample space as before. The functions P are probability assignments over the entire space H × Q. Since hθ is a member of the combined algebra, it makes sense to write P(st|hθ) instead of the Phθ(st) written in the context of classical statistics. We can define Bayesian statistics as follows.

DEFINITION 3 Bayesian Statistical Inference. Assume the prior probability P(hθ) assigned to hypotheses hθ ∈ H, with θ ∈ Θ, the space of parameter values. Further assume P(st|hθ), the probability assigned to the data st conditional on the hypotheses, called the likelihoods. Bayes’ theorem determines that

(9)  P(hθ|st) = P(hθ) P(st|hθ) / P(st).

Bayesian statistics outputs the posterior probability assignment, P(hθ|st). See [Barnett, 1999] and [Press, 2003] for a more detailed discussion. The further results from a Bayesian inference, such as estimations and measures for the accuracy of the estimations, can all be derived from the posterior distribution over the statistical hypotheses.

In this definition the probability of the data P(st) is not presupposed, because it can be computed from the prior and the likelihoods by the law of total probability,

P(st)=∫ΘP(hθ)P(st|hθ)dθ.

The result of a Bayesian statistical inference is not always a posterior probability. Often the interest is only in comparing the ratio of the posteriors of two hypotheses. By Bayes’ theorem we have

P(hθ|st) / P(hθ′|st) = [P(hθ) P(st|hθ)] / [P(hθ′) P(st|hθ′)],

and if we assume equal priors P(hθ) = P(hθ′), we can use the ratio of the likelihoods of the hypotheses, the so-called Bayes factor, to compare the hypotheses.

Let me give an example of a Bayesian procedure. Consider the hypotheses of Equation (3), concerning the fraction of green pears in Emma's orchard. Instead of choosing among them on the basis of the data, assign a so-called Beta-distribution over the range of hypotheses,

(10)  P(hθ) ∝ θ^(λ/2−1) (1−θ)^(λ/2−1)

with θ ∈ Θ = [0,1]. For λ = 2, this function is uniform over the domain. Now say that we obtain a certain sequence of pears, s000101. By the likelihood of the hypotheses as given in Equation (4), we can derive

P(hθ|s000101) ∝ θ^(λ/2+1) (1−θ)^(λ/2+3).

More generally, the likelihood function for the data st with numbers tk of earlier instances qik is θ^(t1) (1−θ)^(t0), so that

(11)  P(hθ|st) ∝ θ^(λ/2−1+t1) (1−θ)^(λ/2−1+t0)

is the posterior distribution over the hypotheses. This posterior is derived by the axioms of probability theory alone, specifically by Bayes’ theorem.

As said, capturing this statistical procedure in a non-ampliative inference is relatively straightforward. The premises are the prior over the hypotheses, P(hθ) for θ ∈ Θ, and the likelihood functions, P(st|hθ) over the algebras Q, which are determined for each hypothesis hθ separately. These premises are such that only a single probability assignment over the space H × Q remains. In other words, the premises have a unique probability model. Moreover, all the conclusions are straightforward consequences of this probability assignment. They can be derived from the assignment by applying theorems of probability theory, primarily Bayes’ theorem.

Before turning to the relation of Bayesian inference with Carnapian logic, let me compare it to the classical procedures sketched in the foregoing. In all cases, we consider a set of statistical hypotheses, and in all cases our choice among these is informed by the probability of the data according to the hypotheses. The difference is that in the two classical procedures, this choice is absolute: acceptance, rejection, and the appointment of a best estimate. In the Bayesian procedure, by contrast, all this is expressed in a posterior probability assignment over the set of hypotheses.

Note that this posterior over hypotheses can be used to generate the kind of choices between hypotheses that classical statistics provides. Consider Fisherian parameter estimation. We can use the posterior to derive an expectation for the parameter θ, as follows:

$$(12)\quad E[\theta] = \int_\Theta \theta\, P(h_\theta \mid s_t)\, d\theta.$$

Clearly, E[θ] is a function that brings us from the hypotheses hθ and the data st to a preferred value for the parameter. The function depends on the prior probability over the hypotheses, but it is in a sense analogous to the maximum likelihood estimator. In analogy to the confidence interval, we can also define a so-called credal interval from the posterior probability distribution:

$$\mathrm{Cred}_{1-\varepsilon} = \{\theta : |\theta - E[\theta]| < d\},$$

where the width $d$ is chosen such that the posterior probabilities of the corresponding $h_\theta$ jointly add up to $1 - \varepsilon$ of the total posterior probability.
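Both quantities can be computed numerically from the posterior. The sketch below (my own illustration, on a simple grid) evaluates the posterior of the $s_{000101}$ example with $\lambda = 2$, takes the expectation of Equation (12), and widens a symmetric interval around it until it captures $1 - \varepsilon$ of the posterior mass:

```python
import numpy as np

n = 20001
theta = np.linspace(0.0, 1.0, n)
dtheta = theta[1] - theta[0]

dens = theta**2 * (1 - theta)**4        # proportional to the posterior of (11)
dens /= dens.sum() * dtheta             # normalize numerically

mean = (theta * dens).sum() * dtheta    # Equation (12): E[theta], here 3/8

eps = 0.05
for d in np.arange(0.0, 1.0, 0.001):
    mass = dens[np.abs(theta - mean) <= d].sum() * dtheta
    if mass >= 1 - eps:
        break                           # d is now the credal half-width
print(round(float(mean), 3))  # 0.375
```

The grid resolution and the step size for $d$ are arbitrary choices; a closed-form treatment would instead use the Beta(3, 5) distribution directly.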

Most of the controversy over the Bayesian method concerns the determination and interpretation of the probability assignment over hypotheses. As for interpretation, classical statistics objects to the whole idea of assigning probabilities to hypotheses. The data have a well-defined probability, because they consist of repeatable events, and so we can interpret the probabilities as frequencies, or as some other kind of objective probability. But the probability assigned to a hypothesis cannot be understood in this way, and instead expresses an epistemic state of uncertainty. One of the distinctive features of classical statistics is that it rejects such epistemic probability assignments, and that it restricts itself to a straightforward interpretation of probability as relative frequency.

Even if we buy into this interpretation of probability as epistemic uncertainty, how do we determine a prior probability? At the outset we do not have any idea of which hypothesis is right, or even which hypothesis is a good candidate. So how are we supposed to assign a prior probability to the hypotheses? The literature proposes several objective criteria for filling in the priors, for instance by maximum entropy or by other versions of the principle of indifference, but something of the subjectivity of the starting point remains. The strength of the classical statistical procedures is that they do not need any such subjective prior probability.

URL: https://www.sciencedirect.com/science/article/pii/B9780444518620500241
