COPYRIGHT 2004 American Statistical Association
This article considers the problem of making simultaneous probability statements in multivariate inferential problems based on samples from a posterior distribution. The calculation of simultaneous credible bands is reviewed and, as an alternative, contour probabilities are proposed. These are defined as 1 minus the content of the highest posterior density region which just covers a certain point of interest. We discuss a Monte Carlo method to estimate contour probabilities and distinguish whether or not the functional form of the posterior density is available. In the latter case, an approach based on Rao-Blackwellization is proposed. We highlight that this new estimate has an important invariance property. We illustrate the performance of the different methods in three applications.
Key Words: Contour probability; Highest posterior density region; Monte Carlo; Rao-Blackwell; Simultaneous credible bands; Simultaneous inference.
1. INTRODUCTION
Bayesian hierarchical models are nowadays routinely used to analyze complex problems. Monte Carlo and in particular Markov chain Monte Carlo (MCMC) methods have proven to be the main tool for statistical inference in this class. These methods generate samples from the posterior distribution, which can then be further exploited to calculate posterior summaries.
Typically, results are reported using summary statistics of univariate marginal distributions, such as the posterior mean, median, or posterior quantiles. The latter can of course be used to calculate (pointwise) credible intervals. Posterior probabilities may also be reported; for example, one might be interested in the posterior probability that a certain regression parameter is positive.
Summaries of posterior quantities that involve more than one parameter are not as common, but examples exist. One might, for instance, be interested in the posterior probability that one parameter is larger than a second one. More generally, some functions of the parameters, such as posterior ranks, may be of interest (Goldstein and Spiegelhalter 1996).
However, there seem to be few methods for calculating simultaneous probability statements relating to a potentially large number of parameters. A notable exception is the method of Besag, Green, Higdon, and Mengersen (1995, p. 30) for calculating simultaneous credible bands based on order statistics. Their approach defines such a band as the product of symmetric univariate posterior credible intervals (of the same univariate level) for each parameter; the simultaneous credible level is then essentially defined as the proportion of samples which fall simultaneously in all intervals. Being based only on ranks, the method is invariant to monotonic transformations of the variables. However, a difficulty is that the simultaneous credible band is defined as a product of univariate intervals, and hence is by construction restricted to be hyperrectangular. We will review the method by Besag et al. (1995) in Section 2.1.
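The band construction described above can be sketched in a few lines of Python. This is only a minimal illustration of the verbal description (equal trimming of order statistics in every margin, with the simultaneous level read off as the proportion of draws inside all intervals); it is not taken from Besag et al. (1995). The function name `simultaneous_band`, the toy draws, and the assumption of no ties among the draws are ours.

```python
import math
import random

def simultaneous_band(samples, alpha):
    """Rank-based simultaneous credible band (illustrative sketch).

    samples: list of n posterior draws, each a list of p floats (no ties assumed).
    Returns (band, coverage): band is a list of p (lower, upper) pairs and
    coverage is the proportion of draws lying inside all p intervals.
    """
    n, p = len(samples), len(samples[0])
    # depth[i]: number of order statistics that can be trimmed from each tail
    # of every margin before draw i drops out of the band
    depth = [n] * n
    sorted_cols = []
    for j in range(p):
        order = sorted(range(n), key=lambda i: samples[i][j])
        sorted_cols.append([samples[i][j] for i in order])
        for rank, i in enumerate(order):  # rank is 0-based
            depth[i] = min(depth[i], rank, n - 1 - rank)
    # largest trim k such that at least ceil((1 - alpha) * n) draws stay inside
    m = math.ceil((1 - alpha) * n)
    k = sorted(depth, reverse=True)[m - 1]
    band = [(col[k], col[n - 1 - k]) for col in sorted_cols]
    coverage = sum(d >= k for d in depth) / n
    return band, coverage

# Toy draws with dependent components (hypothetical data, not from the article)
random.seed(0)
draws = []
for _ in range(1000):
    z = random.gauss(0, 1)
    draws.append([z + random.gauss(0, 0.5) for _ in range(3)])

band, coverage = simultaneous_band(draws, alpha=0.05)
```

Because only ranks enter the choice of the trim `k`, applying a monotonic transformation to any margin transforms the interval endpoints accordingly and leaves the coverage unchanged.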
Often there is interest in determining if a certain point in the parameter space is supported by the posterior distribution. For example, in hierarchical models with latent Gaussian random effects having prior mode zero, one may ask if the posterior distribution of all random effects is "significantly" different from the zero vector. An initial approach would be to investigate the corresponding univariate credible intervals of some predefined level separately, but it is unclear how this translates to a simultaneous probability statement, if the random effects are dependent. Alternatively, one could calculate simultaneous credible bands on various levels and determine the smallest level, such that the point of interest is contained in the simultaneous credible band. We will test this approach in several applications.
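The search for the smallest level whose simultaneous band still contains the point of interest can be sketched as follows. This is a hedged illustration that assumes a rank-based band built by trimming the same number of order statistics from each tail of every margin; the function name `smallest_band_level`, the toy draws, and the convention of returning 1.0 when the point lies outside even the widest band are our assumptions.

```python
import bisect
import random

def smallest_band_level(samples, theta0):
    """Smallest simultaneous level whose rank-based band contains theta0.

    samples: list of n posterior draws, each a list of p floats.
    Returns 1.0 if theta0 lies outside even the widest (min-max) band.
    """
    n, p = len(samples), len(samples[0])
    depth = [n] * n  # per-draw trim depth, as used for the band itself
    k0 = n           # largest trim for which theta0 is inside every interval
    for j in range(p):
        order = sorted(range(n), key=lambda i: samples[i][j])
        col = [samples[i][j] for i in order]
        for rank, i in enumerate(order):
            depth[i] = min(depth[i], rank, n - 1 - rank)
        at_or_below = bisect.bisect_right(col, theta0[j])     # draws <= theta0[j]
        at_or_above = n - bisect.bisect_left(col, theta0[j])  # draws >= theta0[j]
        k0 = min(k0, at_or_below - 1, at_or_above - 1)
    if k0 < 0:
        return 1.0
    # simultaneous level of the narrowest band that still contains theta0
    return sum(d >= k0 for d in depth) / n

# Toy example: independent standard normal draws (hypothetical data)
random.seed(2)
draws = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(2000)]
level_center = smallest_band_level(draws, [0.0, 0.0])
level_tail = smallest_band_level(draws, [2.0, 2.0])
level_outside = smallest_band_level(draws, [100.0, 100.0])
```

A point near the center of the draws is already covered by a band of very low level, whereas a point in the tails is covered only by bands whose level is close to 1.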
As a promising alternative, we propose to use (multivariate) highest posterior density (HPD) regions to check the support of the posterior distribution for a certain parameter vector of interest. We need some notation to establish our ideas further. Let $y$ be the observed data and $\theta \in \mathbb{R}^p$ be the unknown parameter vector of interest. The total number of parameters in the model may often exceed $p$; the posterior density $p(\theta \mid y)$ is then of course obtained by suitable marginalization. Let $\eta$ denote these "nuisance" parameters, with $\eta \in \mathbb{R}^q$, say.
We may now follow Box and Tiao (1973, p. 125) and ask whether a specific parameter point of interest $\theta_0$ lies inside or outside a HPD region with some predefined (simultaneous) credible level $1 - \alpha$. This is the case, if and only if
$$P\{p(\theta \mid y) > p(\theta_0 \mid y) \mid y\} \ge 1 - \alpha.$$
Notice that the posterior density $p(\theta \mid y)$ is treated here as a random variable, a function of the random vector $\theta \mid y$.
This question can now easily be reversed to define the (posterior) contour probability $P(\theta_0 \mid y)$ of $\theta_0$ as 1 minus the content of the HPD region of $p(\theta \mid y)$ which just covers $\theta_0$:
$$P(\theta_0 \mid y) = P\{p(\theta \mid y) \le p(\theta_0 \mid y) \mid y\}. \tag{1.1}$$
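When the posterior density can be evaluated, at least up to a normalizing constant, definition (1.1) suggests a direct Monte Carlo estimate: the proportion of posterior draws whose density value does not exceed that of the point of interest. The following Python sketch illustrates this reading of (1.1); it is not necessarily the estimator developed in the article. The standard bivariate normal "posterior", the function names, and the sample size are illustrative assumptions. For this toy density the contour probability of a point $\theta_0$ is known to be $\exp(-\lVert\theta_0\rVert^2/2)$, which provides a check.

```python
import math
import random

def log_density(theta):
    # Unnormalized log density of a standard bivariate normal,
    # used here as a stand-in posterior (illustrative assumption)
    return -0.5 * sum(t * t for t in theta)

def contour_probability(samples, theta0, log_dens):
    """Proportion of draws with density no larger than the density at theta0."""
    ref = log_dens(theta0)
    hits = sum(1 for theta in samples if log_dens(theta) <= ref)
    return hits / len(samples)

random.seed(1)
draws = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(20000)]

estimate = contour_probability(draws, [1.0, 1.0], log_density)
exact = math.exp(-1.0)  # exp(-||theta0||^2 / 2) with ||theta0||^2 = 2
at_mode = contour_probability(draws, [0.0, 0.0], log_density)
```

Only comparisons of density values enter the estimate, so the normalizing constant is not needed, and working on the log scale avoids numerical underflow.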
Box and Tiao (1973) did not explicitly define contour probabilities, but made an identical reverse assessment ("what is the posterior evidence against a given point based on HPD regions," Example 2.12.4, pp. 136-138). An alternative name for a contour probability could be posterior or Bayesian p value, motivated by well-known analogies between Bayesian and classical inference concepts, as reviewed thoroughly by Box and Tiao (1973). However, Bayesian p values have been defined in many different ways (see, for example, Gelman, Carlin, Stern, and Rubin 1995 or Carlin and Louis 2000), so we stick to our less controversial terminology.
Section 2.2 describes methods for Monte Carlo estimation of contour probabilities based on samples from...