Simultaneous posterior probability statements from Monte Carlo output.

Publication: Journal of Computational & Graphical Statistics

Publication Date: 01-MAR-04

Author: Held, Leonhard

COPYRIGHT 2004 American Statistical Association

This article considers the problem of making simultaneous probability statements in multivariate inferential problems based on samples from a posterior distribution. The calculation of simultaneous credible bands is reviewed and, as an alternative, contour probabilities are proposed. These are defined as 1 minus the content of the highest posterior density region which just covers a certain point of interest. We discuss a Monte Carlo method to estimate contour probabilities and distinguish whether or not the functional form of the posterior density is available. In the latter case, an approach based on Rao-Blackwellization is proposed. We highlight that this new estimate has an important invariance property. We illustrate the performance of the different methods in three applications.

Key Words: Contour probability; Highest posterior density region; Monte Carlo; Rao-Blackwell; Simultaneous credible bands; Simultaneous inference.

1. INTRODUCTION

Bayesian hierarchical models are nowadays routinely used to analyze complex problems. Monte Carlo and in particular Markov chain Monte Carlo (MCMC) methods have proven to be the main tool for statistical inference in this class. These methods generate samples from the posterior distribution, which can then be further exploited to calculate posterior summaries.

Typically, results are reported using summary statistics of univariate marginal distributions, such as the posterior mean, median, or posterior quantiles. The latter can of course be used to calculate (pointwise) credible intervals. Posterior probabilities may also be reported; for example, one might be interested in the posterior probability that a certain regression parameter is positive.

Summaries of posterior quantities that involve more than one parameter are not as common, but examples exist. For instance, one might be interested in the posterior probability that one parameter is larger than a second one. More generally, some functions of the parameters may be of interest, such as posterior ranks (Goldstein and Spiegelhalter 1996).

However, there seem to be few methods for calculating simultaneous probability statements relating to a potentially large number of parameters. A notable exception is the method of Besag, Green, Higdon, and Mengersen (1995, p. 30) for calculating simultaneous credible bands based on order statistics. Their approach defines such a band as the product of symmetric univariate posterior credible intervals (of the same univariate level) for each parameter; the simultaneous credible level is then essentially defined as the proportion of samples which fall simultaneously in all intervals. Being based only on ranks, the method is invariant to monotonic transformations of the variables. However, a difficulty is that the simultaneous credible band is defined as a product of univariate intervals, and hence is by construction restricted to be hyperrectangular. We will review the method of Besag et al. (1995) in Section 2.1.
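As a rough illustration of the construction just described (not code from the article), the following Python sketch computes a rank-based simultaneous band from a matrix of posterior draws. The function name `simultaneous_band`, the use of NumPy, and the toy samples are assumptions made for this example; the sketch follows the verbal description above and may differ in detail from the algorithm of Besag et al. (1995).

```python
import numpy as np


def simultaneous_band(samples, level=0.95):
    """Rank-based simultaneous credible band (illustrative sketch).

    samples: array of shape (n, p), one posterior draw per row.
    Returns (lower, upper), each of shape (p,), such that at least
    a fraction `level` of the draws lies inside the band in every
    coordinate at once.
    """
    samples = np.asarray(samples, dtype=float)
    n, _ = samples.shape
    # Rank (1..n) of every draw within its own column.
    ranks = samples.argsort(axis=0).argsort(axis=0) + 1
    # Two-sided "most extreme" rank of each draw over all parameters.
    extreme = np.maximum(ranks, n + 1 - ranks).max(axis=1)
    # Smallest k such that at least level * n draws have extreme <= k.
    k = int(np.sort(extreme)[int(np.ceil(level * n)) - 1])
    ordered = np.sort(samples, axis=0)
    lower = ordered[n - k]   # order statistic n + 1 - k (1-based)
    upper = ordered[k - 1]   # order statistic k (1-based)
    return lower, upper


# Toy posterior draws (hypothetical): three correlated parameters.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8, 0.5],
                [0.8, 1.0, 0.8],
                [0.5, 0.8, 1.0]])
samples = rng.multivariate_normal(np.zeros(3), cov, size=2000)

lower, upper = simultaneous_band(samples, level=0.95)
inside = np.all((samples >= lower) & (samples <= upper), axis=1)
coverage = inside.mean()  # simultaneous level actually attained
```

Every interval uses the same pair of order statistics, so all univariate intervals share one level, and the band depends on the draws only through their ranks.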

Often there is interest in determining whether a certain point in the parameter space is supported by the posterior distribution. For example, in hierarchical models with latent Gaussian random effects having prior mode zero, one may ask whether the posterior distribution of all random effects is "significantly" different from the zero vector. An initial approach would be to investigate the corresponding univariate credible intervals of some predefined level separately, but it is unclear how this translates to a simultaneous probability statement if the random effects are dependent. Alternatively, one could calculate simultaneous credible bands at various levels and determine the smallest level such that the point of interest is contained in the simultaneous credible band. We will test this approach in several applications.
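The search for the smallest such level can be sketched in Python as follows. This is an illustration under stated assumptions rather than the article's procedure: bands are taken to be of the rank-based form, indexed by an order statistic k, and the helper name `smallest_band_level` and the toy draws are hypothetical.

```python
import numpy as np


def smallest_band_level(samples, point):
    """Smallest simultaneous level whose rank-based band covers `point`.

    Band k is the product over parameters j of the intervals
    [x_j(n + 1 - k), x_j(k)], where x_j(i) is the i-th order statistic
    of column j. Returns 1.0 if no such band contains the point.
    """
    samples = np.asarray(samples, dtype=float)
    point = np.asarray(point, dtype=float)
    n, _ = samples.shape
    ranks = samples.argsort(axis=0).argsort(axis=0) + 1
    extreme = np.maximum(ranks, n + 1 - ranks).max(axis=1)
    # point_j <= x_j(k)          requires  k >= #{draws <  point_j} + 1
    # x_j(n + 1 - k) <= point_j  requires  k >= n + 1 - #{draws <= point_j}
    below = (samples < point).sum(axis=0)
    at_or_below = (samples <= point).sum(axis=0)
    k0 = int(np.maximum(below + 1, n + 1 - at_or_below).max())
    if k0 > n:
        return 1.0  # the point lies beyond the sampled range
    # Simultaneous level of band k0: share of draws lying inside it.
    return float(np.mean(extreme <= k0))


# Toy posterior draws (hypothetical): three independent N(0, 1) parameters.
rng = np.random.default_rng(0)
samples = rng.standard_normal((2000, 3))

level_centre = smallest_band_level(samples, np.zeros(3))   # well supported
level_far = smallest_band_level(samples, np.full(3, 10.0))  # unsupported
```

One minus the returned level plays the role of a p value for the point, but it inherits the hyperrectangular shape of the band.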

As a promising alternative, we propose to use (multivariate) highest posterior density (HPD) regions to check the support of the posterior distribution for a certain parameter vector of interest. We need some notation to establish our ideas further. Let $y$ be the observed data and $\theta \in \mathbb{R}^p$ be the unknown parameter vector of interest. The total number of parameters in the model may often exceed $p$; the posterior density $p(\theta \mid y)$ is then of course obtained by suitable marginalization. Let $\eta$ denote these "nuisance" parameters, with $\eta \in \mathbb{R}^q$, say.

We may now follow Box and Tiao (1973, p. 125) and ask whether a specific parameter point of interest $\theta_0$ lies inside or outside an HPD region with some predefined (simultaneous) credible level $1 - \alpha$. The point lies inside the region if and only if

$$P\{p(\theta \mid y) > p(\theta_0 \mid y) \mid y\} \le 1 - \alpha.$$

Notice that the posterior density $p(\theta \mid y)$ is treated here as a random variable, a function of the random vector $\theta \mid y$.

This question can now easily be reversed to define the (posterior) contour probability $P(\theta_0 \mid y)$ of $\theta_0$ as 1 minus the content of the HPD region of $p(\theta \mid y)$ which just covers $\theta_0$:

$$P(\theta_0 \mid y) = P\{p(\theta \mid y) \le p(\theta_0 \mid y) \mid y\}. \tag{1.1}$$
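When the posterior density can be evaluated, definition (1.1) suggests a simple Monte Carlo estimate: the share of posterior draws whose density value does not exceed that of the point of interest. The Python sketch below illustrates this idea and is not taken from the article. It assumes the log density is available up to an additive constant, and the names `contour_probability` and `log_density` as well as the toy bivariate Gaussian posterior are hypothetical. For a bivariate Gaussian the exact contour probability is $\exp(-d^2/2)$, with $d^2$ the squared Mahalanobis distance of the point from the mean, which gives a check on the estimate.

```python
import numpy as np


def contour_probability(log_density, samples, theta0):
    """Monte Carlo estimate of (1.1).

    log_density: maps an (n, p) array of parameter vectors to n log
    posterior density values (an additive constant does not matter).
    """
    log_p_samples = np.ravel(log_density(samples))
    log_p_point = float(np.ravel(log_density(theta0))[0])
    return float(np.mean(log_p_samples <= log_p_point))


# Toy posterior (hypothetical): bivariate Gaussian with known moments.
mean = np.array([1.0, -0.5])
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])
prec = np.linalg.inv(cov)


def log_density(theta):
    # Unnormalized Gaussian log density: minus half the squared
    # Mahalanobis distance from the mean, one value per row.
    d = np.atleast_2d(theta) - mean
    return -0.5 * np.einsum("ij,jk,ik->i", d, prec, d)


rng = np.random.default_rng(1)
samples = rng.multivariate_normal(mean, cov, size=20000)
theta0 = np.zeros(2)

estimate = contour_probability(log_density, samples, theta0)
# Exact value for two dimensions: exp(-d^2 / 2).
exact = float(np.exp(log_density(theta0)[0]))
```

Only comparisons of density values enter the estimate, so the normalizing constant of the posterior is not needed.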

Box and Tiao (1973) did not explicitly define contour probabilities, but made an identical reverse assessment ("what is the posterior evidence against a given point based on HPD regions," Example 2.12.4, pp. 136-138). An alternative name for a contour probability could be posterior or Bayesian p value, motivated by well-known analogies between Bayesian and classical inference concepts, as reviewed thoroughly by Box and Tiao (1973). However, Bayesian p values have been defined in many different ways (see, for example, Gelman, Carlin, Stern, and Rubin 1995 or Carlin and Louis 2000), so we stick to our less controversial terminology.

Section 2.2 describes methods for Monte Carlo estimation of contour probabilities based on samples from...
