Bhattacharya, Pati & Yang (2016) wrote an interesting paper on Bayesian fractional posteriors. These are based on fractional likelihoods – likelihoods raised to a fractional power – and provide robustness to misspecification. One of their results shows that contraction of the fractional posterior can be obtained from a condition on prior mass alone: it suffices to control the mass that the prior attributes to neighborhoods, in a Kullback-Leibler sense, of the parameter corresponding to the true data-generating distribution (or the one closest to it in the Kullback-Leibler sense). With regular posteriors, on the other hand, a complexity constraint on the prior distribution is usually also required in order to show posterior contraction.

Their result made me think of the approach of Xing & Ranneby (2008) to posterior consistency. There, a prior complexity constraint specified through the so-called Hausdorff $\alpha$-entropy makes it possible to bound the regular posterior distribution by something similar to a fractional posterior distribution. As it turns out, the proof of Theorem 3.2 of Bhattacharya & al. (2016) can be adapted almost directly to regular posteriors in certain cases, using the Hausdorff $\alpha$-entropy to bridge the gap. Let me explain this in some more detail.

Let me consider well-specified discrete priors for simplicity. More generally, the discretization trick could possibly yield similar results for non-discrete priors.

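I will follow the notation of Bhattacharya & al. (2016) as closely as possible. Let $\{p_\theta^{(n)} : \theta \in \Theta\}$ be a statistical model dominated by a measure $\mu$, where $\Theta$ is discrete. Assume $X^{(n)} \sim p_{\theta_0}^{(n)}$ for some $\theta_0 \in \Theta$, let

$$B_n(\varepsilon, \theta_0) = \left\{\theta \in \Theta : \int p_{\theta_0}^{(n)} \log\frac{p_{\theta_0}^{(n)}}{p_\theta^{(n)}}\, d\mu \leq n\varepsilon^2,\ \int p_{\theta_0}^{(n)} \log^2\frac{p_{\theta_0}^{(n)}}{p_\theta^{(n)}}\, d\mu \leq n\varepsilon^2\right\}$$

and define the Rényi divergence of order $\alpha \in (0,1)$ as

$$D_\alpha^{(n)}(\theta, \theta_0) = \frac{1}{\alpha - 1}\log\int \big\{p_\theta^{(n)}\big\}^{\alpha}\big\{p_{\theta_0}^{(n)}\big\}^{1-\alpha}\, d\mu.$$

We let $\Pi_n$ be a prior on $\Theta$; its fractional posterior distribution of order $\alpha$ is defined as

$$\Pi_{n,\alpha}\big(A \mid X^{(n)}\big) = \frac{\sum_{\theta \in A} \big\{p_\theta^{(n)}(X^{(n)})\big\}^{\alpha}\, \Pi_n(\theta)}{\sum_{\theta \in \Theta} \big\{p_\theta^{(n)}(X^{(n)})\big\}^{\alpha}\, \Pi_n(\theta)}.$$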

In this well-specified case, one of their results is the following:

**Theorem 3.2 of Bhattacharya & al. (particular case)**

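*Fix $\alpha \in (0,1)$ and assume that $\varepsilon_n$ satisfies $n\varepsilon_n^2 \geq 2$ and*

$$\Pi_n\big(B_n(\varepsilon_n, \theta_0)\big) \geq e^{-n\varepsilon_n^2}.$$

*Then, for any $D \geq 2$ and $t > 0$,*

$$\Pi_{n,\alpha}\left(\frac{1}{n} D_\alpha^{(n)}(\theta, \theta_0) \geq \frac{D + 3t}{1-\alpha}\,\varepsilon_n^2 \;\Big|\; X^{(n)}\right) \leq e^{-t n \varepsilon_n^2}$$

*holds with probability at least $1 - 2/\{(D - 1 + t)^2\, n\varepsilon_n^2\}$.*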
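To make the two objects in this theorem concrete, here is a minimal Python sketch on a toy problem. Everything specific in it is my own assumption for illustration and is not taken from the paper: an i.i.d. Bernoulli model, a nine-point grid of parameters with a uniform prior, and made-up data (15 successes out of 50 trials). It computes the fractional posterior of order $\alpha$ (with $\alpha = 1$ giving the regular posterior) and the Rényi divergence between two Bernoulli product models, which for i.i.d. data is $n$ times the single-observation divergence.

```python
import math


def fractional_posterior(prior, log_lik, alpha):
    """Weights proportional to prior * likelihood**alpha on a finite parameter set."""
    log_w = [math.log(p) + alpha * ll for p, ll in zip(prior, log_lik)]
    shift = max(log_w)  # subtract the max before exponentiating, for stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def renyi_bernoulli(theta, theta0, alpha, n):
    """Renyi divergence of order alpha between n i.i.d. Bernoulli(theta)
    and n i.i.d. Bernoulli(theta0) observations."""
    affinity = (theta**alpha * theta0**(1 - alpha)
                + (1 - theta)**alpha * (1 - theta0)**(1 - alpha))
    return n * math.log(affinity) / (alpha - 1)


# Hypothetical toy setup: grid 0.1, ..., 0.9 with a uniform prior.
thetas = [k / 10 for k in range(1, 10)]
prior = [1 / len(thetas)] * len(thetas)

# Made-up data: 15 successes in 50 Bernoulli trials.
n, successes = 50, 15
log_lik = [successes * math.log(t) + (n - successes) * math.log(1 - t)
           for t in thetas]

regular = fractional_posterior(prior, log_lik, alpha=1.0)
fractional = fractional_posterior(prior, log_lik, alpha=0.5)

for t, r, f in zip(thetas, regular, fractional):
    d = renyi_bernoulli(t, 0.3, alpha=0.5, n=n) / n
    print(f"theta={t:.1f}  regular={r:.4f}  fractional={f:.4f}  D/n={d:.4f}")
```

With a uniform prior, raising the likelihood to a power below one flattens the posterior weights, so the fractional posterior is never more concentrated on its mode than the regular one.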

## What about regular posteriors?

Let us define the $\alpha$-entropy of the prior $\Pi_n$ as

An adaptation of the proof of the previous Theorem, in our case where $\Theta$ is discrete, yields the following.

**Proposition (Regular posteriors)**

*Fix and assume that satisfies and*

*Then, for any and ,*

*holds with probability at least .*

Note that may be infinite, in which case the upper bound on the tails of is trivial. When the prior is not discrete, my guess is that the complexity term should be replaced by a discretization entropy which is the -entropy of a discretized version of whose resolution (in the Hellinger sense) is some function of .
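The remark that the entropy term may be infinite can be checked numerically. The sketch below assumes that the entropy of order $\alpha$ of a discrete prior takes the form $\log \sum_\theta \Pi(\theta)^\alpha$, which is how I read the Hausdorff entropy of Xing & Ranneby when every atom is its own covering set; treat that form as an assumption. The two priors (geometric and power-law weights, truncated to $K$ atoms) are hypothetical examples chosen only to contrast a finite and a diverging entropy.

```python
import math


def alpha_entropy(weights, alpha):
    """log of sum_k weights[k]**alpha (assumed form of the alpha-entropy)."""
    return math.log(sum(w**alpha for w in weights))


def geometric_prior(K):
    """Prior on {1, ..., K} with mass proportional to 2**(-k)."""
    raw = [2.0**(-k) for k in range(1, K + 1)]
    z = sum(raw)
    return [r / z for r in raw]


def power_law_prior(K, s):
    """Prior on {1, ..., K} with mass proportional to k**(-s)."""
    raw = [float(k)**(-s) for k in range(1, K + 1)]
    z = sum(raw)
    return [r / z for r in raw]


alpha = 0.5
for K in (10, 100, 1000, 10000):
    h_geo = alpha_entropy(geometric_prior(K), alpha)
    h_pow = alpha_entropy(power_law_prior(K, s=2.0), alpha)
    print(f"K={K:6d}  geometric={h_geo:.6f}  power-law={h_pow:.6f}")
```

For the geometric weights the truncated entropy stabilises as $K$ grows. For weights proportional to $k^{-2}$ and $\alpha = 1/2$, the sum behaves like the harmonic series, so the entropy keeps growing with $K$ and is infinite for the untruncated prior.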
