# The discretization trick

This post explains the discretization trick mentioned in my previous post (Posterior consistency under possible misspecification). I believe it was introduced by Walker (New approaches to Bayesian consistency, 2004).

Let $\mathbb{F}$ be a set of densities and let $\Pi$ be a prior on $\mathbb{F}$. If the observations $x_1, x_2, x_3, \dots$ follow some distribution $P_0$ with density $f_0$, then the posterior distribution given $x_1, \dots, x_n$ can be written as

$\Pi(A | x_1, \dots, x_n) \propto \int_A \prod_{i=1}^n f(x_i) \Pi(df).$

The discretization trick is to find densities $f_{1}, f_2, f_3, \dots$ in the convex hull of $A$ (taken in the space of all densities) such that

$\int_A \prod_{i=1}^n f(x_i) \Pi(df) = \prod_{i=1}^n f_i(x_i) \Pi(A).$

For example, suppose $\varepsilon > 2\delta > 0$, let $A = A_\varepsilon = \{f \,|\, D_{\frac{1}{2}}(f_0, f) > \varepsilon\}$, and let $\{A_i\}$ be a partition of $A$ into sets of diameter at most $\delta$. If there exists $0 < \alpha < 1$ such that

$\sum_i \Pi(A_i)^\alpha < \infty,$

then for some $\beta > 0$ we have that

$e^{n \beta} \left(\int_{A_{\varepsilon}} \prod_{i=1}^n \frac{f(x_i)}{f_0(x_i)} \Pi (df) \right)^\alpha \le e^{n \beta} \sum_i \prod_{j=1}^n \left(\frac{f_{i,j}(x_j)}{f_0(x_j)}\right)^\alpha \Pi(A_i)^\alpha \rightarrow 0$

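almost surely. This is because, writing $A_\alpha(f_0, f) = \int f^\alpha f_0^{1-\alpha}$ for the $\alpha$-affinity and conditioning successively on $x_1, \dots, x_{j-1}$ (with the $x_j$ independent draws from $f_0$), the expectation

$e^{n\beta}\, \mathbb{E}\left[\sum_i \prod_{j=1}^n \left(\frac{f_{i,j}(x_j)}{f_0(x_j)}\right)^\alpha \Pi(A_i)^\alpha\right] \le e^{n\beta} \sum_i \Pi(A_i)^\alpha \left(\sup_{f \in \mathrm{co}(A_i)} A_\alpha(f_0, f)\right)^n$

goes exponentially fast towards 0 when $\beta$ is sufficiently small. Hence the Borel-Cantelli lemma applies, yielding the claim.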
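
The first inequality in the display above combines the discretization identity on each $A_i$ with the elementary bound $(\sum_i a_i)^\alpha \le \sum_i a_i^\alpha$, valid for $a_i \ge 0$ and $0 < \alpha < 1$. Here is a minimal numerical sanity check of that bound. The values `a` are arbitrary made-up numbers standing in for the terms of the sum, and the helper names are purely illustrative.

```python
import numpy as np


def power_of_sum(a, alpha):
    """Left-hand side: (sum_i a_i) ** alpha."""
    return np.sum(a) ** alpha


def sum_of_powers(a, alpha):
    """Right-hand side: sum_i a_i ** alpha."""
    return np.sum(a ** alpha)


rng = np.random.default_rng(2)
a = rng.exponential(size=50)  # arbitrary nonnegative terms (illustrative only)

# Subadditivity of t -> t ** alpha for 0 < alpha < 1.
for alpha in (0.1, 0.5, 0.9):
    assert power_of_sum(a, alpha) <= sum_of_powers(a, alpha)
```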

## Construction

Each $f_{i,j}$ is defined as the posterior mean predictive density given $x_1, \dots, x_{j-1}$, when the posterior is conditioned on $A_i$. That is,

$f_{i,j} : x \mapsto \frac{\int_{A_i} f(x) \prod_{k=1}^{j-1}f(x_k) \Pi(df)}{\int_{A_i} \prod_{k=1}^{j-1}f(x_k) \Pi(df)}$

and

$f_{i, 1} : x \mapsto \frac{\int_{A_i} f(x) \Pi(df)}{\Pi(A_i)}.$

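The product of these predictive densities telescopes, since the denominator of $f_{i,j}(x_j)$ is the numerator of $f_{i,j-1}(x_{j-1})$. Hence

$\int_{A_i} \prod_{j=1}^n f(x_j) \Pi(df) = \prod_{j=1}^n f_{i,j}(x_j) \Pi(A_i).$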
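
The identity can be checked numerically in a toy setting. The sketch below assumes, purely for illustration, that $A_i$ consists of three unit-variance Gaussian densities carrying prior masses `weights`, so that the integrals over $A_i$ become finite sums. The means, the weights, and the function names are made up and are not part of the original argument.

```python
import numpy as np


def normal_pdf(x, mu, sigma=1.0):
    """Gaussian density, vectorized over x and mu."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def predictive_factors(x, mus, weights):
    """Return [f_{i,1}(x_1), ..., f_{i,n}(x_n)] for a finite prior on A_i.

    `weights` are the prior masses of the densities in A_i; they sum to Pi(A_i).
    """
    w = np.asarray(weights, dtype=float).copy()
    factors = []
    for xj in x:
        lik = normal_pdf(xj, mus)                 # f(x_j) for each f in A_i
        factors.append(np.dot(w, lik) / w.sum())  # predictive density at x_j
        w = w * lik                               # unnormalized posterior weights
    return np.array(factors)


rng = np.random.default_rng(0)
x = rng.normal(size=8)                   # toy observations
mus = np.array([1.0, 1.2, 1.5])          # hypothetical members of A_i
weights = np.array([0.05, 0.10, 0.15])   # hypothetical prior masses, Pi(A_i) = 0.30

# Left-hand side: integral over A_i of the likelihood against the prior.
lhs = np.sum(weights * np.prod(normal_pdf(x[:, None], mus[None, :]), axis=0))
# Right-hand side: product of predictive densities times Pi(A_i).
rhs = np.prod(predictive_factors(x, mus, weights)) * weights.sum()

assert np.isclose(lhs, rhs)
```

The loop updates the unnormalized posterior weights one observation at a time, which is exactly the cancellation between consecutive numerators and denominators.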

Furthermore, if $A_i$ is contained in a Hellinger ball centered at $g_i$ with radius $\delta$, then also

$H(f_{i,j}, g_i) < \delta.$

This follows from the convexity of Hellinger balls (an important remark for the generalization of this trick).
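
The convexity claim can also be illustrated numerically. The sketch below uses probability mass functions on a finite set as stand-ins for densities and assumes the normalization $H(p, q)^2 = 1 - \sum \sqrt{p q}$ (conventions differ by a constant factor, which does not affect convexity). All inputs are randomly generated for illustration: a mixture of points lying in a ball around `g` stays in that ball.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two pmfs, with H^2 = 1 - sum(sqrt(p * q))."""
    return np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(p * q))))


rng = np.random.default_rng(1)
g = rng.dirichlet(np.ones(20))  # center of the ball (illustrative)

# Five pmfs near g, built by perturbing g towards random pmfs.
members = np.array([
    (1.0 - t) * g + t * rng.dirichlet(np.ones(20))
    for t in rng.uniform(0.0, 0.3, size=5)
])

w = rng.dirichlet(np.ones(5))   # arbitrary mixing weights
mixture = w @ members           # convex combination of the members

# Smallest ball around g containing all members.
radius = max(hellinger(p, g) for p in members)

# The mixture is no farther from g than the farthest member.
assert hellinger(mixture, g) <= radius + 1e-12
```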