# What this blog is about

My professional page is at olivierbinette.ca.

# 3D Data Visualization with WebGL/three.js

I wanted to make a web tool for high-dimensional data exploration through spherical multidimensional scaling (S-MDS). The basic idea of S-MDS is to map a possibly high-dimensional dataset on the sphere while approximately preserving a matrix of pairwise distances (or divergences). An interactive visualization tool could help explore the mapped dataset and translate observations back to the original data domain. I’m not quite finished, but I made a frontend prototype. The next step would be to implement the multidimensional scaling algorithm in Javascript. I may get to this if I find the time.

In the current applet, you can visualize the positions and depths of earthquakes of magnitude greater than 6 from January 1st 2014 up to January 1st 2019. Data is from the US Geological Survey (usgs.gov). Code is on GitHub.

# Fractional Posteriors and Hausdorff alpha-entropy

Bhattacharya, Pati & Yan (2016) wrote an interesting paper on Bayesian fractional posteriors. These are based on fractional likelihoods – likelihoods raised to a fractional power – and provide robustness to misspecification. One of their results shows that fractional posterior contraction can be obtained as only a function of prior mass attributed to neighborhoods, in a sort of Kullback-Leibler sense, of the parameter corresponding to the true data generating distribution (or the one closest to it in the Kullback-Leibler sense). With regular posteriors, on the other hand, a complexity constraint on the prior distribution is usually also required in order to show posterior contraction.

Their result made me think of the approach of Xing & Ranneby (2008) to posterior consistency. Therein, a prior complexity constraint specified through the so-called Hausdorff $\alpha$-entropy is used to allow bounding the regular posterior distribution by something that is similar to a fractional posterior distribution. As it turns out, the proof of Theorem 3.2 of of Battacharya & al. (2016) can almost directly be adapted to regular posteriors in certain cases, using the Hausdorff $\alpha$-entropy to bridge the gap. Let me explain this in some more detail.

Le me consider well-specified discrete priors for simplicity. More generally, the discretization trick could possibly yield similar results for non-discrete priors.

I will follow as closely as possible the notations of Battacharya & al. (2016). Let $\{p_{\theta}^{(n)} \mid \theta \in \Theta\}$ be a dominated statistical model, where $\Theta = \{\theta_1, \theta_2, \theta_3, \dots\}$ is discrete. Assume $X^{(n)} \sim p_{\theta_0}^{(n)}$ for some $\theta_0 \in \Theta$, let

$B_n(\varepsilon, \theta_0) = \left\{ \int p_{\theta_0}^{(n)}\log\frac{p_{\theta_0}^{(n)}}{p_{\theta}^{(n)}} < n\varepsilon^2,\, \int p_{\theta_0}^{(n)}\log^2\frac{p_{\theta_0}^{(n)}}{p_{\theta}^{(n)}} < n\varepsilon^2 \right\}$

and define the Renyi divergence of order $\alpha$ as

$D^{(n)}_{\alpha}(\theta, \theta_0) = \frac{1}{\alpha-1}\log\int\{p_{\theta}^{(n)}\}^\alpha \{p_{\theta_0}^{(n)}\}^{1-\alpha}.$

We let $\Pi_n$ be a prior on $\Theta$ and its fractional posterior distribution of order $\alpha$ is defined as

$\Pi_{n, \alpha}(A \mid X^{(n)}) \propto \int_{A}p_{\theta}^{(n)}\left(X^{(n)}\right)^\alpha\Pi_n(d\theta)$

In this well-specified case, one of their result is the following:

Theorem 3.2 of Bhattacharya & al. (particular case)
Fix $\alpha \in (0,1)$ and assume that $\varepsilon_n$ satisfies $n\varepsilon_n^2 \geq 2$ and

$\Pi_n(B_n(\varepsilon_n, \theta_0)) \geq e^{-n\varepsilon_n^2}.$

Then, for any $D \geq 2$ and $t > 0$,

$\Pi_{n,\alpha}\left( \frac{1}{n}D_\alpha^{(n)}(\theta, \theta_0) \geq \frac{D+3t}{1-\alpha} \varepsilon_n^2 \mid X^{(n)} \right) \leq e^{-tn\varepsilon_n^2}.$

holds with probability at least $1-2/\{(D-1+t)^2n\varepsilon_n^2\}$.

Let us define the $\alpha$-entropy of the prior $\Pi_n$ as

$H_\alpha(\Pi_n) = \sum_{\theta \in \Theta} \Pi_n(\theta)^\alpha.$

An adaptation of the proof of the previous Theorem, in our case where $\Pi_n$ is discrete, yields the following.

Proposition (Regular posteriors)
Fix $\alpha \in (0,1)$ and assume that $\varepsilon_n$ satisfies $n\varepsilon_n^2 \geq 2$ and

$\Pi_n(B_n(\varepsilon_n, \theta_0)) \geq e^{-n\varepsilon_n^2}.$

Then, for any $D \geq 2$ and $t > 0$,

$\Pi_{n}\left( \frac{1}{n}D_\alpha^{(n)}(\theta, \theta_0) \geq \frac{D+3t}{1-\alpha} \varepsilon_n^2 \mid X^{(n)} \right)^\alpha \leq H_\alpha(\Pi_n) e^{-tn\varepsilon_n^2}.$

holds with probability at least $1-2/\{(D-1+t)^2n\varepsilon_n^2\}$.

Note that $H_\alpha(\Pi_n)$ may be infinite, in which case the upper bound on the tails of $\frac{1}{n}D_\alpha^{(n)}$ is trivial. When the prior is not discrete, my guess is that the complexity term $H_\alpha(\Pi_n)$ should be replaced by a discretization entropy ${}H_\alpha(\Pi_n; \varepsilon_n)$ which is the $\alpha$-entropy of a discretized version of $\Pi_n$ whose resolution (in the Hellinger sense) is some function of $\varepsilon_n$.

# ISM at the Eureka! science festival

We’ve been hard at work getting ready for the Eureka! science festival held this weekend at the Montreal Science Centre. Come check it out!

At the festival:

# Topology and Machine Learning

Won best poster award at the CSSC.

par(mfcol=c(1,2)) plot(cars); pretty_plot(cars)