# Bayesian numerical analysis

Distributing points $\{x_i\}_{i=1}^n$ on the sphere so as to minimize the mean square error

$\mathbb{E}\left[\left(q_n(f) - \int_{\mathbb{S}^2}f(s)\,ds\right)^2\right]$

of the quadrature formula $q_n(f) =\frac{1}{n}\sum_{i=1}^n f(x_i)$, where $f$ is a centered Gaussian process with covariance function $C(x,y) = \exp(\langle x, y \rangle)$. Shown are the cases $n=6, 12, 23$.
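For a centered Gaussian process, the mean square error above expands into sums and integrals of the covariance function. Assuming that $ds$ denotes the uniform probability measure on $\mathbb{S}^2$ (an assumption, since the normalization is not stated), we have $\int_{\mathbb{S}^2} \exp(\langle x, s\rangle)\,ds = \sinh(1)$ for every unit vector $x$, and the error reduces to $\frac{1}{n^2}\sum_{i,j}\exp(\langle x_i, x_j\rangle) - \sinh(1)$. The sketch below evaluates this expression; the function name and the two test configurations are illustrative choices, not the ones used to produce the figure.

```python
import math


def quadrature_mse(points):
    """Mean square error of the equal-weight rule for C(x, y) = exp(<x, y>).

    Assumptions: every point is a unit vector (x, y, z), and ds is the
    uniform probability measure on the sphere.
    """
    n = len(points)
    # Double sum of the covariance function over all pairs of nodes.
    kernel_sum = sum(
        math.exp(sum(a * b for a, b in zip(p, q)))
        for p in points
        for q in points
    )
    # The cross term and the double integral combine into -sinh(1).
    return kernel_sum / n**2 - math.sinh(1.0)


# Two hypothetical configurations with n = 6: the vertices of an
# octahedron, and six copies of the north pole.
octahedron = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]
clustered = [(0.0, 0.0, 1.0)] * 6

print(quadrature_mse(octahedron), quadrature_mse(clustered))
```

Under this assumption, minimizing the mean square error amounts to minimizing the pairwise energy $\sum_{i,j}\exp(\langle x_i, x_j\rangle)$, which is why well-spread points do better than clustered ones.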

# Introduction

The problem is to compute $I(f) = \int f \, d\lambda$, where $\lambda$ is a probability measure on a space $X$ and $f : X \rightarrow \mathbb{R}$ is integrable. If $\{X_n\}$ is a sequence of independent random variables distributed according to $\lambda$, then $I(f)$ can be approximated by

$I_n(f) = \frac{1}{n}\sum_{i=1}^n f(X_i),$

which is called a Monte Carlo estimator.
In practice, it can be difficult to generate $X_n \sim \lambda$. One then prefers to introduce a measure $\mu$, with $\lambda$ absolutely continuous with respect to $\mu$, so that

$I_n(f;\mu) = \frac{1}{n}\sum_{i=1}^n f(Y_i) \tfrac{d\lambda}{d\mu}(Y_i), \quad Y_i \sim^{ind.} \mu,$

is an estimate of $I(f)$ that is more convenient to compute. This technique, called importance sampling, can also be used to improve the quality of the estimator $I_n$, for example by reducing its variance.
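As a minimal sketch of the estimator $I_n(f;\mu)$, consider a toy problem that is not taken from the post: $\lambda = N(0,1)$, $f$ the indicator of $(3, \infty)$, and $\mu = N(3,1)$. The density ratio is then $\tfrac{d\lambda}{d\mu}(y) = \exp(-3y + 9/2)$. All function names, the shift value and the sample size below are assumptions made for illustration.

```python
import math
import random

SHIFT = 3.0  # hypothetical threshold, also used as the mean of mu


def importance_sampling_estimate(f, sample_mu, density_ratio, n, rng):
    """Average f(Y_i) * (d lambda / d mu)(Y_i) over n independent draws from mu."""
    total = 0.0
    for _ in range(n):
        y = sample_mu(rng)
        total += f(y) * density_ratio(y)
    return total / n


def tail_indicator(y):
    """f(y) = 1 if y > SHIFT, else 0, so that I(f) is a tail probability."""
    return 1.0 if y > SHIFT else 0.0


def sample_shifted_normal(rng):
    """Draw Y from mu = N(SHIFT, 1)."""
    return rng.gauss(SHIFT, 1.0)


def normal_density_ratio(y):
    """d lambda / d mu for lambda = N(0, 1) and mu = N(SHIFT, 1)."""
    return math.exp(-SHIFT * y + SHIFT**2 / 2.0)


rng = random.Random(0)
estimate = importance_sampling_estimate(
    tail_indicator, sample_shifted_normal, normal_density_ratio, 100_000, rng
)
# Closed form of the tail probability, for comparison.
exact = 0.5 * math.erfc(SHIFT / math.sqrt(2.0))
print(estimate, exact)
```

Sampling directly from $\lambda$ would hit the event $\{y > 3\}$ only about once in a thousand draws, whereas roughly half of the draws from $\mu$ land there. This is the variance reduction mentioned above.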

# The basics

Amino acids are small molecules of the form

where $R$ is a side chain called the $R$-group. There are 20 different amino acids found in proteins, each characterized by its $R$-group.

Peptides and proteins are chains of amino acids. Proteins are long such chains, whereas peptides and polypeptides are shorter ones. The amino acids are linked together by peptide bonds:

# Combinatorics of Phylogenetic Trees

The following is based on a weekend project that I also presented as a short talk in an undergraduate combinatorics seminar. The project is self-contained and mostly based on independent work. Ideas and inspiration came from discussions with my teacher and from the introduction of Diaconis and Holmes (1998). Theorem 2 is from Semple and Steel (2003). Tree pictures were produced with Sagemath and Latex.

French pdf.

# 1. Introduction

A phylogenetic tree is a rooted binary tree with labeled leaves.
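To make the definition concrete, here is a small sketch that encodes such a tree as nested pairs and enumerates all trees on leaves $1, \dots, n$ by attaching each new leaf on every edge of a smaller tree, or above its root. The encoding and the function names are my own illustrative choices, and the order of the two children in a pair carries no meaning.

```python
def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to `tree`.

    A tree is either a leaf label (an int) or a pair (left, right).
    The new leaf is attached above the root of `tree`, or recursively
    on an edge inside one of its two subtrees.
    """
    # New root whose children are the old tree and the new leaf.
    yield (tree, leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_leaf(left, leaf):
            yield (new_left, right)
        for new_right in insert_leaf(right, leaf):
            yield (left, new_right)


def all_trees(n):
    """List all rooted binary trees with leaves labeled 1, ..., n."""
    trees = [1]
    for leaf in range(2, n + 1):
        trees = [t for tree in trees for t in insert_leaf(tree, leaf)]
    return trees


for n in range(1, 6):
    print(n, len(all_trees(n)))
```

A tree with $k$ leaves has $2k - 1$ attachment positions, so the number of trees is multiplied by $2k - 1$ at each step. This recovers the classical count $(2n-3)!! = 1 \cdot 3 \cdot 5 \cdots (2n-3)$.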

These trees are used in biology to represent the evolutionary history of species. The leaves are the identified species, the root is a common ancestor, and branching represents speciation.

An interesting problem is that of reconstructing the phylogenetic tree that best explains the observed biological characteristics of a set of species. A naive mathematical formulation of this problem is proposed in section 4, and used to implement a tree reconstruction algorithm.

# Loomis-Whitney type inequality for quasi-balls?

Consider the problem of estimating the volume of a tumor, given X-ray scans along orthogonal axes. It may be known that the tumor has a somewhat spherical shape. To formalise this idea, let $T \subset \mathbb{R}^3$ be the tumor, $s$ the area of its surface, $m$ its volume and $C = s^3/m^2$. From the isoperimetric inequality, we have $C \geq 6^2 \pi$, with equality iff $T$ is a ball. Correspondingly, we say that $T$ is a quasi-ball if $C \approx 6^2 \pi$. In reality, $C$ is unknown but its distribution may be determined.

We are now given the areas $m_1$, $m_2$ and $m_3$ of the projections of $T$ along orthogonal axes. From the Loomis-Whitney inequality (or Cauchy-Schwarz in this case), we have the following estimate of the volume $m$ of $T$.

Theorem.
We have

$\max_i \sqrt{\frac{2^3 m_i^3}{C}} \le m \le \sqrt{m_1 m_2 m_3}$.
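As a quick numerical check of these bounds, the sketch below evaluates both sides for the unit ball, where $m_i = \pi$, $C = 6^2\pi$ and $m = 4\pi/3$. The function name is a hypothetical choice, and the value of $C$ is taken as known here even though in practice it is not.

```python
import math


def volume_bounds(m1, m2, m3, c):
    """Return (lower, upper) bounds on the volume m.

    m1, m2, m3 are the areas of the projections along orthogonal axes,
    and c is the ratio s^3 / m^2 (assumed known for this illustration).
    """
    lower = max(math.sqrt(8.0 * mi**3 / c) for mi in (m1, m2, m3))
    upper = math.sqrt(m1 * m2 * m3)
    return lower, upper


# Unit ball: every projection is a unit disc of area pi.
lower, upper = volume_bounds(math.pi, math.pi, math.pi, 36.0 * math.pi)
ball_volume = 4.0 * math.pi / 3.0
print(lower, ball_volume, upper)
```

For the ball, neither bound is attained: the lower bound equals $\pi\sqrt{2}/3$ and the upper bound equals $\pi^{3/2}$, while $m = 4\pi/3$ lies strictly between them. The upper bound is attained by an axis-aligned cube.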

Problem.
Can we find such an estimate of $m$ that is close to sharp when $T$ is close to a ball?

References.
Loomis, L. H.; Whitney, H. An inequality related to the isoperimetric inequality. Bull. Amer. Math. Soc. 55 (1949), no. 10, 961–962. http://projecteuclid.org/euclid.bams/1183514163.