## Probability density estimation as classification

August 1, 2009

Perhaps it has always been obvious to exalted statistical minds that density estimation can be viewed as classification and (perhaps) done using a classifier.

Assume that we have samples $\{x_i\}_{i=1,\ldots,N}$ of a random vector $X$ whose distribution has bounded support. In fact, without loss of generality, let the support be the unit hypercube $[0,1]^d$. We are required to estimate $P_X(x)$, the density of $X$.

Now assume that we generate a bunch of samples $\{z_i\}_{i=1,\ldots,M}$ uniformly distributed in $[0,1]^d$. We assign a label $y = 1$ to all the samples $x_i$ and a label $y = 0$ to all $z_i$, and build a classifier $\psi$ between the two sample sets. In other words, we construct an estimate $P_\psi(y=1|x)$ of the posterior class probability $P(y=1|x)$ $\forall x \in [0,1]^d$.

Now, by Bayes' rule, with class priors $P(y=1) = \frac{N}{N+M}$ and $P(y=0) = \frac{M}{N+M}$, we know that

$\frac{P(y=1|x)}{P(y=0|x)} = \frac{N \, P_X(x)}{M \, U(x)}$

where $U(x) = 1$ is the uniform density over the unit hypercube. The above equation can be solved for $P_X(x)$ to obtain an estimate

$\hat{P}_X(x)=\frac{M}{N}\frac{P_\psi(y=1|x)}{P_\psi(y=0|x)}$
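As a concrete illustration (my sketch, not part of the original argument): the scheme can be tried in one dimension with NumPy, using a hand-rolled logistic regression on polynomial features as the classifier $\psi$, and a Beta density standing in for the "unknown" $P_X$ — both arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# N samples from the "unknown" density P_X (Beta(2, 5) here, an arbitrary
# choice for the demo) and M uniform background samples on [0, 1].
N, M = 2000, 2000
x = rng.beta(2.0, 5.0, size=N)      # label y = 1
z = rng.uniform(0.0, 1.0, size=M)   # label y = 0

def features(t):
    """Cubic polynomial features, so the logistic posterior can curve."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.stack([np.ones_like(t), t, t ** 2, t ** 3], axis=1)

X = np.vstack([features(x), features(z)])
y = np.concatenate([np.ones(N), np.zeros(M)])

# Plain batch gradient descent on the mean logistic loss.
w = np.zeros(X.shape[1])
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - y) / len(y)

def density_estimate(t):
    """hat{P}_X(t) = (M/N) * P_psi(y=1|t) / P_psi(y=0|t), with U(t) = 1."""
    p = 1.0 / (1.0 + np.exp(-features(t) @ w))
    return (M / N) * p / (1.0 - p)

grid = np.linspace(0.01, 0.99, 99)
est = density_estimate(grid)
```

In practice the quality of the estimate hinges on how well the classifier calibrates $P_\psi(y=1|x)$, especially in low-density regions where $P_\psi(y=0|x)$ approaches one.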

Because $M$ is in our control, ideally we would like to obtain

$\hat{P}_X(x)=\frac{1}{N} \lim_{M \rightarrow \infty} \frac{M \, P_\psi(y=1|x)}{P_\psi(y=0|x)}$
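A sanity check worth recording here (my addition): for the Bayes-optimal classifier, the quantity inside the limit does not depend on $M$ at all. With class priors $P(y=1)=\frac{N}{N+M}$ and $P(y=0)=\frac{M}{N+M}$, Bayes' rule gives

$P(y=1|x)=\frac{N \, P_X(x)}{N \, P_X(x) + M \, U(x)}$

so that, with $U(x)=1$,

$\frac{M \, P(y=1|x)}{P(y=0|x)} = N \, P_X(x)$

for every finite $M$. The limit matters only because the learned posterior $P_\psi$, not the true posterior, appears in the estimate, and more background samples pin down $P_\psi(y=0|x)$ more accurately.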

The question is: because we know the distribution of the class-0 samples exactly (uniform!), for a particular classifier (say a Gaussian process classifier or logistic regression), can the limit be computed or approximated analytically, without actually sampling and then learning?

This paper (which I haven’t yet read) may be related.

Update Aug 24, 2009. The uniform distribution can be substituted by any other proposal distribution from which we can draw samples and whose support includes the support of the density we wish to estimate. George, thanks for pointing this out.
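To spell out the generalization (in my notation): if the class-0 samples are drawn from a proposal density $q(x)$ instead of the uniform, the same Bayes-rule argument yields

$\hat{P}_X(x)=\frac{M}{N}\frac{P_\psi(y=1|x)}{P_\psi(y=0|x)} \, q(x)$

which reduces to the earlier formula when $q(x)=U(x)=1$. A proposal that roughly matches the shape of $P_X$ keeps the posterior ratio well-conditioned and should make the classification problem easier.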