
Probability density estimation as classification

August 1, 2009

Perhaps it has always been obvious to exalted statistical minds that density estimation can be viewed as classification and (perhaps) done using a classifier.

Assume that we have samples \{x_i\}_{i=1,\ldots,N} of a random vector X whose distribution has bounded support. Without loss of generality, let the support be the unit hypercube [0,1]^d. We are required to estimate P_X(x), the density of X.

Now assume that we generate a bunch of samples \{z_i\}_{i=1,\ldots,M} uniformly distributed in [0,1]^d. We assign the label y = 1 to all the samples x_i and the label y = 0 to all the z_i, and build a classifier \psi between the two sample sets. In other words, we construct an estimate P_\psi(y=1|x) of the posterior class probability P(y=1|x) \forall x \in [0,1]^d.
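Concretely, the two-sample construction might look like the following minimal sketch (the 1-D Beta(2,5) target is my stand-in for the unknown distribution; it is not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# N draws from the unknown target density on [0, 1]; Beta(2, 5) stands in for P_X here.
N, M = 2000, 2000
x = rng.beta(2.0, 5.0, size=N)

# M draws from the uniform distribution on the same support.
z = rng.uniform(0.0, 1.0, size=M)

# Label the real samples y = 1 and the uniform samples y = 0;
# any probabilistic classifier can then be trained on (features, labels).
features = np.concatenate([x, z]).reshape(-1, 1)
labels = np.concatenate([np.ones(N), np.zeros(M)])
```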

Now, from Bayes' rule (with class priors proportional to the sample counts N and M), we know that

P(y=1|x) = \frac{N P_X(x)}{N P_X(x) + M U(x)}

where U(x) = 1, the uniform density over the unit hypercube. The above equation can be solved for P_X(x) to obtain the estimate

\hat{P}_X(x) = \frac{M}{N} \cdot \frac{P_\psi(y=1|x)}{P_\psi(y=0|x)}
Because M is under our control, ideally we would like to obtain

\hat{P}_X(x) = \frac{1}{N} \lim_{M \rightarrow \infty} \frac{M \, P_\psi(y=1|x)}{P_\psi(y=0|x)}
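An end-to-end sketch of the scheme with finite M (my toy choices, not from the post: a Beta(2,5) target on [0,1], log-features under which logistic regression is well-specified for this contrast, and Newton's method for the fit):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the unknown density: Beta(2, 5) on [0, 1].
N = M = 4000
x = rng.beta(2.0, 5.0, size=N)   # class y = 1
z = rng.uniform(size=M)          # class y = 0 (known uniform background)

def feats(u):
    # With these features the log-odds of a Beta(a, b) vs. the uniform is exactly
    # linear, so logistic regression is a well-specified classifier for this toy.
    return np.column_stack([np.ones_like(u), np.log(u), np.log1p(-u)])

A = np.vstack([feats(x), feats(z)])
y = np.concatenate([np.ones(N), np.zeros(M)])

# Fit logistic regression by Newton's method (IRLS).
w = np.zeros(A.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-A @ w))
    W = p * (1.0 - p)
    H = A.T @ (A * W[:, None]) + 1e-8 * np.eye(A.shape[1])
    w -= np.linalg.solve(H, A.T @ (p - y))

# Density estimate: P_hat(x) = (M / N) * P_psi(y=1|x) / P_psi(y=0|x), with U(x) = 1.
grid = np.linspace(1e-3, 1.0 - 1e-3, 1000)
pg = 1.0 / (1.0 + np.exp(-feats(grid) @ w))
p_hat = (M / N) * pg / (1.0 - pg)

# Sanity check: the estimate should integrate to roughly 1 over [0, 1].
integral = p_hat.mean() * (grid[-1] - grid[0])
print(integral)
```

Note that nothing in the recipe enforces normalization; the estimate integrates to approximately 1 only because the fitted posterior is close to the true one.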

The question is: since we know the distribution of the samples for class 0 (uniform!), for any particular classifier (say, a Gaussian process classifier or logistic regression), can the limit be computed or approximated without actually sampling and then learning?

This paper (which I haven’t yet read) may be related.

Update Aug 24, 2009. The uniform distribution can be substituted by any other proposal distribution from which we can draw samples and which has a support that includes the support of the density we wish to estimate. George, thanks for pointing this out.
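In that case (writing q for the proposal density; the notation is mine), the same algebra gives

P(y=1|x) = \frac{N P_X(x)}{N P_X(x) + M q(x)}

and hence the estimate

\hat{P}_X(x) = \frac{M}{N} \, q(x) \, \frac{P_\psi(y=1|x)}{P_\psi(y=0|x)}

which reduces to the earlier formula when q(x) = U(x) = 1.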

Categories: Estimation