
Nested sampling algorithm

The nested sampling algorithm is a computational approach to the problem of comparing models in Bayesian statistics, developed in 2004 by physicist John Skilling. Bayes' theorem can be applied to a pair of competing models $M_1$ and $M_2$ for data $D$, one of which may be true (though which one is unknown) but which cannot both be true simultaneously. The posterior probability for $M_1$ may be calculated as:

$$P(M_1 \mid D) = \frac{P(D \mid M_1)\,P(M_1)}{P(D \mid M_1)\,P(M_1) + P(D \mid M_2)\,P(M_2)} = \frac{1}{1 + \dfrac{P(D \mid M_2)}{P(D \mid M_1)}\dfrac{P(M_2)}{P(M_1)}}$$

Given no a priori information in favor of $M_1$ or $M_2$, it is reasonable to assign prior probabilities $P(M_1) = P(M_2) = 1/2$, so that $P(M_2)/P(M_1) = 1$. The remaining Bayes factor $P(D \mid M_2)/P(D \mid M_1)$ is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally, $M_1$ has a set of parameters that can be grouped together and called $\theta$, and $M_2$ has its own vector of parameters that may be of different dimensionality, but is still termed $\theta$. The marginalization for $M_1$ is

$$P(D \mid M_1) = \int P(D \mid \theta, M_1)\, P(\theta \mid M_1)\, d\theta$$

and likewise for $M_2$. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution $P(\theta \mid D, M_1)$.
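As a concrete illustration of such a marginalization, consider a hypothetical coin-flip comparison (the models, data, and grid size below are assumptions chosen for demonstration): $M_1$ is a fair coin with no free parameters, while $M_2$ has an unknown bias $\theta$ with a uniform prior. This toy integral also has a closed form, $P(D \mid M_2) = 1/(n+1)$, which the numerical marginalization can be checked against:

```python
from math import comb

def likelihood(theta, k, n):
    """Binomial likelihood P(D | theta): k heads in n flips."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def evidence_m2(k, n, grid=100_000):
    """Marginal likelihood P(D | M2) = integral of P(D|theta) P(theta) dtheta
    with uniform prior P(theta) = 1 on [0, 1], via the midpoint rule."""
    h = 1.0 / grid
    return sum(likelihood((i + 0.5) * h, k, n) for i in range(grid)) * h

k, n = 7, 10                    # assumed data: 7 heads in 10 flips
z1 = likelihood(0.5, k, n)      # M1: fair coin, nothing to marginalize
z2 = evidence_m2(k, n)          # M2: bias theta marginalized out numerically
bayes_factor = z2 / z1
posterior_m1 = 1 / (1 + bayes_factor)   # with P(M1) = P(M2) = 1/2
```

Here the one-dimensional integral is cheap to evaluate on a grid; nested sampling becomes relevant precisely when $\theta$ is high-dimensional and such quadrature is infeasible.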
It is an alternative to methods from the Bayesian literature such as bridge sampling and defensive importance sampling.

Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density $Z = P(D \mid M)$ where $M$ is $M_1$ or $M_2$. Start with $N$ points $\theta_1, \ldots, \theta_N$ sampled from the prior, and initialize $Z = 0$ and $X_0 = 1$. Then, for $i = 1, \ldots, j$:

* record the lowest of the current likelihood values as $L_i$;
* set $X_i := \exp(-i/N)$;
* set $w_i := X_{i-1} - X_i$;
* update $Z := Z + L_i \cdot w_i$;
* replace the point with the lowest likelihood by a new point drawn from the prior, subject to the constraint that its likelihood be greater than $L_i$.

Finally, add the contribution of the surviving points: $Z := Z + X_j \,(L(\theta_1) + \cdots + L(\theta_N))/N$.

At each iteration, $X_i$ is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than that of $\theta_i$. The weight factor $w_i$ is an estimate of the amount of prior mass that lies between the two nested hypersurfaces $\{\theta \mid P(D \mid \theta, M) = P(D \mid \theta_{i-1}, M)\}$ and $\{\theta \mid P(D \mid \theta, M) = P(D \mid \theta_i, M)\}$. The update step $Z := Z + L_i w_i$ computes the sum over $i$ of $L_i w_i$ to numerically approximate the integral

$$Z = \int_0^1 L(X)\, dX,$$

where $L(X)$ is the likelihood on the contour enclosing prior mass $X$. In the limit $j \to \infty$, this estimator has a positive bias of order $1/N$, which can be removed by using $(1 - 1/N)$ instead of the $\exp(-1/N)$ shrinkage factor in the above algorithm. The idea is to subdivide the range of $f(\theta) = P(D \mid \theta, M)$ and estimate, for each interval $[f(\theta_{i-1}), f(\theta_i)]$, how likely it is a priori that a randomly chosen $\theta$ would map to this interval. This can be thought of as a Bayesian's way to numerically implement Lebesgue integration. Example implementations demonstrating the nested sampling algorithm are publicly available for download, written in several programming languages.
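The algorithm described above can be sketched in Python. This is a toy illustration, not a reference implementation: the likelihood, prior, and the naive rejection step used to draw new points above the current likelihood floor are all assumptions for demonstration (practical codes replace the rejection step with constrained MCMC or slice sampling):

```python
import math
import random

def nested_sampling(loglike, prior_sample, n_live=100, n_iter=600, seed=0):
    """Toy nested-sampling estimate of Z = integral of L(theta) pi(theta) dtheta.

    loglike(theta) returns the log-likelihood; prior_sample(rng) returns one
    draw from the prior. New points above the current likelihood floor are
    found by naive rejection from the prior (feasible for toy problems only).
    """
    rng = random.Random(seed)
    live = [prior_sample(rng) for _ in range(n_live)]
    live_logl = [loglike(t) for t in live]
    z, x_prev = 0.0, 1.0  # evidence accumulator; enclosed prior mass X_0 = 1
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=lambda k: live_logl[k])
        l_i = math.exp(live_logl[worst])   # likelihood floor L_i
        x_i = math.exp(-i / n_live)        # estimated remaining prior mass X_i
        z += l_i * (x_prev - x_i)          # Z := Z + L_i * w_i
        x_prev = x_i
        # replace the worst point with a prior draw whose likelihood exceeds L_i
        while True:
            t = prior_sample(rng)
            if loglike(t) > live_logl[worst]:
                break
        live[worst], live_logl[worst] = t, loglike(t)
    # final correction: the surviving live points fill the last volume X_j
    z += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return z

# Assumed toy check: uniform prior on [0, 1], Gaussian likelihood around 0.5.
# Analytic evidence is approximately sigma * sqrt(2*pi) ~ 0.2507 for sigma = 0.1.
sigma = 0.1
z = nested_sampling(lambda t: -(t - 0.5) ** 2 / (2 * sigma ** 2),
                    lambda rng: rng.random())
```

The estimate carries a statistical error of roughly $Z\sqrt{H/N}$ (with $H$ the information gained from prior to posterior), so with $N = 100$ live points the result should land within a few tens of percent of the analytic value.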
