papaya
Class Probability

java.lang.Object
  extended by papaya.Probability
All Implemented Interfaces:
PapayaConstants

public class Probability
extends Object
implements PapayaConstants

Cumulative distribution functions and corresponding inverses of certain probability distributions.

Description:
...cdf signifies the cumulative distribution function and is similar to the q... functions in R and the ...cdf functions in MATLAB. The cdf represent the integral, from -infinity to 0 of the corresponding continuous probability density function (pdf), or, the sum from 0 to x of an associated finite pdf.

...inv signifies the inverse of the cumulative distribution function and is similar to the p... functions in R and the ...inv functions in MATLAB.

Implementation:

The code is partially adapted from the CERN Jet Java libraries, which in turn was adapted from the Cephes Mathematical Library. As far a as I can tell, Stephen L. Moshier wrote the original C++ code for CEPHES (1989, moshier@na-net.ornl.gov), Gedeck (at Novartis) and Wolfgang Hoschek (at CERN, hats off to you mate!) then adapted it for the Java platform, making some significant changes along the way.
2012 rolls along, and Adila (hello!) finds the library and makes even more changes, increasing the probability (ha.ha.) of things going massively wrong (pValue<.0001).


Field Summary
 
Fields inherited from interface papaya.PapayaConstants
BASELINE, big, biginv, BOTTOM, CENTER, CORNER, FONTNAME, GRAY, INDEX_NOT_FOUND, INDICES_NOT_FOUND, LEFT, LOGPI, MACHEP, MAXGAM, MAXLOG, MINLOG, RIGHT, SQRTH, SQTPI, STROKEWEIGHT, TEXTSIZE, TOP
 
Method Summary
static double betacdf(double x, double a, double b)
          Returns the area from zero to x under the beta density function.
static double betacdfComplemented(double x, double a, double b)
          Returns the area under the right hand tail (from x to infinity) of the beta density function.
static double betainv(double p, double a, double b)
          Returns the inverse of the beta cumulative distribution function.
static double binomcdf(double p, int k, int n)
          Returns the sum of the terms 0 through k of the Binomial probability density.
static double binomcdfComplemented(double p, int k, int n)
          Returns the sum of the terms k+1 through n of the Binomial probability density.
static double binominv(double y, int k, int n)
          Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y.
static double chi2cdf(double x, double v)
          Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom.
static double chi2cdfComplemented(double x, double v)
          Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom.
static double erf(double x)
          Returns the error function of the normal distribution.
static double erfComplemented(double a)
          Returns the complementary Error function of the normal distribution; formerly named erfc.
static double fcdf(double x, int df1, int df2)
          Returns the area from zero to x under the F density function (also known as Snedcor's density or the variance ratio density); formerly named fdtr.
static double fcdfComplemented(double x, int df1, int df2)
          Returns the area from x to infinity under the F density function (also known as Snedcor's density or the variance ratio density); formerly named fdtrc.
static double finv(double p, int df1, int df2)
          Finds the F density argument x such that the integral from 0 to x of the F density is equal to p; formerly named fdtri.
static double gammacdf(double x, double a, double b)
          Returns the integral from zero to x of the gamma probability density function.
static double gammacdfComplemented(double x, double a, double b)
          Returns the integral from x to infinity of the gamma probability density function:
static double nbinomcdf(double p, int k, int n)
          Returns the sum of the terms 0 through k of the Negative Binomial Distribution.
static double nbinomcdfComplemented(double p, int k, int n)
          Returns the sum of the terms k+1 to infinity of the Negative Binomial distribution.
static double normcdf(double a)
          Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to x (assumes mean is zero, variance is one).
static double normcdf(double x, double mean, double variance)
          Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to x when the data has not been standardized (i.e.
static double norminv(double y0)
          Returns the value, x, for which the area under the Normal (Gaussian) probability density function (integrated from minus infinity to x) is equal to the argument y (assumes mean is zero, variance is one); formerly named ndtri.
static double poissoncdf(double mean, int k)
          Returns the sum of the first k terms of the Poisson distribution.
static double poissoncdfComplemented(int k, double mean)
          Returns the sum of the terms k+1 to Infinity of the Poisson distribution.
static double tcdf(double t, double k)
          Returns the integral from minus infinity to t of the Student-t distribution with k > 0 degrees of freedom.
static double tinv(double p, double k)
          Returns the value, t, for which the area under the Student-t probability density function (integrated from minus infinity to t) is equal to p.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

betacdf

public static double betacdf(double x,
                             double a,
                             double b)
Returns the area from zero to x under the beta density function.
 
 P(x) = Γ(a+b) / [ Γ(a) * Γ(b)  ] * Integral_(0 to x) [  t^(a-1) * (1-t)^(b-1) ] dt

 
This function is identical to the incomplete beta integral function Gamma.incompleteBeta(a, b, x). The complemented function is betacdfComplemented, specified by 1 - P(1-x) = Gamma.incompleteBeta( b, a, x );


betacdfComplemented

public static double betacdfComplemented(double x,
                                         double a,
                                         double b)
Returns the area under the right hand tail (from x to infinity) of the beta density function. This function is identical to the incomplete beta integral function Gamma.incompleteBeta(b, a, x).


betainv

public static double betainv(double p,
                             double a,
                             double b)
Returns the inverse of the beta cumulative distribution function. That is, given p, the function finds x such that
  Gamma.incompleteBeta( a, b, x ) = beta(x, a, b) = p;
This function is identical to the incomplete beta inverse function Gamma.incompleteBetaInverse(aa, bb, yy0).

Parameters:
a - the alpha parameter of the beta distribution.
b - the beta parameter of the beta distribution.
p - the value for which to solve for the corresponding x in the incomplete Beta integral.
Returns:
the value x such that beta( a, b, x ) = p.

binomcdf

public static double binomcdf(double p,
                              int k,
                              int n)
Returns the sum of the terms 0 through k of the Binomial probability density.
 
 Sum_(j=0 to k)  [ (n!)/(j!*(n-j)! ]  *  [ p^j * (1-p)^(n-j) ]

 
The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula

y = binomcdf( p, k, n) = Gamma.incompleteBeta( n-k, k+1, 1-p ).

All arguments must be positive,

Parameters:
k - end term.
n - the number of trials.
p - the probability of success (must be in (0.0,1.0)).

binomcdfComplemented

public static double binomcdfComplemented(double p,
                                          int k,
                                          int n)
Returns the sum of the terms k+1 through n of the Binomial probability density.
 
 Sum_(j=k+1 to n)  [ (n!)/(j!*(n-j)! ]  *  [ p^j * (1-p)^(n-j) ]

 
The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula

y = binomcdfComplemented( p, k, n ) = Gamma.incompleteBeta( k+1, n-k, p ).

All arguments must be positive,

Parameters:
k - (start-1) term.
n - the number of trials.
p - the probability of success (must be in (0.0,1.0)).

binominv

public static double binominv(double y,
                              int k,
                              int n)
Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y. That is,
 binomcdf( p, k, n ) = y;
This is accomplished using the inverse beta integral function and the relation

1 - p = Gamma.incompleteBetaInverse( n-k, k+1, y ).

All arguments must be positive.

Parameters:
k - end term.
n - the number of trials.
y - the value to solve for such that binomcdf( p, k, n ) = y.

chi2cdf

public static double chi2cdf(double x,
                             double v)
                      throws ArithmeticException
Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom.
 
 P( x | v ) = 1/ [ 2^(v/2) * Γ(v/2)  ] * Integral_(from 0 to x) [ t^(v/2-1) * e^(-t/2) ] dt
                                  
 
where x is the Chi-square variable.

The incomplete gamma integral is used, according to the formula

y = chi2cdf( x, v ) = Gamma.incompleteGamma( v/2.0, x/2.0 ).

The arguments must both be positive.

Parameters:
v - degrees of freedom.
x - integration end point.
Throws:
ArithmeticException

chi2cdfComplemented

public static double chi2cdfComplemented(double x,
                                         double v)
                                  throws ArithmeticException
Returns the area under the right hand tail (from x to infinity) of the Chi square probability density function with v degrees of freedom.
 
 P( x | v ) = 1/ [ 2^(v/2) * Γ(v/2)  ] * Integral_(from x to infinity) [ t^(v/2-1) * e^(-t/2) ] dt

 
where x is the Chi-square variable. The incomplete gamma integral is used, according to the formula y = chi2cdfComplemented( x, v ) = incompleteGammaComplement( v/2.0, x/2.0 ). The arguments must both be positive.

Parameters:
v - degrees of freedom.
x - integration start point.
Throws:
ArithmeticException

erf

public static double erf(double x)
                  throws ArithmeticException
Returns the error function of the normal distribution. The integral is
 
 erf(x) = 2/sqrt(PI) * Integral_(0 to x) [ e^(-t^2) ] dt
                           
 
Implementation: For 0 <= |x| < 1, erf(x) = x * P4(x^2)/Q5(x^2) ; otherwise erf(x) = 1 - erfc(x).

Parameters:
x - the integral end point.
Throws:
ArithmeticException

erfComplemented

public static double erfComplemented(double a)
                              throws ArithmeticException
Returns the complementary Error function of the normal distribution; formerly named erfc.
 
  erfc(x) = 2/sqrt(PI) * Integral_(x to inf) [ e^(-t^2) ] dt
                = 1 - 2/sqrt(PI) * Integral_(0 to x) [ e^(-t^2) ] dt
                = 1 - erf(x).
                          
 
Implementation: For small x, erfc(x) = 1 - erf(x); otherwise rational approximations are computed.

Parameters:
a - corresponds to x above. An internal check is performed to see if a is < 0 or otherwise. If a < 0 --> x = -a. Else, x = a.
Throws:
ArithmeticException

fcdf

public static double fcdf(double x,
                          int df1,
                          int df2)
Returns the area from zero to x under the F density function (also known as Snedcor's density or the variance ratio density); formerly named fdtr. Corresponds to the left hand tail of the F-distribution. The incomplete beta integral is used, according to the formula

 P(x) = Gamma.incompleteBeta( df1/2, df2/2, (df1*x/(df2 + df1*x) ).

 

Parameters:
df1 - ( greater than or equal to 1 )
df2 - ( greater than or equal to 1 )
x - the integration end point (greater than or equal to 0).

fcdfComplemented

public static double fcdfComplemented(double x,
                                      int df1,
                                      int df2)
Returns the area from x to infinity under the F density function (also known as Snedcor's density or the variance ratio density); formerly named fdtrc. Corresponds to the right hand tail of the F-distribution ( and to the values presented in F-distribution tables). The incomplete beta integral is used, according to the relation
 
 P(x) = Gamma.incompleteBeta( df2/2, df1/2, (df2/(df2 + df1*x) )
 
 

Parameters:
df1 - df1 ( greater than or equal to 1 )
df2 - df2 ( greater than or equal to 1 )
x - the left integration point. Corresponds to the F value in look-up tables.
Returns:
The area from x to inf. under the F density function.

finv

public static double finv(double p,
                          int df1,
                          int df2)
Finds the F density argument x such that the integral from 0 to x of the F density is equal to p; formerly named fdtri. That is, it solves for the value of x, given p such that
 fcdf(x,df1,df2) - p = 0.
 
This is accomplished using the inverse beta integral function and the relations
 z = incompleteBetaInverse( df2/2, df1/2, p ),
 x = 1 - df2 * (1-z) / (df1 z),
if p > Gamma.incompleteBeta( 0.5*df1, 0.5*df2, 0.5 );, or p < 0.001 and,
 z = incompleteBetaInverse( df1/2, df2/2, p ),
 x = 1 - df2 * z / (df1 (1-z)),
 
otherwise. The significance level for a one-sided test (or α in look-up tables) is then 1-x

Parameters:
df1 - degrees of freedom of the first dataset
df2 - degrees of freedom of the second dataset
p - the fcdf value for which to solve for x
Returns:
the point x such that fcdf(x,df1,df2) = p.
Throws:
IllegalArgumentException - if df1 or df2 are less than 1, or p is not in (0,1].

gammacdf

public static double gammacdf(double x,
                              double a,
                              double b)
Returns the integral from zero to x of the gamma probability density function.
 
 y = a^b / Γ(b) * Integral_(0 to x) [ t^(b-1) e^(-a*t) ]dt

 
The incomplete gamma integral is used, according to the relation y = Gamma.incompleteGamma( b, a*x ).

Parameters:
a - the paramater a (alpha) of the gamma distribution.
b - the paramater b (beta, lambda) of the gamma distribution.
x - integration end point.
Returns:
y the integral from zero to x of the gamma pdf.

gammacdfComplemented

public static double gammacdfComplemented(double x,
                                          double a,
                                          double b)
Returns the integral from x to infinity of the gamma probability density function:
 
 y = a^b / Γ(b) * Integral_(from x to infinity) [t^(b-1) * e^(-a*t) ] dt

 
The incomplete gamma integral is used, according to the relation

y = Gamma.incompleteGammaComplement( b, a*x ).

Parameters:
a - the paramater a (alpha) of the gamma distribution.
b - the paramater b (beta, lambda) of the gamma distribution.
x - integration start point.

nbinomcdf

public static double nbinomcdf(double p,
                               int k,
                               int n)
Returns the sum of the terms 0 through k of the Negative Binomial Distribution.
 
 Sum_(from j=0 to k)  [ (n+j-1)! / ( j! * (n-1)! ) ] * [ p^n* (1-p)^j ]
   
 
In a sequence of Bernoulli trials, this is the probability that k or fewer failures precede the n-th success.

The terms are not computed individually; instead the incomplete beta integral is employed, according to the formula

y = nbinomcdf( p, k, n ) = Gamma.incompleteBeta( n, k+1, p ). All arguments must be positive,

Parameters:
k - end term.
n - the number of trials.
p - the probability of success (must be in (0.0,1.0)).

nbinomcdfComplemented

public static double nbinomcdfComplemented(double p,
                                           int k,
                                           int n)
Returns the sum of the terms k+1 to infinity of the Negative Binomial distribution.
 
 Sum_(from j=k+1 to inf)  [ (n+j-1)! / ( j! * (n-1)! ) ] * [ p^n* (1-p)^j ]
 
 
The terms are not computed individually; instead the incomplete beta integral is employed, according to the formula

y = nbinomcdfComplemented( p, k, n ) = Gamma.incompleteBeta( k+1, n, 1-p ). All arguments must be positive,

Parameters:
k - (start-1) term.
n - the number of trials.
p - the probability of success (must be in (0.0,1.0)).

normcdf

public static double normcdf(double a)
                      throws ArithmeticException
Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to x (assumes mean is zero, variance is one).
 
 normcdf(x) = 1/sqrt(2pi) * Integral_(-inf to x) [ e^(-t^2/2) ] dt
           =  ( 1 + erf(z) ) / 2
           =  erfc(z) / 2
           
 
where z = x/sqrt(2). Computation is via the functions erf and erfComplement.

Relation to alpha value (or confidence level):

Parameters:
a - x in the equation above.
Throws:
ArithmeticException

normcdf

public static double normcdf(double x,
                             double mean,
                             double variance)
                      throws ArithmeticException
Returns the area under the Normal (Gaussian) probability density function, integrated from minus infinity to x when the data has not been standardized (i.e. mean ≠0, variance ≠1).
  
 normcdf(x,mean,variance) = 1/sqrt(2pi*v) * Integral_(-inf to x) [ e^(-(t-mean)^2 / 2v) ] dt

 
where v = variance. Computation is via the functions erf.

Parameters:
mean - the mean of the normal distribution.
variance - the variance of the normal distribution.
x - the integration limit.
Throws:
ArithmeticException

norminv

public static double norminv(double y0)
                      throws ArithmeticException
Returns the value, x, for which the area under the Normal (Gaussian) probability density function (integrated from minus infinity to x) is equal to the argument y (assumes mean is zero, variance is one); formerly named ndtri. That is,

normcdf(x) = y0.

For small arguments 0 < y < exp(-2), the program computes z = sqrt( -2.0 * log(y) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z). There are two rational functions P/Q, one for 0 < y < exp(-32) and the other for y up to exp(-2). For larger arguments, w = y - 0.5, and x/sqrt(2pi) = w + w**3 R(w**2)/S(w**2)). Finally, this returns -infinity for y0 = 0 and +infinity for y0 = 1.

Parameters:
y0 - the argument to the function
Throws:
ArithmeticException

poissoncdf

public static double poissoncdf(double mean,
                                int k)
                         throws ArithmeticException
Returns the sum of the first k terms of the Poisson distribution.
 
 Sum_(j=0 to k) [ e^(-m) *  m^j /j! ]

 
The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation

y = poissoncdf( mean, k m ) = Gamma.incompleteGammaComplement( k+1, mean ). The arguments must both be positive.

Parameters:
k - number of terms.
mean - the mean of the poisson distribution.
Throws:
ArithmeticException

poissoncdfComplemented

public static double poissoncdfComplemented(int k,
                                            double mean)
                                     throws ArithmeticException
Returns the sum of the terms k+1 to Infinity of the Poisson distribution.
  
 Sum_(j=k+1 to inf.) [ e^(-m) *  m^j /j! ]
 
 
The terms are not summed directly; instead the incomplete gamma integral is employed, according to the formula

y = poissoncdfComplemented( k, m ) = Gamma.incompleteGamma( k+1, m ). The arguments must both be positive.

Parameters:
k - (start-1) term.
mean - the mean of the poisson distribution.
Throws:
ArithmeticException

tcdf

public static double tcdf(double t,
                          double k)
                   throws ArithmeticException
Returns the integral from minus infinity to t of the Student-t distribution with k > 0 degrees of freedom.
 
 Γ((k+1)/2) / [ sqrt(k*pi) * Γ(k/2)  ] * Integral_(-inf to t) [  ( 1+x^2/k )^( (-k+1)/2 ) ] dx
 
 
Relation to incomplete beta integral:


tcdf(t,k) = 0.5 * Gamma.incompleteBeta( k/2, 1/2, z ), when t < 0,

tcdf(t,k) = 1-0.5 * Gamma.incompleteBeta( k/2, 1/2, z ), when t > 0,

where z = k/(k + t^2). Relation to alpha value (or confidence level):

Parameters:
k - degrees of freedom.
t - integration end point.
Throws:
ArithmeticException

tinv

public static double tinv(double p,
                          double k)
Returns the value, t, for which the area under the Student-t probability density function (integrated from minus infinity to t) is equal to p. The function uses the tcdf function to determine the return value iteratively.

Parameters:
p - the fcdf value for which to solve for x
k - the degrees of freedom


Processing library papaya by Adila Faruk. (C) 2014