

29.9 GEOMETRIC DISTRIBUTION

The distribution of the number of trials up to and including the first success in a sequence of Bernoulli trials is called a geometric distribution. The key characteristics of the geometric distribution are summarized in Table 29.9.

TABLE 29.9 Geometric Distribution G(p)


1.  Parameters: p = probability of success, 0 < p < 1
2.  Range: x = 1, 2, ..., ∞
3.  pmf: $f(x) = (1 - p)^{x-1} p$
4.  CDF: $F(x) = 1 - (1 - p)^{x}$
5.  Mean: $1/p$
6.  Variance: $(1 - p)/p^{2}$

The geometric distribution is a discrete equivalent of the exponential distribution. It is a memoryless distribution in the sense that remembering the results of past attempts does not help in predicting the future.

The geometric distribution is used to model the number of attempts between successive failures (or successes), for example,

1.  the number of local queries to a database between successive accesses to the remote database,
2.  the number of packets successfully transmitted between those requiring a retransmission, or
3.  the number of successive error-free bits between in-error bits in a packet received on a noisy link.

Another common application of the geometric distribution is to model batch sizes with batches arriving in a Poisson stream. Under this condition, the arrivals remain memoryless and are easy to model.

Geometric variates can be easily generated using inverse transformation. Generate a U(0, 1) random number u and compute

$G(p) = \left\lceil \frac{\ln(u)}{\ln(1 - p)} \right\rceil$

Here, ⌈·⌉ denotes rounding up to the next larger integer.
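
The inverse transformation for G(p) takes the ceiling of ln(u)/ln(1 – p). A minimal Python sketch of that step follows; the function name `geometric_variate` and the `rng` argument are illustrative choices, not part of the text, and the guard against ln(0) is an added assumption about how to handle the endpoint of the uniform generator.

```python
import math
import random

def geometric_variate(p, rng=random):
    """Return a G(p) variate: trials up to and including the first success."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm below is always defined.
    u = 1.0 - rng.random()
    x = math.ceil(math.log(u) / math.log(1.0 - p))
    # u == 1.0 would give 0; the range of G(p) starts at 1.
    return max(1, x)
```

Averaging many calls with a fixed p should give a value close to the mean 1/p listed in Table 29.9.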

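29.10 LOGNORMAL DISTRIBUTION

TABLE 29.10 Lognormal Distribution LN(µ, σ)


1.  Parameters: µ = mean of ln(x), µ > 0
σ = standard deviation of ln(x), σ > 0
2.  Range: 0 ≤ x ≤ ∞
3.  pdf: $f(x) = \frac{1}{\sigma x \sqrt{2\pi}} e^{-(\ln x - \mu)^{2}/(2\sigma^{2})}$
4.  Mean: $e^{\mu + \sigma^{2}/2}$
5.  Variance: $e^{2\mu + \sigma^{2}}(e^{\sigma^{2}} - 1)$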

A variate whose logarithm is normally distributed has a lognormal distribution. In regression modeling and analysis of experimental designs, a log transformation is often used. In such cases, the response in the transformed model has a normal distribution while the original response has a lognormal distribution. The key characteristics of a lognormal distribution are summarized in Table 29.10.

Note that µ and σ are the mean and standard deviation of ln(x) and should not be confused with those of the lognormal variate x.

The product of a large number of positive random variables tends to have an approximate lognormal distribution. It is therefore used to model errors that are a product of effects of a large number of factors.

Lognormal variates can be generated by exponentiating a normal variate. Generate x ~ N(0, 1) and return $e^{\mu + \sigma x}$.
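
The generation step amounts to exponentiating µ + σx for a unit normal x. The following Python sketch illustrates it under the assumption that the standard library's `random.gauss` is an acceptable source of N(0, 1) variates; the function name `lognormal_variate` is hypothetical.

```python
import math
import random

def lognormal_variate(mu, sigma, rng=random):
    """Return an LN(mu, sigma) variate; mu and sigma describe ln(x), not x."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    z = rng.gauss(0.0, 1.0)          # unit normal variate
    return math.exp(mu + sigma * z)  # exponentiate to obtain the lognormal
```

The sample mean of the generated values should approach exp(µ + σ²/2), not exp(µ), which is a quick way to see that µ and σ are parameters of ln(x).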

29.11 NEGATIVE BINOMIAL DISTRIBUTION

In a sequence of Bernoulli trials, the number of failures x before the mth success has a negative binomial distribution. The key characteristics of a negative binomial distribution are summarized in Table 29.11.

TABLE 29.11 Negative Binomial Distribution NB(p,m)


1.  Parameters: p = probability of success, 0 < p < 1
m = number of successes; m must be a positive integer
2.  Range: x = 0, 1, 2, ..., ∞
3.  pmf: $f(x) = \binom{m+x-1}{m-1} p^{m}(1-p)^{x} = \frac{\Gamma(m+x)}{\Gamma(m)\,\Gamma(x+1)} p^{m}(1-p)^{x}$
The second expression allows a negative binomial to be defined for noninteger values of m.
4.  Mean: $m(1 - p)/p$
5.  Variance: $m(1 - p)/p^{2}$

The negative binomial distribution is used to model the number of failures before the mth success; for example:

1.  Number of local queries to a database system before the mth remote query
2.  Number of retransmissions for a message consisting of m packets
3.  Number of error-free bits received on a noisy link before the mth in-error bit

The variance of NB(p,m) is greater than the mean for all values of p and m. Therefore, this distribution may be used in place of a Poisson distribution, which has a variance equal to the mean, or in place of a binomial distribution, which has a variance less than the mean.

Negative binomial variates can be generated as follows:

1.  Generate ui ~ U(0, 1) until m of the ui’s are less than or equal to p (each such ui is a success, which occurs with probability p). Return the count of ui’s greater than p (the failures) as NB(p,m).
2.  The sum of m geometric variates G(p) gives the total number of trials for m successes. Thus, NB(p,m) can be obtained from m geometric variates as follows:
$NB(p, m) = \left( \sum_{i=1}^{m} G_i(p) \right) - m$

3.  The following composition method may be used for integer as well as noninteger values of m:
(a) Generate a gamma variate y ~ Γ((1 – p)/p, m), that is, with scale (1 – p)/p and shape m, so that its mean m(1 – p)/p equals the mean of NB(p,m).
(b) Generate a Poisson variate x ~ Poisson(y).
(c) Return x as NB(p,m).
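
The first two generation methods can be sketched in Python as shown below. This is an illustration under stated assumptions rather than a reference implementation: a trial is counted as a success when u ≤ p, so that the result agrees with the mean m(1 – p)/p in Table 29.11, and the function names are hypothetical. The composition method is omitted because the standard library has no Poisson generator.

```python
import math
import random

def geometric_variate(p, rng=random):
    """G(p) by inverse transformation: ceil(ln(u) / ln(1 - p))."""
    u = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))

def neg_binomial_by_trials(p, m, rng=random):
    """Method 1: simulate Bernoulli trials until the m-th success."""
    successes = 0
    failures = 0
    while successes < m:
        if rng.random() <= p:   # success with probability p (assumed convention)
            successes += 1
        else:
            failures += 1
    return failures

def neg_binomial_by_geometric(p, m, rng=random):
    """Method 2: total trials for m successes, minus the m successes."""
    total_trials = sum(geometric_variate(p, rng) for _ in range(m))
    return total_trials - m
```

Method 1 needs a number of uniform variates that grows as p shrinks, while method 2 always uses exactly m uniforms, which may matter when p is small.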

29.12 NORMAL DISTRIBUTION

Also known as the Gaussian distribution, the normal distribution was actually discovered by Abraham De Moivre in 1733. Gauss and Laplace rediscovered it in 1809 and 1812, respectively. The normal distribution N(0, 1) with µ = 0 and σ = 1 is called the unit normal distribution or standard normal distribution. The key characteristics of the normal distribution are summarized in Table 29.12.

TABLE 29.12 Normal Distribution N(µ, σ)


1.  Parameters: µ = mean
σ = standard deviation, σ > 0
2.  Range: –∞ ≤ x ≤ ∞
3.  pdf: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^{2}/(2\sigma^{2})}$
4.  Mean: µ
5.  Variance: $\sigma^{2}$

The normal distribution is used whenever the randomness is caused by several independent sources acting additively; for example:

1.  Errors in measurement
2.  Error in modeling to account for a number of factors that are not included in the model
3.  Sample means of a large number of independent observations from a given distribution

Normal variates can be generated as follows:

1.  Convolution: The sum of a large number of uniform ui ~ U(0, 1) variates has an approximately normal distribution:
$N(\mu, \sigma) \sim \mu + \sigma \frac{\sum_{i=1}^{n} u_i - n/2}{(n/12)^{1/2}}$
Generally, n = 12 is used.
2.  Box-Muller Method (Box and Muller (1958)): Generate two uniform variates u1 and u2 and compute two independent normal variates N(µ, σ) as follows:
$x_1 = \mu + \sigma \cos(2\pi u_1) \sqrt{-2 \ln u_2}$
$x_2 = \mu + \sigma \sin(2\pi u_1) \sqrt{-2 \ln u_2}$
There is some concern that if this method is used with u’s from an LCG, the resulting x’s may be correlated. See Edgeman (1989).
3.  Polar Method (Marsaglia and Bray (1964)):
(a) Generate two U(0, 1) variates u1 and u2.
(b) Let v1 = 2u1 – 1, v2 = 2u2 – 1, and $r = v_1^{2} + v_2^{2}$.
(c) If r ≥ 1, go back to step 3a; otherwise let $s = [(-2 \ln r)/r]^{1/2}$ and return:
$x_1 = \mu + \sigma v_1 s$
$x_2 = \mu + \sigma v_2 s$
x1 and x2 are two independent N(µ, σ) variates.
4.  Rejection Method:
(a) Generate two uniform U(0, 1) variates u1 and u2.
(b) Let x = –ln u1.
(c) If $u_2 > e^{-(x-1)^{2}/2}$, go back to step 4a.
(d) Generate u3.
(e) If u3 > 0.5, return µ + σ x; otherwise return µ – σx.
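
The four generation methods above can be sketched in Python as follows. The function names are illustrative, and two details are assumptions added for numerical safety rather than part of the text: uniforms that feed a logarithm are drawn as 1 – u so they lie in (0, 1], and the polar method also rejects r = 0.

```python
import math
import random

def normal_convolution(mu, sigma, n=12, rng=random):
    """Approximate N(mu, sigma) from the sum of n U(0, 1) variates."""
    total = sum(rng.random() for _ in range(n))
    return mu + sigma * (total - n / 2.0) / math.sqrt(n / 12.0)

def normal_box_muller(mu, sigma, rng=random):
    """Box-Muller: two independent N(mu, sigma) variates from two uniforms."""
    u1 = rng.random()
    u2 = 1.0 - rng.random()               # in (0, 1], avoids log(0)
    radius = math.sqrt(-2.0 * math.log(u2))
    angle = 2.0 * math.pi * u1
    return (mu + sigma * radius * math.cos(angle),
            mu + sigma * radius * math.sin(angle))

def normal_polar(mu, sigma, rng=random):
    """Polar method: retry until (v1, v2) falls inside the unit circle."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        r = v1 * v1 + v2 * v2
        if 0.0 < r < 1.0:                 # r == 0 rejected as an added guard
            s = math.sqrt(-2.0 * math.log(r) / r)
            return mu + sigma * v1 * s, mu + sigma * v2 * s

def normal_rejection(mu, sigma, rng=random):
    """Rejection: half-normal from an exponential variate, then a random sign."""
    while True:
        u1 = 1.0 - rng.random()           # in (0, 1], avoids log(0)
        u2 = rng.random()
        x = -math.log(u1)                 # exponential variate with mean 1
        if u2 <= math.exp(-((x - 1.0) ** 2) / 2.0):
            break                         # accept x
    u3 = rng.random()
    return mu + sigma * x if u3 > 0.5 else mu - sigma * x
```

The Box-Muller and polar functions return a pair, so a caller that needs one variate at a time would typically cache the second value. The convolution method with n = 12 can never produce values farther than 6σ from µ, which is one reason to treat it as an approximation.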

TABLE 29.13 Pareto Distribution Pareto(a)


1.  Parameters: a = shape parameter, a > 0
2.  Range: 1 ≤ x ≤ ∞
3.  pdf: $f(x) = a x^{-(a+1)}$
4.  CDF: $F(x) = 1 - x^{-a}$
5.  Mean: $a/(a - 1)$, provided a > 1
6.  Variance: $a/[(a - 1)^{2}(a - 2)]$, provided a > 2



Copyright © John Wiley & Sons, Inc.