# Probability Density Function

## Definition of Probability Density Function

We call $$X$$ a continuous random variable if $$X$$ can take any value on an interval, which is often the entire set of real numbers.

Every continuous random variable $$X$$ has a probability density function (PDF), written $$f\left( x \right),$$ that satisfies the following conditions:

1. $$f\left( x \right) \ge 0$$ for all $$x,$$ and
2. $$\int\limits_{ - \infty }^\infty {f\left( x \right)dx} = 1.$$

The probability that a random variable $$X$$ takes on values in the interval $$a \le X \le b$$ is defined as

$P\left( {a \le X \le b} \right) = \int\limits_a^b {f\left( x \right)dx} ,$

which is the area under the curve $$f\left( x \right)$$ from $$x = a$$ to $$x = b.$$
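The two conditions and the interval probability can be checked numerically. The sketch below is only an illustration: the density $$f\left( x \right) = 2x$$ on $$\left[ {0,1} \right]$$ and the helper `integrate` (a plain midpoint rule) are assumptions chosen for this example, not part of the definition.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def density(x):
    """Illustrative density (an assumption): f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# Condition 2: the total area under the density should be 1.
total = integrate(density, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area under the curve between 0.25 and 0.5.
prob = integrate(density, 0.25, 0.5)

print(total, prob)
```

For this density the exact values are $$1$$ and $$0.5^2 - 0.25^2 = 0.1875,$$ so the printed numbers should agree with them up to rounding.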

## Mean and Median

If a random variable $$X$$ has a density function $${f\left( x \right)},$$ then we define the mean value (also known as the average value or the expectation) of $$X$$ as

$\mu = \int\limits_{ - \infty }^\infty {xf\left( x \right)dx}.$

The median of a continuous probability distribution with density $$f\left( x \right)$$ is the value $$x = m$$ that splits the area under the density into two portions, each equal to $$\frac{1}{2}:$$

$\int\limits_{ - \infty }^m {f\left( x \right)dx} = \int\limits_m^\infty {f\left( x \right)dx} = \frac{1}{2}.$

Note that not every PDF has a mean value. For example, the Cauchy distribution has no mean, because the integral that defines $$\mu$$ does not converge.
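When the equation for the median cannot be solved by hand, $$m$$ can be located numerically. The following sketch uses bisection on the cumulative probability; the density $$f\left( x \right) = 2x$$ on $$\left[ {0,1} \right]$$ (with cumulative distribution $$x^2$$) and the function names are assumptions made for illustration.

```python
def cdf(x):
    """CDF of the illustrative density f(x) = 2x on [0, 1]: F(x) = x^2."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x


def find_median(cdf, lo, hi, tol=1e-12):
    """Solve F(m) = 1/2 by bisection, assuming F is continuous and increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid   # less than half of the area lies to the left: move right
        else:
            hi = mid   # at least half of the area lies to the left: move left
    return 0.5 * (lo + hi)


m = find_median(cdf, 0.0, 1.0)
print(m)
```

For this density the exact median is $$m = \frac{1}{\sqrt 2},$$ which differs from the mean $$\mu = \frac{2}{3}.$$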

## Variance

The variance of a continuous random variable is defined by the integral

${\sigma ^2} = \int\limits_{ - \infty }^\infty {{{\left( {x - \mu } \right)}^2}f\left( x \right)dx} ,$

where $$\mu$$ is the mean of the random variable $$X.$$
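The mean and variance integrals can be approximated in the same way. This is a minimal sketch under the assumption of the illustrative density $$f\left( x \right) = 2x$$ on $$\left[ {0,1} \right];$$ the midpoint-rule helper `integrate` is hypothetical and not tied to any library.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def density(x):
    """Illustrative density (an assumption): f(x) = 2x on [0, 1]."""
    return 2.0 * x


# Mean: integral of x * f(x) over the support.
mu = integrate(lambda x: x * density(x), 0.0, 1.0)

# Variance: integral of (x - mu)^2 * f(x) over the support.
var = integrate(lambda x: (x - mu) ** 2 * density(x), 0.0, 1.0)

print(mu, var)
```

The exact values for this density are $$\mu = \frac{2}{3}$$ and $${\sigma ^2} = \frac{1}{18}.$$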

## Uniform Distribution

The simplest PDF is that of the uniform distribution. The density of the uniform distribution is defined by

$f\left( x \right) = \frac{1}{{b - a}}\;\;\text{for}\;\;a \le x \le b.$

The mean value of the uniform distribution across the interval $$\left[ {a,b} \right]$$ is

$\mu = \int\limits_a^b {xf\left( x \right)dx} = \frac{{a + b}}{2}.$

If a random variable $$X$$ is distributed uniformly in the interval $$\left[ {a,b} \right],$$ the probability that $$X$$ falls within a range $$\left[ {c,d} \right] \subseteq \left[ {a,b} \right]$$ is expressed by the formula

$P\left( {c \le X \le d} \right) = \int\limits_c^d {f\left( x \right)dx} = \int\limits_c^d {\frac{{dx}}{{b - a}}} = \frac{{d - c}}{{b - a}}.$

The variance of the distribution is

${\sigma ^2} = \int\limits_a^b {{{\left( {x - \mu } \right)}^2}f\left( x \right)dx} = \frac{{{{\left( {b - a} \right)}^2}}}{{12}}.$
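The closed-form results for the uniform distribution translate directly into code. The function names and the interval $$\left[ {2,10} \right]$$ below are illustrative assumptions.

```python
def uniform_mean(a, b):
    """Mean of the uniform distribution on [a, b]."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of the uniform distribution on [a, b]."""
    return (b - a) ** 2 / 12


def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b], assuming a <= c <= d <= b."""
    return (d - c) / (b - a)


a, b = 2.0, 10.0                      # illustrative interval
print(uniform_mean(a, b))             # midpoint of [2, 10]
print(uniform_variance(a, b))         # (10 - 2)^2 / 12
print(uniform_prob(3.0, 5.0, a, b))   # length 2 out of total length 8
```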

## Exponential Distribution

The exponential distribution is a continuous distribution that is commonly used to describe the waiting time until some specific event occurs. For example, the time until a hurricane or other dangerous weather event occurs is often modeled by an exponential distribution.

The probability density function (PDF) of the one-parameter exponential distribution is

$f\left( x \right) = \lambda {e^{ - \lambda x}},\;\;x \ge 0,$

where the rate $$\lambda$$ represents the average number of events per unit of time.

The mean value (or the average waiting time until the next event) is $$\mu = \frac{1}{\lambda }.$$ The median of the exponential distribution is $$m = \frac{{\ln 2}}{\lambda },$$ and the variance is given by $${\sigma ^2} = \frac{1}{{{\lambda ^2}}}.$$
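These three formulas can be compared with a simple simulation. The sketch below draws samples with `random.expovariate` from Python's standard library; the rate $$\lambda = 2,$$ the sample size, and the fixed seed are arbitrary choices made for illustration.

```python
import math
import random

rate = 2.0                  # illustrative value of lambda (an assumption)
rng = random.Random(0)      # fixed seed so the run is repeatable
n = 100_000

# Draw n exponential waiting times and sort them to read off the median.
samples = sorted(rng.expovariate(rate) for _ in range(n))

sample_mean = sum(samples) / n
sample_median = 0.5 * (samples[n // 2 - 1] + samples[n // 2])
sample_var = sum((x - sample_mean) ** 2 for x in samples) / n

# Each line prints the simulated value next to the theoretical one.
print(sample_mean, 1 / rate)
print(sample_median, math.log(2) / rate)
print(sample_var, 1 / rate ** 2)
```

The simulated values should be close to the theoretical ones but not identical, since they are estimates from a finite sample.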

## Normal Distribution

The normal distribution is the most widely known probability distribution because it describes, at least approximately, many natural phenomena.

The PDF of the normal distribution is given by the formula

$f\left( x \right) = \frac{1}{{\sqrt {2\pi {\sigma ^2}} }}{e^{ - \frac{{{{\left( {x - \mu } \right)}^2}}}{{2{\sigma ^2}}}}},$

where $$\mu$$ is the mean of the distribution, and $${\sigma^2}$$ is the variance.

The two parameters $$\mu$$ and $${\sigma}$$ entirely define the shape and all other properties of the normal distribution function.

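If a random variable $$X$$ follows the normal distribution with the parameters $$\mu$$ and $$\sigma,$$ we write $$X \sim N\left( {\mu ,\sigma } \right).$$ Here the second argument is the standard deviation; many texts write $$N\left( {\mu ,{\sigma ^2}} \right)$$ instead.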

The normal distribution is said to be standard when $$\mu = 0$$ and $$\sigma = 1.$$ In this special case, the random variable is usually denoted by $$Z,$$ and its values are called standard scores or $$Z$$-scores. Thus, by definition, $$Z \sim N\left( {0 ,1} \right).$$

Every normal random variable $$X$$ can be transformed into a $$Z$$-score by using the substitution

$z = \frac{{x - \mu }}{\sigma }.$

Pay attention to the notations: $$X, Z$$ denote the random variables, and $$x,z$$ denote the possible values of the variables.

To compute probabilities for $$Z,$$ we use a standard normal table ($$Z$$-table) or a software tool.

To find the probability that a normally distributed random variable $$X$$ falls within a range $$\left[ {a,b} \right],$$ we rely on the $$Z$$-score and use the formula

$P\left( {a \le X \le b} \right) = P\left( {\frac{{a - \mu }}{\sigma } \le Z \le \frac{{b - \mu }}{\sigma }} \right).$
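As one possible software tool, the standard normal cumulative probability can be written with the error function `math.erf` from Python's standard library. The names `phi` and `normal_prob` and the values $$\mu = 100,$$ $$\sigma = 10$$ are assumptions made for this sketch.

```python
import math


def phi(z):
    """Standard normal CDF P(Z <= z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for a normal X with mean mu and standard deviation sigma."""
    za = (a - mu) / sigma   # Z-score of the lower bound
    zb = (b - mu) / sigma   # Z-score of the upper bound
    return phi(zb) - phi(za)


# Probability of falling within one standard deviation of the mean.
p = normal_prob(90.0, 110.0, 100.0, 10.0)
print(p)
```

The result is about $$0.6827,$$ the familiar share of a normal distribution lying within one standard deviation of the mean.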

## Solved Problems

### Example 1.

Calculate the mean value $$\mu$$ and the variance $${\sigma^2}$$ of the uniform distribution $$f\left( x \right) = \frac{1}{{b - a}}$$ for $$a \le x \le b.$$

Solution.

First we find the mean $$\mu:$$

$\mu = \int\limits_a^b {xf\left( x \right)dx} = \int\limits_a^b {\frac{{xdx}}{{b - a}}} = \frac{1}{{b - a}}\int\limits_a^b {xdx} = \frac{1}{{b - a}}\left. {\left( {\frac{{{x^2}}}{2}} \right)} \right|_a^b = \frac{1}{{b - a}} \cdot \frac{{{b^2} - {a^2}}}{2} = \frac{{\cancel{\left( {b - a} \right)}\left( {b + a} \right)}}{{2\cancel{\left( {b - a} \right)}}} = \frac{{a +b}}{2}.$

Now let's derive the expression for the variance $${\sigma ^2}.$$ By definition,

${\sigma ^2} = \int\limits_a^b {{{\left( {x - \mu } \right)}^2}f\left( x \right)dx} .$

Expanding the square in the integrand, we can write:

${\sigma ^2} = \int\limits_a^b {\left( {{x^2} - 2\mu x + {\mu ^2}} \right)f\left( x \right)dx} = \int\limits_a^b {{x^2}f\left( x \right)dx} - 2\mu \int\limits_a^b {xf\left( x \right)dx} + {\mu ^2}\int\limits_a^b {f\left( x \right)dx} .$

Recall that

$\int\limits_a^b {xf\left( x \right)dx} = \mu ,\;\;\; \int\limits_a^b {f\left( x \right)dx} = 1.$

Then

${\sigma ^2} = \int\limits_a^b {{x^2}f\left( x \right)dx} - 2{\mu ^2} + {\mu ^2} = \int\limits_a^b {{x^2}f\left( x \right)dx} - {\mu ^2} = \frac{1}{{b - a}}\int\limits_a^b {{x^2}dx} - {\left( {\frac{{a + b}}{2}} \right)^2} = \frac{1}{{b - a}}\left. {\frac{{{x^3}}}{3}} \right|_a^b - {\left( {\frac{{a + b}}{2}} \right)^2} = \frac{{{b^3} - {a^3}}}{{3\left( {b - a} \right)}} - {\left( {\frac{{a + b}}{2}} \right)^2} = \frac{{{b^2} + ab + {a^2}}}{3} - \frac{{{a^2} + 2ab + {b^2}}}{4} = \frac{{{b^2} - 2ab + {a^2}}}{{12}} = \frac{{{{\left( {b - a} \right)}^2}}}{{12}}.$

### Example 2.

Let $$X$$ be a random variable distributed uniformly in the interval $$\left[ {{x_0} - L,{x_0} + L} \right].$$ Find the mean $$\mu$$ and variance $${\sigma^2}$$ of the random variable $$X.$$

Solution.

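The density is $$f\left( x \right) = \frac{1}{{2L}}$$ on the interval, since the interval has length $$2L.$$ We first verify that the mean value coincides with the midpoint of the interval: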

$\mu = \int\limits_{{x_0} - L}^{{x_0} + L} {xf\left( x \right)dx} = \frac{1}{{2L}}\int\limits_{{x_0} - L}^{{x_0} + L} {xdx} = \frac{1}{{2L}}\left. {\frac{{{x^2}}}{2}} \right|_{{x_0} - L}^{{x_0} + L} = \left. {\frac{{{x^2}}}{{4L}}} \right|_{{x_0} - L}^{{x_0} + L} = \frac{1}{{4L}}\left[ {{{\left( {{x_0} + L} \right)}^2} - {{\left( {{x_0} - L} \right)}^2}} \right] = \frac{1}{{4L}}\left[ {\cancel{x_0^2} + 2{x_0}L + \cancel{L^2} - \cancel{x_0^2} + 2{x_0}L - \cancel{L^2}} \right] = \frac{{\cancel{4}{x_0}\cancel{L}}}{{\cancel{4L}}} = {x_0}.$

Compute the variance:

${\sigma ^2} = \int\limits_{{x_0} - L}^{{x_0} + L} {{{\left( {x - \mu } \right)}^2}f\left( x \right)dx} = \frac{1}{{2L}}\int\limits_{{x_0} - L}^{{x_0} + L} {{{\left( {x - {x_0}} \right)}^2}dx} = \frac{1}{{2L}}\left. {\frac{{{{\left( {x - {x_0}} \right)}^3}}}{3}} \right|_{{x_0} - L}^{{x_0} + L} = \frac{1}{{6L}}\left[ {{L^3} - {{\left( { - L} \right)}^3}} \right] = \frac{{2{L^3}}}{{6L}} = \frac{{{L^2}}}{3}.$
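As a cross-check, the same values follow from the general formulas of Example 1 after substituting $$a = {x_0} - L$$ and $$b = {x_0} + L:$$

$\mu = \frac{{a + b}}{2} = \frac{{\left( {{x_0} - L} \right) + \left( {{x_0} + L} \right)}}{2} = {x_0},\;\;\; {\sigma ^2} = \frac{{{{\left( {b - a} \right)}^2}}}{{12}} = \frac{{{{\left( {2L} \right)}^2}}}{{12}} = \frac{{{L^2}}}{3}.$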

### Example 3.

Find the mean value $$\mu$$ and the median $$m$$ of the exponential distribution $$f\left( x \right) = \lambda {e^{ - \lambda x}},$$ $$x \ge 0.$$

Solution.

The mean value $$\mu$$ is determined by the integral

$\mu = \int\limits_{ - \infty }^\infty {xf\left( x \right)dx} = \lambda \int\limits_0^\infty {x{e^{ - \lambda x}}dx} .$

Integrating by parts, we have

$\mu = \lambda \int\limits_0^\infty {x{e^{ - \lambda x}}dx} = \left[ {\begin{array}{*{20}{l}} {u = x}\\ {dv = {e^{ - \lambda x}}dx}\\ {du = dx}\\ {v = - \frac{1}{\lambda }{e^{ - \lambda x}}} \end{array}} \right] = \lambda \left[ { - \left. {\frac{x}{\lambda }{e^{ - \lambda x}}} \right|_0^\infty - \int\limits_0^\infty {\left( { - \frac{1}{\lambda }{e^{ - \lambda x}}} \right)dx} } \right] = \int\limits_0^\infty {{e^{ - \lambda x}}dx} - \left. {x{e^{ - \lambda x}}} \right|_0^\infty = - \frac{1}{\lambda }\left. {{e^{ - \lambda x}}} \right|_0^\infty - \left. {x{e^{ - \lambda x}}} \right|_0^\infty .$

We evaluate the second term with the help of L'Hôpital's rule:

$\left. {x{e^{ - \lambda x}}} \right|_0^\infty = \lim \limits_{b \to \infty } \left[ {\left. {x{e^{ - \lambda x}}} \right|_0^b} \right] = \lim \limits_{b \to \infty } \frac{b}{{{e^{\lambda b}}}} = \left[ {\frac{\infty }{\infty }} \right] = \lim \limits_{b \to \infty } \frac{{b^\prime}}{{\left( {{e^{\lambda b}}} \right)^\prime}} = \lim \limits_{b \to \infty } \frac{1}{{\lambda {e^{\lambda b}}}} = 0.$

Hence, the mean (average) value of the exponential distribution is

$\mu = - \frac{1}{\lambda }\left. {{e^{ - \lambda x}}} \right|_0^\infty = - \frac{1}{\lambda }\left( {0 - 1} \right) = \frac{1}{\lambda }.$

Determine the median $$m:$$

$\int\limits_{ - \infty }^m {f\left( x \right)dx} = \frac{1}{2},\;\; \Rightarrow \lambda \int\limits_0^m {{e^{ - \lambda x}}dx} = \frac{1}{2},\;\; \Rightarrow \lambda \left. {\left( { - \frac{1}{\lambda }{e^{ - \lambda x}}} \right)} \right|_0^m = \frac{1}{2},\;\; \Rightarrow - {e^{ - \lambda m}} + {e^0} = \frac{1}{2},\;\; \Rightarrow {e^{ - \lambda m}} = \frac{1}{2},\;\; \Rightarrow {e^{\lambda m}} = 2,\;\; \Rightarrow \lambda m = \ln 2,\;\; \Rightarrow m = \frac{{\ln 2}}{\lambda }.$

### Example 4.

Assume that the waiting time for your next email is described by the exponential density function with rate $$\lambda = 3$$ (emails per hour). Determine the probability that you receive no email during the next hour.

Solution.

The probability density function has the form

$f\left( t \right) = \lambda {e^{ - \lambda t}} = 3{e^{ - 3t}},$

where the time $$t$$ is measured in hours.

Let's first calculate the probability that the next email arrives within the hour. Integrating the exponential density function from $$t = 0$$ to $$t = 1,$$ we have

$P\left( {0 \le t \le 1} \right) = \int\limits_0^1 {f\left( t \right)dt} = \int\limits_0^1 {3{e^{ - 3t}}dt} = 3\int\limits_0^1 {{e^{ - 3t}}dt} = 3 \cdot \left. {\left( { - \frac{1}{3}{e^{ - 3t}}} \right)} \right|_0^1 = 1 - {e^{ - 3}}.$

So, the probability $${P^C}$$ of the opposite (complementary) event (that is, that you will not receive any email within an hour) is equal to

${P^C} = 1 - P\left( {0 \le t \le 1} \right) = 1 - \left( {1 - {e^{ - 3}}} \right) = {e^{ - 3}} \approx 0.05 = 5\%.$
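The arithmetic is easy to confirm in code. The sketch below evaluates the probability of no event before time $$t$$ directly as $${e^{ - \lambda t}};$$ the function name `prob_no_event` is a hypothetical helper introduced for this example.

```python
import math


def prob_no_event(rate, t):
    """P(T > t) for an exponential waiting time T with the given rate."""
    return math.exp(-rate * t)


# Rate of 3 emails per hour, waiting for 1 hour.
p_none = prob_no_event(3.0, 1.0)
print(round(p_none, 4))
```

The value is about $$0.0498,$$ which rounds to the $$5\%$$ found above.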
