Normal distribution in R | Examples

The Normal distribution functions

dnorm(x, mean = 0, sd = 1, log = FALSE)
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
rnorm(n, mean = 0, sd = 1)

If mean or sd are not specified they assume the default values of 0 and 1, respectively.

The normal distribution has density:

$f(x)=\Large \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x-\mu)^{2}}{2 \sigma^{2}}}$

where $\mu$ is the mean of the distribution and $\sigma$ the standard deviation.
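As a quick sanity check, the density formula can be written out by hand and compared with dnorm. This is only a sketch: `normal_pdf` is a hypothetical helper name, not a base R function, and the evaluation points are arbitrary.

```r
# Hypothetical helper (not part of base R): the density formula typed out directly
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- c(130, 150, 170)                          # arbitrary evaluation points
manual  <- normal_pdf(x, mu = 150, sigma = 20)
builtin <- dnorm(x, mean = 150, sd = 20)

# TRUE when both vectors agree up to floating-point error
all.equal(manual, builtin)
```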

dnorm gives the density (PDF)
pnorm gives the distribution function (CDF)
qnorm gives the quantile function
rnorm generates random deviates

For rnorm the length of the result is determined by n; for the other functions it is the maximum of the lengths of the numerical arguments.
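A minimal sketch of how the four functions fit together and how the result length is determined. The seed and the input values are arbitrary choices made for illustration.

```r
set.seed(1)                            # arbitrary seed, only for reproducibility

p <- pnorm(1.96)                       # CDF: P(Z <= 1.96) for a standard normal
z <- qnorm(p)                          # the quantile function inverts the CDF

draws <- rnorm(5, mean = 10, sd = 2)   # result length is set by n = 5

# Vectorised call: three x values paired with three means give three densities
dens <- dnorm(c(0, 1, 2), mean = c(0, 1, 2), sd = 1)

length(draws)
length(dens)
```

Here `z` recovers 1.96 up to floating-point error, and each entry of `dens` is the standard normal density at 0, because every x value sits exactly at its own mean.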

For sd = 0 this gives the limit as sd decreases to 0, a point mass at $\mu$. sd < 0 is an error and returns NaN.
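The edge cases can be probed directly. This is a small sketch with an arbitrary mean of 5; `suppressWarnings()` is used only to keep the warning that accompanies a negative sd out of the output.

```r
d_at    <- dnorm(5,   mean = 5, sd = 0)   # density at the point mass
p_below <- pnorm(4.9, mean = 5, sd = 0)   # CDF just below the point mass
p_at    <- pnorm(5,   mean = 5, sd = 0)   # CDF at the point mass
r_const <- rnorm(3,   mean = 5, sd = 0)   # every draw equals the mean

# A negative sd yields NaN together with a warning
d_neg <- suppressWarnings(dnorm(0, mean = 0, sd = -1))

list(d_at = d_at, p_below = p_below, p_at = p_at, r_const = r_const, d_neg = d_neg)
```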

Example: Area under PDF based on condition

What percentage of students are taller than 170 if height is normally distributed, $\mathcal N(150,20)$?

This is easy to solve once we know that about 68.2% of the distribution lies inside $[\mu-\sigma, \mu+\sigma]$, which here is $[130, 170]$.

By symmetry the remainder is split equally between the two tails, so our percentage should be $(100-68.2)/2 \simeq 15.9$

In R we can integrate dnorm over the upper tail to get the result.

integrate(dnorm, mean=150, sd=20, lower= 170, upper= Inf, abs.tol = 0)$value

Out:

0.1586553

We get the same answer with:

1-pnorm(170, mean=150, sd=20)

Here dnorm is the PDF and pnorm is the CDF.
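The upper tail can also be requested directly through the `lower.tail` argument shown in the pnorm signature, or computed from the z-score of 170. A short sketch of both routes:

```r
# Upper tail P(X > 170) without the explicit "1 -"
upper <- pnorm(170, mean = 150, sd = 20, lower.tail = FALSE)

# Same probability via standardisation: 170 is one sd above the mean
z <- (170 - 150) / 20
upper_z <- pnorm(z, lower.tail = FALSE)

c(upper = upper, upper_z = upper_z)
```

Using `lower.tail = FALSE` tends to be more accurate than `1 - pnorm(...)` far out in the tail, where the subtraction loses precision.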

Example: Random variable difference

Men have a mean height of 178 cm with a standard deviation of 8 cm. Women have a mean height of 170 cm with a standard deviation of 6 cm. Male and female heights are normally distributed. What is the probability that a randomly chosen woman is taller than a randomly chosen man?

$M \sim \mathcal N(178, 8), \ W \sim \mathcal N(170, 6)$

We are interested in the new random variable $D = M-W$, the difference in heights.

Assuming the two heights are independent, we calculate:

$\mu_D = \mu_M -\mu_W=8, \quad \sigma_D^2 = \sigma_M^2 +\sigma_W^2=100$

$\therefore \sigma_D = 10, \ D \sim \mathcal N(8,10)$

The probability that the woman is taller than the man is:

$\mathbb P(W \gt M) = \mathbb P(M-W < 0) = \mathbb P( D \lt 0)$

pnorm(0, mean=8, sd=10 )

Out:

0.211855398583397
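The result can be cross-checked by simulation. This is a sketch that assumes the two heights are drawn independently; the seed and the sample size are arbitrary.

```r
set.seed(42)                               # arbitrary seed for reproducibility
n <- 1e5                                   # number of simulated man/woman pairs

men   <- rnorm(n, mean = 178, sd = 8)
women <- rnorm(n, mean = 170, sd = 6)

estimate <- mean(women > men)              # share of pairs where the woman is taller
exact    <- pnorm(0, mean = 8, sd = 10)    # analytic value from the difference D

c(estimate = estimate, exact = exact)
```

The simulated share should land close to the analytic value, with a Monte Carlo error on the order of 0.001 for this sample size.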

Example: Combine two random variables

Summer drives to work and back. The amount of fuel he uses follows a normal distribution:

To work: $\quad \mu_{W}=10 \mathrm{~L} \quad \sigma_{W}=1.5 \mathrm{~L}$

To home: $\quad \mu_{H}=10 \mathrm{~L} \quad \sigma_{H}=2 \mathrm{~L}$

He has $25 \mathrm{~L}$ of fuel and intends to drive to work and back home. What is the probability that he runs out of fuel?

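To calculate this we identify two random variables, which we assume to be independent, and their sum $B$.

$W \sim \mathcal N(10, 1.5), \quad H \sim \mathcal N(10, 2)$

$\therefore B = W + H \sim \mathcal N(10+10, \sqrt{1.5^2+2^2}) = \mathcal N(20, 2.5)$

He runs out of fuel if the round trip needs more than 25 L, so we need $\mathbb P(B>25)$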

1- pnorm(25, mean=20, sd=2.5 )

Out:

0.0227501319481792
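The same number can be reproduced step by step, which makes the combination rule explicit. A sketch, assuming the two trips are independent so that their variances add:

```r
mu_total <- 10 + 10                  # means add
sd_total <- sqrt(1.5^2 + 2^2)        # variances add, then take the square root

z <- (25 - mu_total) / sd_total      # 25 L expressed in standard deviations above the mean
p_out <- pnorm(z, lower.tail = FALSE)

c(sd_total = sd_total, z = z, p_out = p_out)
```

The threshold of 25 L sits two standard deviations above the expected consumption, which is why the probability is the familiar upper 2-sigma tail.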

Example: Calculate the percentage within one $\sigma$ of the mean

Get the percentage in area $[\mu - \sigma, \mu + \sigma ]$

bef <- pnorm(-1, mean=0, sd=1 )
bef
aft <- 1 - pnorm(1, mean=0, sd=1 )
aft
# finally
1-(bef+aft)

Out:

0.158655253931457
0.158655253931457
0.682689492137086
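The same idea extends to any number of standard deviations. In this sketch `within_k_sigma` is a hypothetical helper name introduced only for illustration:

```r
# Hypothetical helper: probability mass inside [mu - k*sigma, mu + k*sigma].
# For a normal distribution this depends only on k, so the standard normal suffices.
within_k_sigma <- function(k) pnorm(k) - pnorm(-k)

coverage <- within_k_sigma(1:3)
round(coverage, 4)
```

Rounded to four decimals this gives 0.6827, 0.9545 and 0.9973, the usual 68-95-99.7 rule.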

Example: A random variable $X \sim \mathcal N(35,7)$. Find the following probabilities:

a) $\mathbb P(X<25)$
b) $\mathbb P(X>42)$
c) $\mathbb P(25<X<42)$

par(mfrow=c(1,2))
curve(dnorm(x,35,7), 10, 60, lwd=2, ylab="PDF", main="NORM(35,7)")
abline(h=0,col="green2"); abline(v = 25, col="red", lty="dashed")
curve(dnorm(x,35,7), 10, 60, lwd=2, ylab="PDF", main="NORM(35,7)")
abline(h=0,col="green2"); abline(v = 42, col="blue", lty="dashed")

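[Figure: two panels titled NORM(35,7) showing the PDF, with a dashed red line at x = 25 (left) and a dashed blue line at x = 42 (right)]

For case a) we can integrate dnorm over the region $x<25$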

integrate(dnorm, mean=35, sd=7, lower= -Inf, upper=25, abs.tol = 0)$value

Out:

0.07656373

Calling pnorm(25, mean=35, sd=7) gives exactly the same result.

b) Again we can integrate, this time over the region $x>42$

integrate(dnorm, mean=35, sd=7, lower=42, upper=Inf, abs.tol = 0)$value

Out:

0.1586553

We get the same result using the pnorm function:

1-pnorm(42, mean=35, sd=7)

par(mfrow=c(1,1))
curve(dnorm(x,35,7), 10, 60, lwd=2, ylab="PDF", main="NORM(35,7)")
abline(h=0,col="green2"); 
abline(v = c(25,42), col="darkgreen", lty="dashed")

[Figure: single panel titled NORM(35,7) showing the PDF, with dashed dark green lines at x = 25 and x = 42]

c) To solve for the region $25<x<42$ we may integrate dnorm again:

integrate(dnorm, mean=35, sd=7, lower=25, upper=42, abs.tol = 0)$value

Out:

0.764781

Or we may use:

1 - pnorm(25, mean=35, sd=7) -(1-pnorm(42, mean=35, sd=7))
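An interval probability is simply the difference of two CDF values, which gives a shorter form of the expression above. A sketch of two equivalent ways to write it:

```r
# P(25 < X < 42) = F(42) - F(25)
p_between <- pnorm(42, mean = 35, sd = 7) - pnorm(25, mean = 35, sd = 7)

# The same thing in one vectorised call: evaluate the CDF at both ends and take the difference
p_between_diff <- diff(pnorm(c(25, 42), mean = 35, sd = 7))

c(p_between = p_between, p_between_diff = p_between_diff)
```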

Example: Product of two normal PDFs

If we have two random variables $X \sim \mathcal N(\mu_1, \sigma_1)$ and $Y \sim \mathcal N(\mu_2, \sigma_2)$

Then the product of their two densities (not of the random variables themselves) is proportional to a normal density whose effective $\mu$ and $\sigma$ satisfy:

\[\left(\sigma_{1}^{2}+\sigma_{2}^{2}\right) \mu=\mu_{1} \sigma_{2}^{2}+\mu_{2} \sigma_{1}^{2}, \quad \frac{1}{\sigma^{2}}=\frac{1}{\sigma_{1}^{2}}+\frac{1}{\sigma_{2}^{2}}\]

This follows from the fact that Gaussian exponents are quadratic in $x$, so their sum is again a quadratic in $x$ plus a constant $C$:

\[\frac{1}{\sigma_{1}^{2}}\left(x-\mu_{1}\right)^{2}+\frac{1}{\sigma_{2}^{2}}\left(x-\mu_{2}\right)^{2}=\frac{1}{\sigma^{2}}(x-\mu)^{2}+C\]
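A quick numerical check of the two formulas above: if they hold, the pointwise product of the two densities divided by the density with the combined $\mu$ and $\sigma$ should be the same constant for every x. The parameter values below are arbitrary, chosen only for illustration.

```r
mu1 <- 0; s1 <- 1                      # illustrative parameters for the first density
mu2 <- 2; s2 <- 2                      # illustrative parameters for the second density

# Combined parameters from the two formulas
sigma <- sqrt(1 / (1 / s1^2 + 1 / s2^2))
mu    <- (mu1 * s2^2 + mu2 * s1^2) / (s1^2 + s2^2)

x <- seq(-3, 5, by = 0.5)
ratio <- dnorm(x, mu1, s1) * dnorm(x, mu2, s2) / dnorm(x, mu, sigma)

# The ratio should not vary with x, so the two ends of its range should coincide
range(ratio)
```

The constant ratio corresponds to the leftover term $C$ in the exponent: the product is a scaled normal density, not a density that integrates to 1.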

…
