What is meant by probability density function

What is a Probability Density Function?

A Probability Density Function is a function used in probability theory to describe how likely a continuous random variable is to fall within a given range of values. The area under the curve over an interval is the probability that the variable falls in that interval, and the total area under the curve equals 1. The probability density function differs from a probability mass function, which is used to calculate the probabilities of discrete random variables.

How does a Probability Density Function work?

The Probability Density Function assigns probabilities to a continuous random variable over a range, or interval, rather than to individual values. For example, if temperature is recorded only in whole degrees, the probability that it is exactly 70 degrees can be read from a probability mass function, because the variable is then discrete. However, if temperature is treated as continuous and one wants the probability that it falls between 70 and 75 degrees, a probability density function is the right tool, because the interval contains uncountably many possible values. Since the Probability Density Function defines probabilities through areas over intervals, the probability of any single exact value is zero: a single point has no width, so there is no area above it.
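As a minimal sketch of the interval idea, suppose, purely for illustration, that the temperature follows a normal distribution with mean 72 degrees and standard deviation 5 degrees. These parameters are assumptions, not values from the text. The probability of landing between 70 and 75 degrees is the area under the density over that interval, computed here as a difference of cumulative probabilities using Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

# Hypothetical temperature model: mean 72 degrees, standard deviation 5 degrees.
temperature = NormalDist(mu=72.0, sigma=5.0)

def interval_probability(dist, low, high):
    """Area under the density between low and high: F(high) - F(low)."""
    return dist.cdf(high) - dist.cdf(low)

p_range = interval_probability(temperature, 70.0, 75.0)
p_point = interval_probability(temperature, 70.0, 70.0)  # zero-width interval

print(f"P(70 < T < 75) = {p_range:.4f}")
print(f"P(T = 70)      = {p_point:.4f}")
# The density at 70 is the height of the curve there, not a probability.
print(f"density at 70  = {temperature.pdf(70.0):.4f}")
```

The zero-width interval has no area, so its probability is zero even though the density at 70 is positive.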

Probability Density Functions and Machine Learning

A Probability Density Function is a tool used by machine learning algorithms and neural networks that model continuous random variables. For example, a model that looks at financial markets and attempts to guide investors may estimate the probability of the stock market rising 5-10%. To do so, it could integrate a Probability Density Function for the market's return over that range, which gives the total probability that the return falls between 5% and 10%.

A continuous random variable takes on an uncountably infinite number of possible values. For a discrete random variable \(X\) that takes on a finite or countably infinite number of possible values, we determined \(P(X=x)\) for all of the possible values of \(X\), and called it the probability mass function ("p.m.f."). For continuous random variables, as we shall soon see, the probability that \(X\) takes on any particular value \(x\) is 0. That is, finding \(P(X=x)\) for a continuous random variable \(X\) is not going to work. Instead, we'll need to find the probability that \(X\) falls in some interval \((a, b)\), that is, we'll need to find \(P(a<X<b)\). We'll do that using a probability density function ("p.d.f."). We'll first motivate a p.d.f. with an example, and then we'll formally define it.

Example 14-1

Even though a fast-food chain might advertise a hamburger as weighing a quarter-pound, you can well imagine that it is not exactly 0.25 pounds. One randomly selected hamburger might weigh 0.23 pounds while another might weigh 0.27 pounds. What is the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds? That is, if we let \(X\) denote the weight of a randomly selected quarter-pound hamburger in pounds, what is \(P(0.20<X<0.30)\)?

Solution

In reality, I'm not particularly interested in using this example just so that you'll know whether or not you've been ripped off the next time you order a hamburger! Instead, I'm interested in using the example to illustrate the idea behind a probability density function.

Now, you could imagine randomly selecting, let's say, 100 hamburgers advertised to weigh a quarter-pound. If you weighed the 100 hamburgers, and created a density histogram of the resulting weights, perhaps the histogram might look something like this:

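[Figure: density histogram of the hamburger weights; horizontal axis \(X\) centered near 0.25, vertical axis Density]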

In this case, the histogram illustrates that most of the sampled hamburgers do indeed weigh close to 0.25 pounds, but some are a bit more and some a bit less. Now, what if we decreased the length of the class interval on that density histogram? Then, the density histogram would look something like this:

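[Figure: density histogram with shorter class intervals; horizontal axis \(X\) centered near 0.25, vertical axis Density]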

Now, what if we pushed this further and decreased the intervals even more? You can imagine that the intervals would eventually get so small that we could represent the probability distribution of \(X\), not as a density histogram, but rather as a curve (by connecting the "dots" at the tops of the tiny tiny tiny rectangles) that, in this case, might look like this:

[Figure: smooth curve \(f(x)\) over the weights \(X\), centered near 0.25]

Such a curve is denoted \(f(x)\) and is called a (continuous) probability density function.

Now, you might recall that a density histogram is defined so that the area of each rectangle equals the relative frequency of the corresponding class, and the area of the entire histogram equals 1. That suggests then that finding the probability that a continuous random variable \(X\) falls in some interval of values involves finding the area under the curve \(f(x)\) sandwiched by the endpoints of the interval. In the case of this example, the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds is then this area:

[Figure: curve \(f(x)\) with the area between 0.20 and 0.30 marked; Area = Probability = \(P(0.20<X<0.30)\)]
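The histogram argument can be tried out on simulated data. The sketch below is only an illustration: the 100 weights are drawn from an assumed normal distribution (mean 0.25, standard deviation 0.02), and the helper `density_histogram` is a hypothetical name. It builds a density histogram, confirms that the rectangle areas add up to 1, and uses the rectangles between 0.20 and 0.30 to estimate \(P(0.20<X<0.30)\).

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical sample: 100 weights scattered around 0.25 lb (standard deviation 0.02).
weights = [random.gauss(0.25, 0.02) for _ in range(100)]

def density_histogram(data, low, high, bins):
    """Return (heights, width); each height is relative frequency / class width."""
    width = (high - low) / bins
    counts = [0] * bins
    for value in data:
        if low <= value < high:
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    n = sum(counts)  # relative frequencies are taken over the values inside [low, high)
    return [count / n / width for count in counts], width

LOW, HIGH, BINS = 0.15, 0.35, 20
heights, width = density_histogram(weights, LOW, HIGH, BINS)

# Area of every rectangle added together: always 1 for a density histogram.
total_area = sum(h * width for h in heights)

# Rectangles covering 0.20 to 0.30 estimate P(0.20 < X < 0.30).
first = round((0.20 - LOW) / width)
last = round((0.30 - LOW) / width)
area_20_30 = sum(h * width for h in heights[first:last])

print(f"total area         = {total_area:.4f}")
print(f"area, 0.20 to 0.30 = {area_20_30:.4f}")
```

Raising `BINS` (with a larger sample) mimics shrinking the class intervals until the tops of the rectangles trace out a curve.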

Now that we've motivated the idea behind a probability density function for a continuous random variable, let's now go and formally define it.

Probability Density Function ("p.d.f.")

The probability density function ("p.d.f.") of a continuous random variable \(X\) with support \(S\) is an integrable function \(f(x)\) satisfying the following:

  1. \(f(x)\) is positive everywhere in the support \(S\), that is, \(f(x)>0\), for all \(x\) in \(S\)

  2. The area under the curve \(f(x)\) in the support \(S\) is 1, that is:

    \(\int_S f(x)dx=1\)

  3. If \(f(x)\) is the p.d.f. of \(X\), then the probability that \(X\) belongs to \(A\), where \(A\) is some interval, is given by the integral of \(f(x)\) over that interval, that is:

    \(P(X \in A)=\int_A f(x)dx\)
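The three conditions can be checked numerically for a candidate function. The following is a rough sketch, not a rigorous test: it samples the function at midpoints of a fine grid, so it can miss problems between grid points. The density \(f(x)=2x\) on \(0<x<1\) and all function names are assumptions chosen for illustration.

```python
def midpoint_integral(f, a, b, steps=10_000):
    """Approximate the integral of f over (a, b) with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def looks_like_pdf(f, a, b, steps=10_000, tol=1e-6):
    """Numerically check conditions 1 and 2 on the support (a, b)."""
    h = (b - a) / steps
    positive = all(f(a + (i + 0.5) * h) > 0 for i in range(steps))
    total = midpoint_integral(f, a, b, steps)
    return positive and abs(total - 1.0) < tol

def probability(f, low, high, steps=10_000):
    """Condition 3: P(low < X < high) is the integral of f over (low, high)."""
    return midpoint_integral(f, low, high, steps)

def candidate(x):
    """Hypothetical density for illustration: f(x) = 2x on 0 < x < 1."""
    return 2.0 * x

def not_normalized(x):
    """Positive on (0, 1) but integrates to 1/2, so it fails condition 2."""
    return x

print(looks_like_pdf(candidate, 0.0, 1.0))
print(looks_like_pdf(not_normalized, 0.0, 1.0))
print(probability(candidate, 0.0, 0.5))
```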

As you can see, the definition for the p.d.f. of a continuous random variable differs from the definition for the p.m.f. of a discrete random variable by simply changing the summations that appeared in the discrete case to integrals in the continuous case. Let's test this definition out on an example.

Example 14-2

Let \(X\) be a continuous random variable whose probability density function is:

\(f(x)=3x^2, \qquad 0<x<1\)

First, note again that \(f(x)\ne P(X=x)\). For example, \(f(0.9)=3(0.9)^2=2.43\), which is clearly not a probability! In the continuous case, \(f(x)\) is instead the height of the curve at \(X=x\), so that the total area under the curve is 1. In the continuous case, it is areas under the curve that define the probabilities.

Now, let's first start by verifying that \(f(x)\) is a valid probability density function.

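Solution

The function \(f(x)=3x^2\) is positive for every \(x\) in \(0<x<1\), and the area under the curve over the support is 1:

\(\int^{1}_{0} 3x^2dx=\left[x^3\right]^{x=1}_{x=0}=1-0=1\)

So \(f(x)\) is a valid probability density function.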

What is the probability that \(X\) falls between \(\frac{1}{2}\) and 1? That is, what is \(P\left(\frac{1}{2}<X<1\right)\)?

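Solution

Integrate the p.d.f. over the interval:

\(\int^{1}_{1/2} 3x^2dx=\left[x^3\right]^{x=1}_{x=1/2}=1-\dfrac{1}{8}=\dfrac{7}{8}\)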

What is \(P\left(X=\frac{1}{2}\right)\)?

Solution

It is a straightforward integration to see that the probability is 0:

\(\int^{1/2}_{1/2} 3x^2dx=\left[x^3\right]^{x=1/2}_{x=1/2}=\dfrac{1}{8}-\dfrac{1}{8}=0\)

In fact, in general, if \(X\) is continuous, the probability that \(X\) takes on any specific value \(x\) is 0. That is, when \(X\) is continuous, \(P(X=x)=0\) for all \(x\) in the support.

An implication of the fact that \(P(X=x)=0\) for all \(x\) when \(X\) is continuous is that you can be careless about the endpoints of intervals when finding probabilities of continuous random variables. That is:

\(P(a\le X\le b)=P(a<X\le b)=P(a\le X<b)=P(a<X<b)\)

for any constants \(a\) and \(b\).
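The three questions of Example 14-2 can be answered exactly with a few lines of code. This is a sketch that relies on the antiderivative \(x^3\) of \(3x^2\) and uses Python's `fractions` module to avoid rounding; the function names `cdf` and `prob_between` are illustrative choices.

```python
from fractions import Fraction

def cdf(x):
    """Antiderivative of f(x) = 3x^2 on the support 0 < x < 1: F(x) = x^3."""
    x = min(max(Fraction(x), Fraction(0)), Fraction(1))  # clamp to the support
    return x ** 3

def prob_between(a, b):
    """P(a < X < b) = F(b) - F(a); the same value whether endpoints are included."""
    return cdf(b) - cdf(a)

half = Fraction(1, 2)
total = prob_between(0, 1)               # area over the whole support
upper_half = prob_between(half, 1)       # P(1/2 < X < 1)
single_point = prob_between(half, half)  # P(X = 1/2)

print(total, upper_half, single_point)
```

Because a single point contributes no area, `prob_between` gives the same answer whether or not the endpoints are counted.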

Example 14-3

Let \(X\) be a continuous random variable whose probability density function is:

\(f(x)=\dfrac{x^3}{4}\)

for an interval \(0<x<c\). What is the value of the constant \(c\) that makes \(f(x)\) a valid probability density function?

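Solution

The area under the curve over the support must equal 1:

\(\int^{c}_{0} \dfrac{x^3}{4}dx=\left[\dfrac{x^4}{16}\right]^{x=c}_{x=0}=\dfrac{c^4}{16}=1\)

so \(c^4=16\), and since \(c\) must be positive, \(c=2\).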
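The constant \(c\) can also be found numerically, as a sanity check on the requirement that the total area equal 1. The sketch below approximates the area under \(x^3/4\) on \((0, c)\) with the midpoint rule and uses bisection to find where that area reaches 1. The search bracket of 0 to 10 and the function names are assumptions made for this illustration.

```python
def area(c, steps=2_000):
    """Midpoint-rule approximation of the integral of x^3 / 4 over (0, c)."""
    h = c / steps
    return h * sum(((i + 0.5) * h) ** 3 / 4 for i in range(steps))

def solve_for_c(low=0.0, high=10.0, tol=1e-9):
    """Bisection: area(c) increases with c, so find where it crosses 1."""
    while high - low > tol:
        mid = (low + high) / 2
        if area(mid) < 1.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2

c = solve_for_c()
print(f"c is approximately {c:.4f}")
```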

What is meant by probability density function Mcq?

A probability density function describes how the total probability of a continuous random variable is distributed over its possible values. It is the derivative of the cumulative distribution function: \(f(x)=\dfrac{dF(x)}{dx}\).

What is probability density function give an example?

Say we have a continuous random variable whose probability density function is given by f(x) = (x + 2)/6, when 0 < x ≤ 2. The factor 1/6 is needed because x + 2 integrates to 6 over this range, and a density must integrate to 1. We want to find P(0.5 < X < 1). Then we integrate (x + 2)/6 within the limits 0.5 and 1. This gives us 1.375/6 ≈ 0.229.
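As a quick sketch of the arithmetic, the integrals of x + 2 can be evaluated exactly with an antiderivative. The code below computes the area under x + 2 over the whole range (0, 2] and over (0.5, 1), then divides one by the other so that the result is a proper probability between 0 and 1. The helper name `antiderivative` is an illustrative choice.

```python
from fractions import Fraction

def antiderivative(x):
    """G(x) = x^2 / 2 + 2x, an antiderivative of g(x) = x + 2."""
    x = Fraction(x)
    return x * x / 2 + 2 * x

total_area = antiderivative(2) - antiderivative(0)             # area over (0, 2]
raw_area = antiderivative(1) - antiderivative(Fraction(1, 2))  # area over (0.5, 1)
probability = raw_area / total_area                            # normalized to a density

print(f"area of x + 2 over (0, 2]   = {float(total_area)}")
print(f"area of x + 2 over (0.5, 1) = {float(raw_area)}")
print(f"P(0.5 < X < 1)              = {float(probability):.4f}")
```

The raw area over (0.5, 1) exceeds 1, which shows why x + 2 must be rescaled before it can serve as a density on this range.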

What is probability density function formula?

The probability density function (pdf) f(x) of a continuous random variable X is defined as the derivative of the cdf F(x): \(f(x)=\dfrac{d}{dx}F(x)\). It is sometimes useful to consider the cdf F(x) in terms of the pdf f(x): \(F(x)=\int_{-\infty}^{x} f(t)dt\).
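The derivative relationship between the cdf and the pdf can be checked numerically. This sketch uses the standard normal distribution from Python's `statistics` module as an assumed example; `numerical_derivative` is a hypothetical helper that applies a central difference to the cdf and compares the slope with the pdf at a few points.

```python
from statistics import NormalDist

def numerical_derivative(F, x, h=1e-5):
    """Central-difference approximation to dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

standard_normal = NormalDist()  # mean 0, standard deviation 1

for x in (-1.0, 0.0, 0.5, 2.0):
    slope = numerical_derivative(standard_normal.cdf, x)
    density = standard_normal.pdf(x)
    print(f"x = {x:5.2f}  slope of cdf = {slope:.6f}  pdf = {density:.6f}")
```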