Cumulative Distribution Functions and MATLAB

My textbook for my current statistics class does a pitiful job of describing the cumulative distribution function (CDF). It simply states:

FX(x) = P{X<=x}

The book then jumps onward for four pages talking about different types of random variables. So what is the CDF then? What is its significance? The CDF of a random variable X is the probability that the value of X is less than or equal to the parameter ‘x’. The CDF is bounded: as ‘x’ goes to negative infinity the result approaches ‘0’ (zero), and as ‘x’ goes to positive infinity the result approaches ‘1’ (one). As the value of the parameter ‘x’ increases, the probability that the random variable X is less than or equal to ‘x’ never decreases; it either rises or stays flat.
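Written out in symbols, those properties look like the following. The names a and b are just two illustrative parameter values of my own, not something from the textbook. The last line is a standard consequence of the definition and is a big part of why the CDF is useful: the probability of landing in a range is a difference of two CDF values.

```latex
\lim_{x \to -\infty} F_X(x) = 0, \qquad \lim_{x \to +\infty} F_X(x) = 1

a \le b \;\implies\; F_X(a) \le F_X(b)

P\{a < X \le b\} = F_X(b) - F_X(a)
```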

As an example, let X be a random variable representing the price of a stock on the NASDAQ market. If you wanted to compute FX($0.01), which would be the probability that a stock on the NASDAQ market is currently priced at $0.01 or less, the result would in all likelihood be quite small. On the other hand, FX($500) would return a number very close to 1.0, depending on the state of the market (note: Google is at $625.77 as of yesterday). FX($15.00) would return a number between 0 and 1, probably somewhere closer to 0.5.

So what does that buy us? Well, the CDF is a model that can help one understand a random variable. We can come up with a set of parameters to pass into the CDF that will help us better understand how the values of the random variable are distributed across the population. With our NASDAQ example, we could pass in $1.00, $15.00, $25.00, $50.00, and $100.00 to get a feel for how the stock prices are spread across the price range.
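To make that concrete, here is a rough sketch in MATLAB. It does not use a real model of the market; it estimates the CDF straight from a sample (an empirical CDF) by counting the fraction of prices at or below each threshold. The prices are made-up numbers for illustration, not real NASDAQ data, and the variable names are my own.

```matlab
% Hypothetical stock prices in dollars (made-up numbers, not real NASDAQ data).
prices = [0.85 3.20 7.50 12.40 14.99 18.75 22.10 31.60 48.00 95.25 140.00 625.77];

% The thresholds from the text at which to evaluate the CDF.
thresholds = [1.00 15.00 25.00 50.00 100.00];

% Empirical CDF: the fraction of observed prices at or below each threshold.
F = arrayfun(@(t) mean(prices <= t), thresholds);

% Print one line per threshold.
for k = 1:numel(thresholds)
    fprintf('F(%7.2f) = %.3f\n', thresholds(k), F(k));
end
```

The values in F can only stay level or climb as the thresholds increase, and the jumps between neighboring thresholds show where the prices in the sample are concentrated.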

How does one compute the CDF then? Well, if we are not given the model for the CDF, then we need to know more about the type of random variable. Take a normal or Gaussian random variable X. There is a mathematical expression for this type of random variable, such as this nasty integral for the normal distribution function, and having not done any calculus since 2001, there is no way in hell I will deal with the integrals. If I only had a calculator and a statistics book, hopefully I would be able to use some estimations and a precalculated table in the back of the book to work out the result. Better yet, I’d be at my desk with access to MATLAB so that I can use the built-in function for the CDF named (wait for it) ‘cdf’. For a normal/Gaussian random variable, to compute the CDF, use:

cdf('Normal', x, mu, sigma)

The first parameter selects the type of random variable, the second parameter is the original CDF parameter ‘x’ from the mathematical expression at the beginning of this post, and the third and fourth parameters are the mean (mu) and standard deviation (sigma) of the normal/Gaussian distribution model. For other types of random variables, refer to MATLAB’s documentation on the type of distribution as well as the required model parameters.
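As a sanity check, the same normal CDF can be sketched using only the complementary error function erfc, which ships with base MATLAB. The values of mu, sigma, and x below are assumptions I picked for illustration. As far as I know, ‘cdf’ needs the Statistics and Machine Learning Toolbox, so that call is left as a comment for comparison.

```matlab
% Assumed example values, chosen only for illustration.
mu    = 15;               % mean of the normal distribution
sigma = 5;                % standard deviation of the normal distribution
x     = [5 10 15 20 25];  % points at which to evaluate the CDF

% Normal CDF expressed through the complementary error function:
%   F(x) = 0.5 * erfc( -(x - mu) / (sigma * sqrt(2)) )
F = 0.5 * erfc(-(x - mu) ./ (sigma * sqrt(2)));

% Show each x (top row) with its CDF value (bottom row).
disp([x; F]);

% With the Statistics and Machine Learning Toolbox installed, this call
% should give the same values:
%   F_toolbox = cdf('Normal', x, mu, sigma);
```

At x equal to mu the result is exactly 0.5, and one standard deviation above the mean it is about 0.841, which matches the table in the back of a statistics book.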

