What is a Random Variable?

A random variable is neither a variable nor random. It is a function that maps each outcome in a sample space to a real number.

We will assume that the sample space is finite. Thus, given a random variable F defined on a sample space S, the set of numbers n that F takes as values is finite as well.

The probability that F takes the value n, in symbols P(F = n), is defined as the probability of the set of outcomes that F maps to n:

P(F = n) = P({s ∈ S : F(s) = n})

When defining a probability distribution P for a random variable F, we often do not refer to its sample space S but directly assign a probability to the event that F takes a certain value.

Thus we directly assign a probability P(F = r) to the event that F has value r. This is just basic probability: the probability of any single value is between 0 and 1, that is, 0 ≤ P(F = r) ≤ 1, and the probabilities of all values of F sum to 1.
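To make the definitions above concrete, here is a minimal sketch in Python. The sample space (two fair dice), the uniform probabilities and the names `S`, `P_outcome`, `F` and `prob_F_equals` are assumptions chosen for illustration; they do not come from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: all ordered outcomes of rolling two fair dice.
S = list(product(range(1, 7), repeat=2))

# Assumed uniform probability on S (each outcome has probability 1/36).
P_outcome = {s: Fraction(1, len(S)) for s in S}

def F(s):
    """Random variable: maps an outcome (a, b) to the real number a + b."""
    return s[0] + s[1]

def prob_F_equals(n):
    """P(F = n): total probability of the outcomes s in S with F(s) = n."""
    return sum(P_outcome[s] for s in S if F(s) == n)

values = sorted({F(s) for s in S})           # the finite set of values F takes
dist = {n: prob_F_equals(n) for n in values}

print(dist[7])              # probability that the two dice sum to 7
print(sum(dist.values()))   # the probabilities of all values sum to 1
```

Exact fractions are used instead of floats so that the "sums to 1" check is not affected by rounding.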

Notation and rules

We write P(F_1 = r_1, F_2 = r_2) for the probability that F_1 = r_1 and F_2 = r_2, where “,” is used as “and”, and “and” denotes the intersection of the two events.

Conditional Probability

If P(F_2 = r_2) ≠ 0, then:

P(F_1 = r_1 | F_2 = r_2) = P(F_1 = r_1, F_2 = r_2) / P(F_2 = r_2)

The multiplication rule is also applicable to random variables:

P(F_1 = r_1, F_2 = r_2) = P(F_1 = r_1 | F_2 = r_2) P(F_2 = r_2)

We sometimes use symbols rather than numbers to represent the values of a random variable, for example P(Weather = sunny).
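As a rough illustration of conditional probability and the multiplication rule with symbolic values, consider the sketch below. The variables Weather and Traffic, their values and all the numbers are hypothetical, as are the helper names `p_traffic` and `p_weather_given_traffic`.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(Weather = w, Traffic = t); the numbers
# are made up for illustration and sum to 1.
joint = {
    ("sunny", "light"): Fraction(4, 10),
    ("sunny", "heavy"): Fraction(2, 10),
    ("rainy", "light"): Fraction(1, 10),
    ("rainy", "heavy"): Fraction(3, 10),
}

def p_traffic(t):
    """P(Traffic = t): add up the joint entries that have that traffic value."""
    return sum(p for (_, t2), p in joint.items() if t2 == t)

def p_weather_given_traffic(w, t):
    """P(Weather = w | Traffic = t); only defined when P(Traffic = t) != 0."""
    denom = p_traffic(t)
    if denom == 0:
        raise ValueError("conditioning event has probability 0")
    return joint[(w, t)] / denom

cond = p_weather_given_traffic("rainy", "heavy")
print(cond)

# Multiplication rule: P(w, t) = P(w | t) * P(t)
print(cond * p_traffic("heavy") == joint[("rainy", "heavy")])
```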

Probability Distribution

The probability distribution for a random variable gives the probabilities of all the possible values of the variable. Assuming the order of the values r_1, …, r_n is fixed, the distribution can be written as a vector:

P(F) = ⟨P(F = r_1), …, P(F = r_n)⟩

Joint Probability Distribution

Let F1,…,Fk be random variables. A joint probability distribution for them gives the probabilities P(F1=r1,…,Fk=rk) for all combinations of values r1,…,rk in a domain of interest.

Full Joint Probability Distribution

A full joint probability distribution is a joint probability distribution for all relevant random variables F1,…,Fk in a domain of interest.

Every probability question about a domain can be answered by the full joint probability distribution, because the probability of any event is a sum of the probabilities of the sample points in which the event holds.

Note: the tuples of values (n1,…,nk), one value per random variable, are often called data points or sample points.

A full joint probability distribution will only have information about a domain of interest. A non-full distribution could contain information about a domain you don’t care about.
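The idea that any event can be answered from the full joint distribution can be sketched as follows. The two Boolean variables Toothache and Cavity, the numbers, and the helper `prob` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

# Hypothetical full joint distribution over two Boolean variables, keyed by
# sample points (toothache, cavity); the numbers are illustrative only.
full_joint = {
    (True, True): Fraction(12, 100),
    (True, False): Fraction(8, 100),
    (False, True): Fraction(8, 100),
    (False, False): Fraction(72, 100),
}

def prob(event):
    """Probability of an event: sum over the sample points where it holds."""
    return sum(p for point, p in full_joint.items() if event(*point))

p_either = prob(lambda toothache, cavity: toothache or cavity)
p_both = prob(lambda toothache, cavity: toothache and cavity)
print(p_either, p_both)
```

An event is represented here as a predicate over sample points, so any question about the domain reduces to picking the matching rows and adding them up.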


Given a joint distribution P(F1,…,Fk), one can compute the unconditional or marginal probabilities of a random variable Fi by summing out the remaining variables.
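Summing out can be sketched in a few lines. The joint table over three Boolean variables and the function name `marginal` are hypothetical, and the numbers are made up so that they sum to 1.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution P(F1, F2, F3) over Boolean variables,
# keyed by value tuples (r1, r2, r3); the numbers are illustrative only.
joint = {
    (True, True, True): Fraction(1, 10),
    (True, True, False): Fraction(1, 10),
    (True, False, True): Fraction(3, 10),
    (True, False, False): Fraction(1, 10),
    (False, True, True): Fraction(1, 10),
    (False, True, False): Fraction(1, 10),
    (False, False, True): Fraction(1, 10),
    (False, False, False): Fraction(1, 10),
}

def marginal(joint, i):
    """P(F_i): sum the joint over all values of the other variables."""
    result = defaultdict(Fraction)   # missing keys start at Fraction(0)
    for values, p in joint.items():
        result[values[i]] += p
    return dict(result)

print(marginal(joint, 0))   # marginal distribution of the first variable
```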

Conditional / Posterior Distributions

We can also compute conditional / posterior distributions from the full joint distribution. We use the P notation for conditional distributions.

P(F | G) gives the conditional / posterior distribution of F given G, that is, the probabilities P(F = r | G = s) for all values r and s.

Using this notation, the general version of the multiplication / product rule is:

P(F, G) = P(F | G) P(G)
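The general product rule can be checked entry by entry on a small table. In the sketch below the joint distribution, the value names "a", "b", "x", "y" and the variable names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution P(F, G), keyed by (r, s); illustrative numbers.
joint = {
    ("a", "x"): Fraction(1, 8),
    ("a", "y"): Fraction(3, 8),
    ("b", "x"): Fraction(2, 8),
    ("b", "y"): Fraction(2, 8),
}

# P(G): sum out F.
p_g = {}
for (r, s), p in joint.items():
    p_g[s] = p_g.get(s, 0) + p

# P(F | G): one entry P(F = r | G = s) per pair (r, s) with P(G = s) != 0.
p_f_given_g = {(r, s): p / p_g[s] for (r, s), p in joint.items() if p_g[s] != 0}

# Product rule, checked entry by entry: P(F, G) = P(F | G) P(G).
rebuilt = {(r, s): p_f_given_g[(r, s)] * p_g[s] for (r, s) in p_f_given_g}
print(rebuilt == joint)
```

Note that for each fixed value s of G, the entries P(F = r | G = s) form a distribution over r and sum to 1.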

Probabilistic Inference

Probabilistic inference can be characterised as the computation of posterior probabilities for a query variable F given observed evidence E_1,…,E_n:

P(F | E_1,…,E_n) = P(F, E_1,…,E_n) / P(E_1,…,E_n)

The denominator can be viewed as a normalisation constant for the distribution P(F | E_1,…,E_n), ensuring that it adds up to 1.
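Inference by normalisation can be sketched as follows, assuming a single query variable (Cavity) and a single evidence variable (Toothache). The joint table, the numbers and the function name `posterior` are hypothetical.

```python
from fractions import Fraction

# Hypothetical full joint P(Cavity, Toothache), keyed by (cavity, toothache);
# the numbers are illustrative only.
joint = {
    (True, True): Fraction(12, 100),
    (True, False): Fraction(8, 100),
    (False, True): Fraction(8, 100),
    (False, False): Fraction(72, 100),
}

def posterior(joint, evidence):
    """P(Cavity | Toothache = evidence), by selecting rows and normalising."""
    # Unnormalised entries P(Cavity = r, Toothache = evidence).
    unnormalised = {r: p for (r, e), p in joint.items() if e == evidence}
    # Normalisation constant: P(Toothache = evidence), i.e. the denominator.
    z = sum(unnormalised.values())
    if z == 0:
        raise ValueError("evidence has probability 0")
    return {r: p / z for r, p in unnormalised.items()}

post = posterior(joint, True)
print(post)
```

Because the denominator does not depend on the value of the query variable, it never needs to be known in advance: it is simply the sum of the unnormalised entries.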