8

I'm having some difficulty understanding Bayes' theorem with multiple events. I'm trying to put together a Bayesian network. I have four independent probabilities but I have found that A, B and C can affect the probability of D, so my question is: how do I write the formula for $p(D | B, C, A)$?

As an example, suppose I were given p(A) = 31.6%, p(B) = 71%, p(C) = 4.5%, and p(D) = 22%. How would I get $p(D|B,C,A)$?

Chill2Macht

1 Answer

7

We can look to Bayes' formula for inspiration. It can be derived by factoring the joint distribution in two ways:

$$P(A, B) = P(A|B) \,P(B) = P(B|A)\,P(A)$$ and rearranging gives $$P(B|A) = \frac{P(A|B) \,P(B)}{P(A)}$$
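As a quick sanity check of that rearrangement, here is a minimal Python sketch; the numbers are invented purely for illustration and are not the ones from the question:

```python
# Minimal sketch of the two-event rearrangement, with hypothetical numbers.
p_B = 0.30          # P(B), assumed for illustration
p_A_given_B = 0.80  # P(A | B), assumed for illustration
p_A = 0.50          # P(A), assumed for illustration

# Bayes' theorem: P(B | A) = P(A | B) * P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A
print(p_B_given_A)  # 0.48
```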

For the case of 4 variables, we have many more options. Below is one example of such a formula.

Example

We could write the joint distribution as:

\begin{align}P(A,B,C,D) =& P(A | B,C,D) \, P(B,C,D) \\ =& P(B | A,C,D) \, P(A,C,D) \\ =& P(C | A,B,D) \, P(A,B,D) \\ =& P(D | A,B,C) \, P(A,B,C) \end{align} To see where this comes from: in the first line, I'm treating $(B,C,D)$ as a single random variable, so it's just like writing $P(A,Y)=P(A|Y)\,P(Y)$ where $Y = (B,C,D)$ is a 3-dimensional random variable.

Therefore, one option is to rearrange the above to get

$$P(D|A,B,C) = \frac{ P(A | B,C,D) \, P(B,C,D) } {P(A,B,C)} \tag{1}$$
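One way to convince yourself that Eq (1) is a pure rearrangement, with no extra assumptions, is to check it numerically on an arbitrary joint table. The sketch below invents a random joint distribution over four binary events just for that check; any valid joint distribution would do.

```python
import itertools
import random

random.seed(0)

# Hypothetical joint distribution P(A, B, C, D) over four binary events.
outcomes = list(itertools.product([0, 1], repeat=4))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

def prob(condition):
    """Sum the joint probability over all outcomes satisfying `condition`."""
    return sum(p for (a, b, c, d), p in joint.items() if condition(a, b, c, d))

# Left-hand side: P(D=1 | A=1, B=1, C=1), computed directly from the joint.
lhs = prob(lambda a, b, c, d: (a, b, c, d) == (1, 1, 1, 1)) / \
      prob(lambda a, b, c, d: (a, b, c) == (1, 1, 1))

# Right-hand side of Eq (1): P(A | B,C,D) * P(B,C,D) / P(A,B,C).
p_A_given_BCD = prob(lambda a, b, c, d: (a, b, c, d) == (1, 1, 1, 1)) / \
                prob(lambda a, b, c, d: (b, c, d) == (1, 1, 1))
p_BCD = prob(lambda a, b, c, d: (b, c, d) == (1, 1, 1))
p_ABC = prob(lambda a, b, c, d: (a, b, c) == (1, 1, 1))
rhs = p_A_given_BCD * p_BCD / p_ABC

print(abs(lhs - rhs) < 1e-12)  # True: the two expressions agree
```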

If you're happy with this, you can stop here.

Example (cont'd)

In the above formula, you may notice that $P(B,C,D)$ and $P(A,B,C)$ can be broken down further if desired. As in the Example above, we could write \begin{align}P(B,C,D) =& P(B|C,D) \, P(C,D) \\ =& P(C|B,D) \, P(B,D) \\ =& P(D|B,C) \, P(B,C) \end{align} I'm going to arbitrarily choose the first line of the above equations, but notice that $P(C,D)$ can be broken down further to give $$P(B,C,D) = P(B|C,D) \, P(C|D) \, P(D)$$

In a similar fashion, we could choose to express $P(A,B,C)$ as $$P(A,B,C) = P(A|B,C) \,P(B|C) \,P(C)$$ (in the context of probability, the above equation is called the Chain Rule)

Substituting these back into Eq (1) yields

$$P(D|A,B,C) = \frac{P(A|B,C,D)\,P(B|C,D)\,P(C|D)\,P(D)} {P(A|B,C)\, P(B|C)\, P(C) }$$
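Reusing the hypothetical `joint` table, the `prob` helper and `lhs` from the sketch above, the fully expanded formula can be checked the same way; every factor below is evaluated at $A=1, B=1, C=1, D=1$:

```python
# Numerator factors: P(A|B,C,D), P(B|C,D), P(C|D), P(D)
p_A_given_BCD = prob(lambda a, b, c, d: (a, b, c, d) == (1, 1, 1, 1)) / \
                prob(lambda a, b, c, d: (b, c, d) == (1, 1, 1))
p_B_given_CD = prob(lambda a, b, c, d: (b, c, d) == (1, 1, 1)) / \
               prob(lambda a, b, c, d: (c, d) == (1, 1))
p_C_given_D = prob(lambda a, b, c, d: (c, d) == (1, 1)) / \
              prob(lambda a, b, c, d: d == 1)
p_D = prob(lambda a, b, c, d: d == 1)

# Denominator factors: P(A|B,C), P(B|C), P(C)
p_A_given_BC = prob(lambda a, b, c, d: (a, b, c) == (1, 1, 1)) / \
               prob(lambda a, b, c, d: (b, c) == (1, 1))
p_B_given_C = prob(lambda a, b, c, d: (b, c) == (1, 1)) / \
              prob(lambda a, b, c, d: c == 1)
p_C = prob(lambda a, b, c, d: c == 1)

rhs_expanded = (p_A_given_BCD * p_B_given_CD * p_C_given_D * p_D) / \
               (p_A_given_BC * p_B_given_C * p_C)
print(abs(lhs - rhs_expanded) < 1e-12)  # True: matches the direct computation
```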

Conclusion

I made a lot of arbitrary choices in how I expressed the joint distributions above, so there are many different (but equally valid) formulas you could end up with.

Note that, just as in the ordinary Bayes' theorem, where we need more than $P(A)$ and $P(B)$ to get $P(B|A)$, here we will also need more than just $P(A)$, $P(B)$, $P(C)$ and $P(D)$ to get $P(D|A,B,C)$.
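To make that last point concrete, here is a small sketch with a dependence structure I invented for illustration: two models that both reproduce the marginals from the question yet disagree on $P(D|A,B,C)$.

```python
p_A, p_B, p_C, p_D = 0.316, 0.71, 0.045, 0.22  # marginals from the question

# Model 1: A, B, C, D all independent, so P(D=1 | A=1, B=1, C=1) = P(D=1).
p_D_given_ABC_model1 = p_D

# Model 2: A, B, C independent, but D now depends on A. The shift `delta` is
# balanced so that the marginal P(D=1) is still exactly 0.22:
#   P(D=1 | A=1) = p_D + delta
#   P(D=1 | A=0) = p_D - delta * p_A / (1 - p_A)
delta = 0.1
p_D_given_ABC_model2 = p_D + delta  # conditioning on A=1 (and B=1, C=1)

marginal_D = p_A * (p_D + delta) + (1 - p_A) * (p_D - delta * p_A / (1 - p_A))
print(round(marginal_D, 10))                                  # 0.22 -- same marginal for D
print(p_D_given_ABC_model1, round(p_D_given_ABC_model2, 10))  # 0.22 0.32
```

Both models are consistent with $P(A)$, $P(B)$, $P(C)$, $P(D)$ alone, which is exactly why those four numbers cannot determine $P(D|A,B,C)$ by themselves.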

Garrett
  • This confuses events and random variables. The question being about the former, mentions of joint distributions or tuples of random variables should be replaced by intersections of events. For example, $(B,C,D)$ is simply irrelevant and the formula $$P(A,B,C,D)=P(A\mid B,C,D)P(B,C,D)$$ actually means $$P(A\cap B\cap C\cap D)=P(A\mid B\cap C\cap D)P(B\cap C\cap D).$$ On the other hand, the last sentence of the post is spot on. – Did Mar 26 '16 at 09:19