I was doing some simulation and I needed a distribution for the difference between two proportions. It’s not quite as straightforward as the difference between two normally distributed variables and since there wasn’t much online on the subject I thought it might be useful to share.

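So we start with two independent binomial variables

$latex X \sim Bin(n_1, p_1) $ and $latex Y \sim Bin(n_2, p_2) $

We are looking for the probability mass function of $latex Z = X - Y $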

First note that the min and max of the support of Z must be $latex (-n_2, n_1) $ since that covers the most extreme cases ($latex X=0 $ and $latex Y=n_2 $) and ($latex X=n_1 $ and $latex Y=0 $).

Then we need a modification of the binomial pmf so that it can cope with values outside of its support.

$latex m(k, n, p) = \binom{n}{k} p^k (1-p)^{n-k} $ when $latex 0 \leq k \leq n $ and 0 otherwise.
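A minimal sketch of the modified pmf in R, written out explicitly so the "0 otherwise" part is visible. The function name `m` follows the notation used in this post; the test values are arbitrary. Base R's `dbinom` already returns 0 for integer values outside the support, so it is shown alongside for comparison.

```r
# Explicit version of the modified pmf m(k, n, p): the usual binomial
# formula inside the support 0..n, and 0 everywhere else.
m <- function(k, n, p) {
  ifelse(k >= 0 & k <= n, choose(n, k) * p^k * (1 - p)^(n - k), 0)
}

k <- -2:7                      # deliberately runs past both ends of 0..5
cbind(k = k, m = m(k, 5, 0.5), dbinom = dbinom(k, 5, 0.5))
```

The two columns should agree, which suggests `dbinom` can stand in for the modified pmf directly.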

Then we need to define two cases

1. $latex Z \geq 0 $

2. $latex Z < 0 $
In the first case
$latex p(z) = \sum_{i=0}^{n_1} m(i+z, n_1, p_1) m(i, n_2, p_2) $
since this covers all the ways in which X-Y could equal z. For example when z=1 this is reached when X=1 and Y=0, X=2 and Y=1, X=3 and Y=2 and so on. It also deals with cases that could not happen because of the values of $latex n_1 $ and $latex n_2 $. For example if $latex n_2 = 4 $ then we cannot get Z=1 as a combination of X=6 and Y=5. In this case, thanks to our modified binomial pmf, the probability is zero.
For the second case we just reverse the roles. For example if z=-1 then this is reached when X=0 and Y=1, X=1 and Y=2 etc.
$latex p(z) = \sum_{i=0}^{n_2} m(i, n_1, p_1) m(i-z, n_2, p_2) $
Put them together and that's your pmf.
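
As a quick sanity check on the two sums, take the smallest possible case, $latex n_1 = n_2 = 1 $ (an illustrative choice, not part of the derivation). The support is then -1, 0, 1, and any term where an argument falls outside its support drops out through the modified pmf:

$latex p(1) = m(1, 1, p_1) m(0, 1, p_2) + m(2, 1, p_1) m(1, 1, p_2) = p_1 (1 - p_2) $

$latex p(0) = m(0, 1, p_1) m(0, 1, p_2) + m(1, 1, p_1) m(1, 1, p_2) = (1 - p_1)(1 - p_2) + p_1 p_2 $

$latex p(-1) = m(0, 1, p_1) m(1, 1, p_2) + m(1, 1, p_1) m(2, 1, p_2) = (1 - p_1) p_2 $

The three probabilities add up to 1, as they should.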

Here’s the function in R and a simulation to check it’s right (and it does work.)
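
A minimal sketch of such a function and check, assuming base R's `dbinom` (which returns 0 outside the support) stands in for the modified pmf. The name `diff_binom_pmf`, the parameter values and the number of simulated draws are illustrative assumptions.

```r
# pmf of Z = X - Y for independent X ~ Bin(n1, p1) and Y ~ Bin(n2, p2).
# dbinom() returns 0 for integers outside 0..n, so it plays the role of m().
diff_binom_pmf <- function(z, n1, p1, n2, p2) {
  if (z >= 0) {
    i <- 0:n1                                   # Y = i, X = i + z
    sum(dbinom(i + z, n1, p1) * dbinom(i, n2, p2))
  } else {
    i <- 0:n2                                   # X = i, Y = i - z
    sum(dbinom(i, n1, p1) * dbinom(i - z, n2, p2))
  }
}

# Illustrative parameters (assumed, chosen only for the check)
set.seed(1)
n1 <- 10; p1 <- 0.3
n2 <- 8;  p2 <- 0.6

z_vals <- -n2:n1                                # full support of Z
exact  <- sapply(z_vals, diff_binom_pmf, n1 = n1, p1 = p1, n2 = n2, p2 = p2)

# Simulate X - Y directly and tabulate relative frequencies over the support
sims      <- rbinom(1e5, n1, p1) - rbinom(1e5, n2, p2)
empirical <- as.numeric(table(factor(sims, levels = z_vals))) / length(sims)

round(cbind(z = z_vals, exact = exact, empirical = empirical), 4)
sum(exact)                                      # should be 1 up to rounding
```

If the formula is right, the `exact` and `empirical` columns should agree to within simulation noise.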