What is the actual probability to have the washroom occupied during your flight? This should be relatively easy to answer. Suppose we have *N* people in a plane with *Nw* washrooms. We want to know *P(all washrooms are occupied)* and we will denote it with *P*:

We can safely multiply in the previous relation since the probability that a washroom is occupied is independent of other washrooms being occupied or not.

To calculate the probability that a washroom is occupied we need to know in average how many times a man uses the washroom and for how long.

We can infer the first value by knowing the normal volume of urination and the volume of the human bladder. Let’s call this *Np* and based on the previous links it can be anywhere from 3 to 6.

For the second parameter I couldn’t find any … informative links so let’s just call this *tw* and give it a reasonable value of 3 minutes.

The probability that a man will be in the washroom is:

Of course you can try your own calculations using your preferred values for *Np* and *tw*. Also I had friends arguing that my whole argument is not valid since they know people who stay more or less than my assumed values. Of course I’m aiming here for an “average” probability, and if you want to know exactly the value for your particular flight/home/work place, you can experimentally find the above probability per person and plug it in the equations below.

If we have *N* people in the same place and one washroom, the probability that the washroom is occupied will be:

since *P(nobody is in the washroom)* equals *P(person 1 is not in the washroom) and P(person 2 is not in the washroom) and … and P(person N is not in the washroom)* which are independent events and thus can be multiplied together.

Going back to equation *(1)* we can write now:

And to finish with a mental image here is a plot of the *P(all washrooms are occupied)* for different values of *N* and *Nw*. Considering that a reasonable plane capacity is around or above *N* = 200 you can see for yourself that there’s a good reason the washroom is almost always occupied.

In the modern justice systems there is also this concept called presumption of innocence. It is a very nice concept in theory but as usual in practice sometimes it’s not easy to follow it. Sometimes people just assume that if you’re on trial you must be guilty. Sometimes the trial involves powerful emotions an it’s easier to follow them then try to reason logically. As I said the jury is made of people, and people have their biases.

Let’s suppose the following scenario: there are N people in a jury, and they have to decide if the defendant is guilty or not. But the trial was not extremely conclusive (which is normal, because if the verdict would be obvious this post would have no purpose). So either the lawyers were really good (or really bad) or the evidence was not strong enough for any side, but the jury members have to basically give a guilty/not guilty verdict by chance. Of course each member of the jury will use its own rationalization to justify the answer that (s)he gave but I’m aiming for a probability problem here, not a sociology one.

In most cases a jury needs unanimity to give a verdict. So let’s add some more constraints to our problem. Let’s suppose that if the majority of the jury will vote one way (guilty or not guilty) they will convince the rest to vote the same way. Also if the guilty/not guilty votes are equal the jury will vote again (randomly) until we have majority in which case we apply the first rule.

So we can have multiple results of the jury vote depending on the number of members in the jury. If N is even there are 3 possible resulting scenarios:

- the defendant goes to jail if more than members of the jury vote that he’s guilty
- the defendant goes free if less than members of the jury vote that he’s guilty
- the jury will vote again if the votes are split equally. This scenario is here just for completeness since the jury will have to vote again

If N is odd we have only the first two. From now on I will consider *N = 12* for the purpose of getting some actual calculations done.

In this case each jury member will give the guilty/not guilty verdict with probability of 0.5. So the defendant has equal chances to end up in jail or free, which makes sense since the trial was inconclusive and there is no way to take a clear decision.

In this case each jury member will give the guilty/not guilty verdict with probability different from 0.5. An easy model for this is to imagine we have a big bowl with black and white balls, and the probability to extract a black ball (guilty) is equal to the bias of the jury. We then proceed and extract N times (with replacement) from that bowl.

The probabilities corresponding to the 3 possible scenarios described earlier are:

In this case each jury member will give the guilty/not guilty verdict with probability of 0.5 except one of them which is biased. An easy model for this is to imagine we have two big bowls with black and white balls, one with an equal number of black and white balls from which we extract (with replacement) *N-1* times and one from which we extract once where the probability to extract a black ball (guilty) is equal to the bias of the single jury member.

We can organize the possible outcomes in the following table (just to make it easy to see what terms we need to add):

1-p(bias) | p(bias) | |

C(11, 0)0.5^{11} |
free | free |

C(11, 1)0.5^{11} |
free | free |

C(11, 2)0.5^{11} |
free | free |

C(11, 3)0.5^{11} |
free | free |

C(11, 4)0.5^{11} |
free | free |

C(11, 5)0.5^{11} |
free | re-vote |

C(11, 6)0.5^{11} |
re-vote | guilty |

C(11, 7)0.5^{11} |
guilty | guilty |

C(11, 8)0.5^{11} |
guilty | guilty |

C(11, 9)0.5^{11} |
guilty | guilty |

C(11, 10)0.5^{11} |
guilty | guilty |

C(11, 11)0.5^{11} |
guilty | guilty |

The terms in the first column are the probabilities for possible outcomes of the (N-1) judge members who have no bias. The first row shows how the biased single member can vote. To get the probabilities of each possible scenario just add all the products as described in the table.

]]>The problem is the following: given that the 2 cards you have in your hand are of the same suit, what is the probability that you will get 4 cards of the same suit after the flop.

Since people learn from their mistakes, but smart people learn from other people’s mistakes, lets see how to calculate this probability. The way it works is that you know 2 cards so you have 50 left. Now, it can be argued that the other players have each 2 cards too, thus it will be less than 50 cards in the game. But from your point of view it really doesn’t matter if the other players have cards or how many players are there. Imagine the following scenarios:

- You have a pack of 52 cards. You give yourself 2 cards, then you draw some cards and just leave them on the table, then you draw 3 more and put them face up (the flop in Texas hold’em). Here it is obvious that you have 50 cards and you just draw 3 of them randomly
- You have a pack of 52 cards. You give 2 cards to you and a number of other players and then draw 3 more and place them face up. There is no difference in the way the cards are drawn compared to the previous scenario.

Of course if you would knew the cards of the other players you could calculate more accurately the probability that we’re looking for. But based only on the knowledge of your 2 cards you can use the first scenario regardless of the number of players at the table. It’s the same as if you have a dice. If you just know it’s just a dice you can say the probability to get the number *1* is 1/6. But if you would know in detail the mass distribution of the material of the dice, the initial position of the dice, and the initial force and moment you could give a more accurate probability to get number *1*. But this is already getting into the philosophy of probability so lets leave this for another post.

So we have 50 cards and we draw 3 of them. This means *C(50, 3)* is the total number of outcomes. But we already have 2 card of a particular suit and we want 2 more of the same suit. Since we are left with 11 cards of our suit of interest in the rest of the cards, and we want any 2 of them that means *C(11, 2)* favorable outcomes. And for the third card in the flop we can have anything else, so that means *50 – 2 = 48* outcomes. This gives us:

(Un)fortunately we had the right result, and it wasn’t what we got. The mistake was that for the third card we multiplied by the number of cards remaining in the deck. However there are cards of the same suit after we draw the 4 we were interested in. So what we calculated was the probability to get 4 **or 5** cards of the same suit after the flop. The requirement was to calculate the probability to get **only** 4 cards of the same suit.

So the correct answer if that for the third card we can have any card of another suit. That means 39 outcomes giving the right result of:

]]>
If all the above conditions are fulfilled, we can denote *p* as the probability of success and *q = (1 – p)* the probability of failure for one such event. For a sequence of *n* events we have the probability to be successful in *k* of them:

While this may seem a bit boring, we can get use the above relation for various probability estimates, as long as we apply it correctly, and the events are Bernoulli trials.

Let’s say you have to pass a test. There are 20 multiple choice questions, each one with 4 choices and one correct answer. What is the probability to pass if you do not study? That would be the probability to answer correctly to 12 questions from 20, with *p* being 0.25 (1 out of 4 possible choices):

A probability of 0.075% is really low, so you will probably fail without study. However this particular test also has other rules like: 3 questions are mandatory to answer correctly, or you fail regardless of the other answers. So let’s suppose you study somewhat so you’ll correctly answer those 3 mandatory questions. You probability to pass becomes:

which is 120 times larger that if you don’t study at all, but still low.

In the end the best idea to pass any test is to study for it, but in case you don’t feel like it at least you can estimate how much you should study based on how lucky you feel.

Let’s suppose we’re playing a game where you can win 1 dollar with probability *p = 0.5* and lose 1 dollar with *q = 1 – p*. The expected value of the game is:

so in the long run we would expect to neither gain, nor loose from repeatedly playing this game.

However let’s calculate the probability to win at least 4 dollars after 20 games. This is the probability to win 11 games (this means 11 vs. 9 so we gain 2 dollars) or the probability to win 12 games (this means 12 vs. 8 so 4 dollars). We can add the probabilities because the 2 events are mutually exclusive. According to what we discussed this means:

This looks like a good enough probability to win. For a run of 80 games we have:

So although the probability decreases with the number of runs it is still looks significant.

So how does this work? On one side we have the expected value of the game 0, so you expect to win 0 dollars on the long run but if we actually calculate the probability to win a couple of dollars in a series of repeated games we get big numbers. The answer is simple actually, there is a probability to lose at least 4 dollars and in this case it is equal with the probability to win, so you will probably get nothing in the end of a long series of games.

Always think of all the probabilities or you will miss important ones.

]]>
Example^{source}: A car is produced in 2 models: Model 1 and Model 2. Each model is produced in 2 designs: hatchback and wagon. Off all the cars produced, 30% are Model 1, and 20% are wagons. But 50% of Model 1 are wagons.

Let’s see this example in a Venn diagram.

I know Venn diagrams are usually made with circles, but I drew it with rectangles since it’s easier to represent the areas exactly. This diagram illustrates the basic rules of conditional probability:

and

The event A is said to be independent of B iff:

in other words, if the fact that B occurs (or not) has no effect of the probability of A. Now, are two mutually exclusive events (events that have no basic outcomes in common) also independent? Two mutually exclusive events are actually dependent of each other, since the occurrence of one of them means the second event cannot happen.

If events A and B are independent it follows that events A and B’ (the complement of B); A’ (the complement of A) and B; and A’ and B’ are independent also. A Venn diagram of 2 mutually exclusive events A and B has no intersection between them, while the diagram of 2 independent events A and B will have the following relation . This means – according to equation (1) – that the ratio of and is equal with the ratio of and and equal with the proportion of A in the whole sample space.

If we equal in equations (1) and (2) above, we get Bayes’ theorem:

Confusion of the inverse is the assumption that P(A|B) is the same as P(B|A). The classical example for this is the assumption that if you’re tested positive for a particular disease that means you have that disease. Let’s try to put some numbers on this example. Usually we know the following:

- P(ill) – the probability that a specific population has the disease. This is usually determined through statistics by recording the number of known cases of the disease. You can probably find this number from your favorite statistics agency.
- P(positive|not ill) – the false positive rate for the test (type I error). This is the probability that the test will be positive for a person that does not have the disease. This value is usually denoted by . Conversely P(negative|not ill) =
- P(negative | ill) – the false negative rate for the test (type II error). This is the probability that the test will be negative for a person that really has the disease. This value is usually denoted by . Conversely P(positive|ill) =

Let’s apply Bayes’ theorem now:

The people for which the test is positive can be split in ill or not ill. Since (being ill and testing positive) is mutually exclusive with (not being ill and testing positive) we can write:

and then by using equation (2) we get:

which is obviously different from P(positive|ill) = .

How different? Well for any reasonable good test we should expect and to be really small. So below you can see 2 graphs (3D because we have 2 independent variables and ) for 2 different values of P(ill).

While P(positive|ill) depends only on the parameters of the test, P(ill|positive) depends on the parameters of the test and also on P(ill). The difference between the two is larger when P(ill) is smaller, meaning if the disease is rare, it is less probable that one test will tell you that you are sick (unless it has amazingly low false positive and false negative rates).

Conditional probabilities are easy to guesstimate and difficult to calculate. And remember that usually in this case guesstimate is more of a wild guess based on your emotions at that particular moment then an estimate. When dealing with conditional probabilities it is better to think, calculate them, calculate them again and then look for an independent confirmation just to be sure that you’ve done it right.

]]>What are the probabilities of the different poker hands? First let me be specific in that I’m talking about the probability of getting one of the poker hands at the first draw from a full deck of cards. So we have 52 cards (no wild cards) and we draw 5, or we a number of players and we give 5 cards to each.

Both of the above situations are equivalent. There is no difference in the way you calculate the probability if you draw the first 5 cards or you draw the 2nd, 6th, 10th, 14th and 18th card from the deck.

Royal flush is ace, king, queen, jack, ten in the same suit. The probability is:

There is only 1 possible combination for each suit that can give you a royal flush, multiplied with the number of suits.

Straight flush is 5 cards in rank sequence and of the same suit. The probability is (including royal flush):

There are 10 ways of getting 5 cards from 14 in rank sequence (because we can have ace-high or an ace-low straight flush) for each suit, multiplied with the number of suits.

Four of a kind means 4 cards of one rank, and an unmatched card of another rank. The probability is:

There are 13 possible ways to get 4 cards of the same rank (that’s the number of ranks available) and there are (52 – 4) possible ways to get the remaining 5th card.

Full house means 3 matching cards of one rank, and 2 matching cards of another rank. The probability is:

There are possible ways to get 2 suits from the 13 available. This is because in this case it matters if for example we have 3 cards of clubs and 2 cards of spades or 3 cards of spades and 2 cards of clubs. There are and possible ways of getting 3 and respectively 2 cards from the 4 cards of the same rank.

Flush means 5 cards of the same suit, not in rank sequence, excluding the straight flush and royal flush. The probability is:

There are possible ways to get 5 cards from each suit and we have to subtract the 40 straight flushes (see above) from that number.

Straight means 5 cards in rank sequence but in more than one suit. The probability is:

There are 10 ways of getting 5 cards in rank sequence and combinations available for those 5 cards. As before we have to subtract the 40 straight flushes from that number.

Three of a kind means 3 cards of the same rank, plus 2 unmatched cards. The probability is:

OK, so there are possible ways of selecting 3 cards from 4 of the same rank, and we have to multiply this with the number of ranks. For the remaining 2 cards there are possible ways of selecting the rank so it would not match the rank of the first 3 cards, and combinations available for those 2 cards.

Two pair means two cards of the same rank, plus two cards of another rank (that match each other but not the first pair), plus one unmatched card. The probability is:

We have possible ways of selecting 2 different ranks from the 13 available, multiplied twice with which is the number of possible ways of selecting 3 cards from 4 of the same rank. For the remaining card we have only 11 suits available and 4 possible cards per suit.

One pair means two cards of the same rank, plus three other unmatched cards. The probability is:

We have possible ways of selecting 2 cards from 4 of the same rank multiplied with the number of ranks. For the remaining 3 cards we have possible ways of selecting a different rank and combinations available for those 3 cards.

For a final overview, the below pie chart shows all the probabilities discussed previously (all of them are in the chart, just that some of them are really small).

]]>

This problem is known as *gambler’s ruin* and was first proposed by Christiaan Huygens. Let’s plot the above result and see if we can draw some conclusions from this problem.

First, in the case of a perfectly fair game (*p = q = 0.5*, meaning no player has an advantage) the probability to win the total amount is dependent only on how much money you start with. In other words the one with the most money, has a bigger probability to win. Or if you prefer the corollary, if you want all the money in a game try to start the game with most of them in your pocket.

If the game is not perfectly fair it gets even worse. If *N = 50* dollars and you start with 20 dollars, the probability to win everything drops to below 0.1 if the probability of winning one game is just slightly against you (0.48). The probability to win everything decreases slowly after that (if the probability to win a game goes below 0.48) but that just shows that a small difference is all it takes, you don’t need to cheat to badly to get an effect.

So the lesson is simple. If you want to maximize the probability to win all the money in a series of games, try to start with (much) more money than your opponent(s) and try to have a bigger chance of winning the game than your opponent(s). This is the basic principle why professional gambling establishments cannot lose since they will make sure that both of the above conditions are met for them.

To find *P(N|i)* let’s draw a Markov chain. This is basically a graphical representation of the gaming process as a series of states (characterized by the amount of money you have won *i*). The future state is determined only by the present state and the probabilities *p* and *q*. The states 0 and N are special end states, meaning if you reach them the game has ended. Also in the 0 and N states we have the boundary conditions *P(N|0) = 0* (you can’t win the game if you have 0 dollars) and *P(N|N) = 1* (you have won the game if you have N dollars).

For any state *i* except 0 and N we have probability *p* to go to state *i + 1* (green transition) and probability *q* to go to state *i – 1* (red transition). So we can write the following:

We know that *P(N|0) = 0* and by writing equation (1) for each transition:

Adding all the above:

Writing the other boundary condition *P(N|N) = 1* for equation (2) we get:

Replacing (2) into (3) we get the relation we were looking for:

Of course this result is not valid for *ρ = 1*, but this case is easily demonstrated by replacing *ρ* into equation (2), where we get:

In this case the boundary condition becomes:

And the result for *ρ = 1 (p = q = 0.5)* is:

When nobody wanted to play with him anymore because he was winning, he changed his game a little: roll 2 dice 24 times and bet on getting a double 6. I this case his reasoning was: the chance of getting a double 6 for one roll (of 2 dice) is 1/36. So by throwing 24 time the probability would be 24/36 = 4/6 so he would still win (although it would take a longer time) but the game being different he could convince people to play again with him. To his total surprise he started to lose in this game. Since he could not explain this he asked one of his friends, Blaise Pascal to help him with the mathematics.

So let’s see the correct way to estimate the probabilities in this (simple) case.

First of all it is not correct to add probabilities in this case. If we have 2 events A and B then:

*P(A and B)* is 0 only if the events are *mutually exclusive*, that is there is no basic outcome that is common for A and B. In our case if event A is: get 6 on the first roll and event B is: get 6 on the second roll, there is a basic outcome: get 6 on the first roll and get 6 on the second roll, which is common to both A and B, so that means we cannot ignore P(A and B) in the formula above. This reasoning obviously applies to the remaining die rolls.

The correct way to calculate the probability is the following:

where

For 2 events:

**only** if the 2 events are *independent* that is A happening has no influence over B happening. In our case that means that getting 6 on the first roll has no influence over getting 6 on the second roll which is true.

So the correct probability for this game is:

Thus, Chevalier de Méré had almost 52% chance of winning (instead of his estimate of 66.6%) so we can say he was lucky in his probability estimation.

The expected value is a useful measure of the winnings/losses in the long run. Assuming that each participant is paying 1 dollar per game the expected value is:

As seen from the previous game, in this case the correct probability is:

and the expected value:

As indicated by practice (presumably) Chevalier de Méré would lose on this second game. If there’s a lesson we can learn from this is don’t estimate probability if you don’t know the rules, you will get it wrong even for simple 2 dice games.

]]>