Interpreting Small Sample Sizes--Bayesian Estimators

In heads-up sit and gos we often have to make the most of limited reads. One common situation is that the villain has opened his button X out of N times. Our HUD says his opening frequency is X/N, but with low N this number is of little to no value. This article describes a method of using population tendencies to get a more accurate estimate of a villain’s opening frequency, O, given X and N. With such a frequency we can put the villain on a range of hands, which leads to better decision making.

The statistical object of study when estimating a parameter is called an estimator. In this case we are trying to find the best estimator for a villain’s opening frequency given a small sample of villain opens. If all we know is that someone raised four out of five buttons, then we have data in the form of a set of outcomes, e.g. {1,1,0,1,1}, where 1 = raised, 0 = limped or folded. In this case X = 4 and N = 5. Data of this form are best represented as draws from a Bernoulli distribution whose parameter is the opening frequency. If the sample is all we know, the best estimator for that parameter (the maximum likelihood estimate) is X/N--exactly what our HUD gives.
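To get a feel for how noisy X/N is at small N, here is a minimal Python sketch. It is not part of the original method, and the 60% "true" opening frequency is a made-up value chosen only for illustration. It computes how often a villain who really opens 60% of buttons would show at least 4 raises in 5 hands.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k opens in n hands when each hand is an
    independent open with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical "true" opening frequency, for illustration only.
true_freq = 0.60

p_four_plus = prob_at_least(4, 5, true_freq)
print(f"P(at least 4 opens in 5 hands | true frequency 60%) = {p_four_plus:.3f}")
```

Under these assumptions the answer is roughly one time in three, so a HUD reading of 80% or 100% after five hands says very little on its own.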

The reason this is not an accurate estimate is that X and N are not the only things we know. We know that we are playing against a villain from a population of players about which we have specific information. In order to be good poker players we need to think about population tendencies--therefore we must use Bayesian methods. Bayesian statistics involve one crucial datum: a prior distribution. The prior distribution will allow population tendencies to affect our estimate and let us make a Bayes estimator of the villain’s opening frequency.

The first step is to construct the prior distribution from the population tendencies. The process of constructing a prior distribution can be very subjective; in this case I will focus on computational simplicity whilst allowing for intuition. Thus I use the most convenient form of prior distribution possible: the conjugate prior to the Bernoulli distribution of our sample. This is the beta distribution.

Two values--called hyperparameters--fully describe the beta distribution. The literature commonly notates them as alpha and beta; both are shape parameters. It is best to think of these numbers as pseudo-data that represent the population tendency: alpha is the number of opens, and beta is the number of limps and folds. We’ve seen the villain play N hands, and we can add in imaginary hands to represent the population tendency.

It is possible to compute alpha and beta in many ways. One could use a large sample of villains and put their opening frequencies into a calculator to get the best beta distribution fit for the data. This is an acceptable approach, although its applicability is limited as it can be difficult to find the correct filters in tracking software. Also, the prior distribution may suffer from skew due to varying sample sizes against individual villains.

Instead, I prefer an approach that allows more space for intuition. In order to compute our two hyperparameters, we use two more intuitive values: mean and standard deviation. We can easily calculate the mean from any poker tracking software: filter for representative blind levels, and divide the number of times you faced an open by the number of hands you played in the BB. I play hyperturbos and have a large database, so I filtered for 22-25bb and came up with 59% as my mean. Another way is to decide what range you believe represents the population tendency, and calculate its size.

Next, we estimate a standard deviation, which I use because it too is an intuitive quantity. Just consider the widest range the average villain opens, and compute its difference from the mean. I think it’s about 10%--I very rarely see random villains open 70% of hands, or less than 50%. To get a better idea, you can open PokerStove or ProPokerTools, determine what a 59% range is, and then decide how much wider a random villain can open; that percentage difference is your standard deviation. In general, the less certain you are of your mean, or the more spread you think there is in your population, the higher you should estimate your standard deviation to be.

Once you have these two values, we can do a bit of math to compute alpha and beta by using the formulas for mean and standard deviation of the beta distribution. I did the algebra for you and ended up with the following formulas where M = mean and S = standard deviation:

alpha = (M^2 - M^3) / ((M+1) * S^2)

beta = ((M-1)^2 * M) / ((M+1) * S^2)

This isn’t in-game math, so just plug it into Excel and you shouldn’t have any problems. With M = 0.59 and S = 0.10, I end up with alpha = A = 9 and beta = B = 6.2.
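For readers who prefer a script to a spreadsheet, here is a minimal Python sketch of this plug-in step. The function names are hypothetical. The first function applies the article's formulas exactly as written. The second is the standard method-of-moments solution for a beta distribution, added for comparison and not taken from the article. The two do not give the same numbers, so the sketch also reports the mean and standard deviation each pair of hyperparameters actually implies.

```python
def article_hyperparameters(mean, sd):
    """alpha and beta from the formulas as written in the article."""
    denom = (mean + 1) * sd**2
    alpha = (mean**2 - mean**3) / denom
    beta = ((mean - 1) ** 2 * mean) / denom
    return alpha, beta


def moment_matched_hyperparameters(mean, sd):
    """Standard method-of-moments fit (an addition, not the article's formula):
    solve mean = a/(a+b) and var = a*b / ((a+b)^2 * (a+b+1)) for a and b."""
    total = mean * (1 - mean) / sd**2 - 1  # a + b
    if total <= 0:
        raise ValueError("sd is too large for a beta distribution with this mean")
    return mean * total, (1 - mean) * total


def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    variance = alpha * beta / (total**2 * (total + 1))
    return alpha / total, variance**0.5


M, S = 0.59, 0.10  # the article's population mean and standard deviation

for label, fit in (("article", article_hyperparameters),
                   ("moment-matched", moment_matched_hyperparameters)):
    a, b = fit(M, S)
    implied_mean, implied_sd = beta_mean_sd(a, b)
    print(f"{label:>15}: alpha={a:.2f} beta={b:.2f} "
          f"implied mean={implied_mean:.3f} implied sd={implied_sd:.3f}")
```

With M = 0.59 and S = 0.10, the article's formulas reproduce A of about 9 and B of about 6.2, which corresponds to a prior with mean 59% and a standard deviation of about 12%. The moment-matched version gives about 13.7 and 9.5, which reproduces the 10% standard deviation exactly. The looser prior is the one used in the rest of the article.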

Finally we have our prior distribution--fully represented by the two numbers A and B--and we already have our sample (remember, villain opened X times out of N), so now we’re ready to compute our best estimate of the villain’s opening frequency O. Using our prior distribution ends up being simple--just pretend that before you played against your villain you observed him opening A times and limping or folding B times over a sample of A+B hands. Combine these pseudo-data with our new data X and N and treat them all as one sample from the same Bernoulli distribution. This means that the formula for estimating O ends up as:

O = (A+X)/(A+B+N)
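As a quick illustration, the formula can be sketched in a few lines of Python. The function name and the input checks are my own additions; the defaults A = 9 and B = 6.2 are the values computed above and only make sense for that specific population.

```python
def bayes_open_estimate(opens, hands, alpha=9.0, beta=6.2):
    """Estimate O = (A + X) / (A + B + N).

    opens: X, the number of observed button opens
    hands: N, the number of observed button hands
    alpha, beta: pseudo-counts A and B from the prior
    """
    if hands < 0 or not 0 <= opens <= hands:
        raise ValueError("need 0 <= opens <= hands")
    return (alpha + opens) / (alpha + beta + hands)


# Compare the raw HUD number X/N with the estimate for a few small samples.
for opens, hands in ((4, 5), (5, 5), (2, 5), (8, 10), (16, 20)):
    raw = opens / hands
    estimate = bayes_open_estimate(opens, hands)
    print(f"X={opens:>2} N={hands:>2}  HUD={raw:.0%}  estimate={estimate:.1%}")
```

For the running example of 4 opens in 5 hands, the HUD shows 80% while the estimate is (9+4)/(15.2+5), about 64%. With no hands observed, the estimate is simply the population mean of about 59%.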

I used this formula with the A and B I calculated earlier to make charts for a few different N. Note, though, that these data are based on a very specific prior distribution: PokerStars hyper HUSNGs, mostly at the $100 level. The values should provide reference points to keep in mind when looking at your HUD early in a match. In Part 2 of this article I will use the chart to look for adaptations we can make in our pre-flop ranges, as well as look to apply this method to find a few more frequencies.

[Image: Coffeeyay HUSNG poker article Bayes table]

pocaja's picture

Interesting article. who is the author?
dinamozg's picture

Sounds like Mercennary??

dinamozg's picture

I have one question: in that formula for alpha and beta what is M2, M3 and S2?

dinamozg's picture

Ahh sorry I figured it out: those numbers stand for squared and cubed. Great article I like it very much!

coffeeyay's picture

I'm the author! :) And i'm not Mers, sorry. I'll be putting up more information about myself shortly, as well as making some videos. I am an active poster on 2p2 using the same SN and I play 100s HTs on Stars with the same SN.


Yes, the formatting of the formulas messed up; they should have had superscripts for the exponents. They should look like this:

alpha = (M^2 - M^3) / ((M+1)*S^2)

beta = ((M-1)^2*M) / ((M+1)*S^2)

with the ^ signifying exponentiation. Cheers

Quimp's picture

Assuming normal distribution, doesn't Mean +/- 1 Standard Deviation represent only ~68.2% of population? If you "very rarely see random villains open 70% of hands, or less than 50%", saying that nearly 1/3 villain won't be in that range seems off. For example if you think 5% of villains deviate from your hypothesised range, then you would use Mean +/- 1.96 stddev, and in your example I'd use S ~= (70%-50%)/4 = 5%.

Great article btw!

coffeeyay's picture

I think that there is a fairly good amount of spread in population tendencies despite being fairly centralized. I think 68.2% is a pretty big chunk of the population tbh, and so for me that 1/3 is fine to call as outliers. It makes sense to me that 80%+ raisers are very rare whereas 70%-80% happen but are uncommon, same with 40-50% and 30-40%. For instance, when you account for guys who limp a lot you'll find that there will be a fairly good chunk opening less than 50%, and when you account for regs and very aggro guys a good chunk will open more than 70%. I tried to use a fairly conservative assumption, but you're right that we can definitely use a smaller SD if we're more confident that most villains will be in that 50-70% bubble.

But if you think it's more accurate to say that 5% is a reasonable spread then feel free to plug and chug it with a different SD and see what it does to the results--I think you'll find that we'll need much higher values of N to significantly adjust our estimate away from the mean.

However, I'd be cautious because I think it's better to put more uncertainty into our estimates rather than less--especially since the beta distribution itself may not be the most accurate model of the population. I picked it for its convenience in calculating and not because it is the most accurate model--I think that a multimodal model, for example, could possibly be more accurate, as then we could take into account a few different standard play styles. One way to try to do this without losing the simplicity of the beta distribution is to do this process twice and make two separate estimates--one for regs and one for recreational players--and then be able to use a smaller SD for each. We would have to rely on some other process for deciding between a reg and a recreational player though (like SharkScope).

Quimp's picture

Awesome, thanks! I agree 95% is way too confident.

GoGhostman's picture

math scares the shit out of me =D

dompoma's picture

Love the article! Thanks for sharing this. I'm a bit confused on finding the standard deviation number for the population. That number has such a large effect on the Bayesian estimator, it seems crucial to make it accurate. How would you find the standard deviation for the population check-raise flop frequency if your data says the mean is 20%? Some opponents almost never check-raise and some check-raise 35%.

FCDplayer's picture

Great article, keep it going!

Nichlemn's picture

Good article, I've been looking for something like this for a while.

It is somewhat inaccessible, though. I would give a brief description and the tables to start with, then include a mathematical appendix.


quinn132's picture

My observation (albeit at smaller stakes) is that opening frequencies do not sit on a bell curve. Random villains are generally over-aggro or passive; I am not convinced that a population graph would peak at the mean. I intuitively believe that such a graph would be more of an M shape. I have nothing to back this up, and if you have any evidence to show otherwise I would love to see it.


swish's picture

Beta distribution?

I've been playing with the beta distribution a little for different means and standard deviations and noticed that it goes to 0 at the edges (at 0 and 1). I've seen players that at >20bb, for example, never raise (only limp or fold), or never 3bet, or open 100%. The beta distribution does not "allow" that kind of player. But with bigger standard deviations, at some point the distribution drastically changes, flips around, and has infinite value at one or both edges.

WesDone's picture

Great stuff, thanks. Going through Coffeeyay's math pack right now; he said he would publish a part 2 of this article. Is it out?

RyPac13's picture

Thanks! Part 2 ended up being in the video pack basically, so he didn't write anything up.

adam25185's picture

Factoring in pop tendencies is helpful for circumstances where the standard deviation is relatively low... i.e. villains are almost all opening similar ranges. To be fair, this is probably the case vs a preflop minraise in the bb, esp for hypers in the $100 division... where players are usually well drilled.

But I don't think this analysis can be applied to less frequently observed situations. When stacks are short, opening ranges are varied, particularly at the lower limits. And applying this analysis to later street situations, like c/r shove river, or donk turn, is going to be nearly impossible... because play in these spots varies so much from villain to villain.

I've been thinking about sample sizes in these types of situations recently. And I've noticed that many regs seem to want a large sample size of ~10 on a given statistic in order to adjust, relying on default strategy until we have accrued that data. That's a lot of hands!

It appears to me this is a big mistake... simple binomial distribution tells us that, neglecting pop tendencies entirely for a minute, when villain opens 5/5 buttons at 12-15bbs, there's only an 8% chance he's opening less than 70% of his range. We should probably shove over pretty damn wide!?