in the video you said that the preflop array will take several weeks to be generated. how did you solve this? did you have access to a large computer array?
also: for me the next step would be to rewrite the code for omaha. but it seems utterly impossible because of the size of the arrays, since there are 270k possible 4-card holdings (now square that for hand-vs-hand matchups). do you think it could be manageable somehow?
The version of the preflop equity array that I distributed came from just leaving it running for a few weeks. That's how I know how long it took :p.
Yea, everything's going to be a lot bigger/slower in omaha. However, you can gain a lot over the naive approach by taking advantage of some symmetries. In our equity calcs (both in generating the equity arrays and later when we use the equity arrays) we find or lookup the equity of every hand combo vs every other. I.e. we find the equity of AhKs vs Qd8d separately from the equity of AcKs vs Qd8d, etc. If you're smart about it, you can avoid a lot of extra work that way.
Similarly with ranges -- if you take the same simple approach to storing a range, it'll take tons of memory, since there are so many hand combos. Well, in holdem, preflop, we could have just stored one number for, e.g., AKo, instead of 12 separate numbers for AhKs, AhKd, etc. And then even postflop, a lot of combos are still equivalent or mostly equivalent. E.g., if the flop is rainbow, then all your AKo combos are the same (except maybe with different blockers to Villain's backdoor flushdraws). So you could probably get away with just using one set of frequencies to describe AKo's play on that flop instead of 12.
I think the potential savings from this sort of thing are quite a bit bigger in omaha than holdem, although taking advantage of them definitely makes things more complicated.
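The preflop version of this saving can be sketched concretely. The idea is suit isomorphism: relabel suits by order of first appearance, so that all strategically identical preflop combos share one key (all 12 AKo combos, for example). This is a standalone illustration; the names `canonical_hand`, `RANKS`, etc. are not from the video's code:

```python
from itertools import combinations

RANKS = '23456789TJQKA'
SUITS = 'cdhs'

def canonical_hand(hand):
    # Sort by rank (high card first) so AhKs and KsAh agree, then relabel
    # suits in order of first appearance: AhKs -> AwKx, AdKc -> AwKx.
    hand = sorted(hand, key=lambda c: RANKS.index(c[0]), reverse=True)
    relabel = {}
    out = []
    for rank, suit in hand:
        if suit not in relabel:
            relabel[suit] = 'wx'[len(relabel)]
        out.append(rank + relabel[suit])
    return ''.join(out)

deck = [r + s for r in RANKS for s in SUITS]
classes = {canonical_hand(h) for h in combinations(deck, 2)}
print(len(classes))  # 169 -- 1326 two-card combos collapse to 169 classes
```

One stored frequency per class instead of per combo is exactly the 12-to-1 saving for AKo; postflop (and in omaha) you would need a board-aware version of the same relabeling.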
I don't solve a wide variety of games on video (just another preflop-only game and then a turn and river situation), but the code lets you set up and solve arbitrary decision trees with arbitrary starting ranges. If you've not read the books or want more info, please lmk, and I can say more.
I am not sure. Let's say someone bets on the turn and I need to decide whether to call, raise, or fold.
Can it help me decide this? How long does it take to calculate, approximately? I've read your first book, so I understand the equilibration thing.
Also, how often can you expect to win a heads-up SNG using optimal strategy against the typical opponent (not very strong, let's say)? I was told that the best that can be done is 55-60%. Is that correct?
yes, you can build a model to approximate a solution. you have to keep in mind though that the more decisions players can potentially make, the more difficult it is to build a model. for example: PF shove/fold is solved. if players only have the choice to either shove/fold and call/fold, the solutions are simple to calculate and they will be precise. if it's a two-street model with plenty of chips behind, it can get complicated. you have to simplify the model to get a result out of it. i just did a two-street solution with 101 nodes in the decision tree, and it took the computer about 30 minutes to calculate the equilibrium (i didn't time it, it's just a wild guess actually). that's an ok time for such a complex simulation i think.
Regarding good winrates, it depends on the HUSNG format -- something like 53% would be very good at hypers. At regspeeds, maybe more like 58% would be possible.
I am going away for a month. I can de-register on my PC, then re-register on another computer while I am away to watch the videos. Then I can do the same on the way back. Is that correct?
Yes, that would work. You can deactivate / "de-register" a given key for the pack on your computer, activate / "re-register" on another computer while you are travelling, then deactivate on that computer before you depart to return home, and activate again on your home computer upon your return. There are a few things you should keep in mind if you choose to pursue this:
•There is only a limited number of deactivations available per key: 10. If you use up the available deactivations, you will be stuck with no way to deactivate. Therefore, our advice is that you exercise prudence in deciding when and where your deactivations are worth it.
•It is important that you don't forget to deactivate on the other computer before you leave for home. There is no way for us to deactivate a given key on a given computer remotely.
•Keep in mind that each key is good for two concurrent activations. If you have only activated your key on one computer so far, you don't need to deactivate on that computer in order to activate it on up to one other computer. This might save you a deactivation. You will still need to deactivate on the other computer when you leave it (if it is not a computer you plan to use again later).
I hope that answers your questions. Please let us know if you have any others.
I am really enjoying the videos but is there not a place to download an actual completed notebook?
I have had to start over twice now but have an issue getting the range class to print nicely. Spending days looking for a typo is not going to work for me.
Hey, we made a decision to provide a video series and not a finalized piece of software intentionally. Debugging is a part of every nontrivial software project -- it's part of the experience. If some of your code doesn't work while you're going through this video series, and you have to set out to fix it, please don't think of that as an unfortunate detour. On the contrary, it's an important and valuable part of the process. If you just copy down everything I write, and it works perfectly, you haven't gotten your money's worth :).
Also, simply comparing your code to that in the video is not the most efficient way to debug. I'm going to put some thoughts on debugging methodology here:
I've seen some mention of solving poker with linear programming, don't remember if it was here or somewhere else. Could you say a few things on how people are using linear programming to solve poker? I'm quite familiar with linear programming and discrete optimization, but I've never applied it to poker problems specifically. I don't think it's immediately obvious how it would improve the solvers -- I suppose I could see it working to bound the search space in a branch and bound algorithm. But intuitively, FP should be doing reasonably well, and I'd think that equity calculations quickly end up being a major bottleneck anyway. So how would we benefit from LP?
Hey, great video series. I'm halfway through and will give some more feedback later on.
Short question: is there any "convenient" way of getting the pokereval lib to run on a Mac? I read that somebody else managed it... but he didn't say how he did it.
The purpose of the pack is to code the program yourself; you'll learn a lot along the way. The pack is designed to make it quite doable, though -- many people have done it already and found the value in it (feel free to chime in about this, buyers).
First of all - THANK YOU SO MUCH, Will!! This is simply a masterpiece of work - and I enjoyed every single bit of it.
I have some programming background (I did a lot of programming in the old days, with all the "old" stuff, like Pascal, Java (beginning with 1.0), and C/C++). I'd never put my hands on Python though; I didn't even know anything about it, just the name, which I might've heard somewhere...
After Will had introduced the Range-class, I wrote a method that can read / interpret range strings like "22+, KTo+, AJs+, 9h8h, 87s" and transform a given range-class back to that short string representation as well.
After the 6th video (iirc), I'd already programmed a function that automatically generates decision trees based on given parameters, like street, pot/chip sizes, bet sizes, the possibility to call or not, and such, so I can pre-compute any possible game tree -- and then fine-tune it to get the special situation I'd like to work with (by removing nodes, copying complete subtrees, and such). The possibility of displaying the tree in different schemes was just the cream topping.
Additionally, I programmed a routine that creates EquityArrays on multiple CPUs, so I improved the creation time of new EquityArrays by a factor of 4+.
Generally I always tried to be at least one video ahead and avoid just getting spoonfed. That said, I solved the shove/fold game like two videos before Will gave the solution. I programmed the fictitious play routine entirely independently and only took a look at the video to see if there was something I might have been missing.
Will is just a great explainer; he explains things at just the right speed, doesn't bore you with basic stuff that you can work out from the videos on your own way better, and he shows you the absolute cornerstones of what he wants to get into our heads -- and as you can see, it worked great. Admittedly, my programming background probably made things way easier, but that should be clear. If you've never programmed before, you'll have a way harder path ahead of you, but still, the simplicity of IPython along with the "simplicity" of Will's explanations will get you on track pretty soon -- if you're serious about it.
The form in which the videos are created is just flabbergasting. Will makes it look as if he writes the programs from scratch -- which is obviously impossible; he has precoded/tested them and then re-written them in the videos -- but that is just a great way to get us along the path without getting lost in the wide universe of developing code from scratch, with all the basic preparation, dead ends, re-implementations, etc.
Finally, the price -- I promise you, it's a steal. Had I known what I was finally getting, I'd easily have paid like 5x as much. If you're on the fence about whether or not to buy this video pack -- do it. It's hard to imagine you'll be disappointed if you've made it this far.
Will, thanks again for the great work!
You've earned a big fan, and I'm so happy and excited about the literally unlimited possibilities that await me now! :-)
Any suggestions on how to optimize the number of iterations in the doFP function? I've been trying to figure out how to tell when the next iteration is not that different from the previous one, but I don't think that will work, as the changes are gradual from one iteration to the next, yet the 100th iteration is drastically different from the 50th.
Is there any way to know when it has gone as far as needed?
I'm currently working with "max nash distance". By that I mean I calculate the difference between the EV of the max. expl. strategy against opponent's strategy and the currently realized EV. Then I take the max. absolute difference over both players. If that passes a defined threshold, I stop the iteration.
Example (pseudo code):
pot = potsize at root decPoint
curStratHero = current strategy of Hero
curEVHero = EV (at root decPoint) of curStratHero vs. curStratVillain
maxExplStratHero = max. expl. strat of Hero vs. curStratVillain
maxEVHero = EV (at root decPoint) of maxExplStratHero vs. curStratVillain
nashDistHero = maxEVHero - curEVHero
<repeat for Villain>
maxNashDist = max(nashDistHero, nashDistVillain)
if ((maxNashDist / pot) < 0.05):
    # Nash Distance sufficiently low (below 5% of starting pot)
    break
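For what it's worth, the stopping rule above is easy to make runnable. This is a hedged sketch with made-up numbers; in the solver, the two EVs per player would come from the strats object at the root decision point, and the function names here are illustrative:

```python
def nash_distance(max_expl_ev, current_ev):
    # How much EV a player forgoes by playing the current strategy instead
    # of the max-exploitative response to the opponent's current strategy.
    return max_expl_ev - current_ev

def converged(pot, hero_evs, villain_evs, threshold=0.05):
    # hero_evs / villain_evs are (maxExplEV, currentEV) pairs at the root.
    max_nash_dist = max(nash_distance(*hero_evs), nash_distance(*villain_evs))
    return (max_nash_dist / pot) < threshold

# Made-up example: pot of 3bb; Hero forgoes 0.10bb, Villain forgoes 0.05bb.
# The max distance is ~3.3% of the pot, under the 5% threshold, so we stop.
print(converged(3.0, (10.10, 10.00), (5.05, 5.00)))  # True
```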
Is curStratHero the maxExplStratHero from the previous iteration? If so, this is what I was trying until I realized it doesn't do much. There is not much difference between iterations, and the EVs keep changing from one iteration to the next. It takes many iterations for a big change to occur. For example, let's say EV = 50.55 in the current iteration. Next iteration it is 50.50, so by your criteria we are done. But how do you know that in 100 more iterations, EV won't be 45.50? How do we know 0.05 is a good threshold?
maxNashDist doesn't seem to be converging to a small number for me. If it is, then it is taking many 1000s of iterations, and I was hoping for a much smaller number than that.
OK, if I look at the root decision point only, I do see some good convergence. After 100 iterations, the EVs differ only by .001. But other nodes are not converging even after 500 iterations. Looking at the dump at 100 iterations and the dump at 200, there is still quite a bit of difference between the dumps. So I am still not clear on what a good threshold is.
If I look at the dumps, it seems some hands are just slowly converging to 0% of the range, but I can't know for sure until I do 10000 iterations.
If a hand is going to be 20% of my range or 0% of my range, I would think that makes a big difference to my overall win rate. But doing 1000s of iterations is just not practical to find out.
Mmmh... not sure what you mean? If we define a "strategy" as the sum of all decisions at all nodes, why are you interested in any node other than the root DecPoint? Like, say we are on the turn, and you have a range X. You can bet, x/c, x/f, x/r, and depending on what you do, there'll be a river, where you have even more options. Your goal should be to maximize your overall EV at the point of decision -- namely the turn, at your very first decision. It does not matter whether you have a giant +EV spot later on if you buy it with massively -EV moves at other nodes (resulting in lower overall EV). Clear what I mean?
That way, I'd say, the only real information about how far we've got should be at the root DecPoint.
Is curStratHero the maxExplStratHero from the previous iteration?
No, you likely got me wrong. The maxExplStrat is the CURRENT max. exploitive strategy against Villain's currently chosen strategy (= ranges at all nodes). This is likely not the strategy we're actually playing, though; we've just merged a part of it into our current strategy.
Example: I have {AA, 32s} on 77766 river. You have {99}. I shove potsize. Nash strategy for both players would be {AA, 32s[75%]} for me and {99[50%]} for you.
Now suppose my current strategy (at iteration X) is to shove everything, and your current strategy is to call 60% (of your 99). In that case, our current EVs would be:
Well, if I am going to understand what you are saying, it is going to take me long study. But it is an interesting point you make that the root node is what counts.
Will, is there anything you can add to this discussion? Do you concur with Tackleberry's conclusions?
OK, I am only getting 8 iterations (that would be great, if it is correct)...
So I'm not sure if I am doing this right:
1. I take Hero = SB and Villain = BB.
2. I get SB's current EV from evs["SB"][0] of the strats object. Similarly for BB. (I sum over the EVs and average them.)
3. After the call to setMaxExplEVs("SB","BB"), I get SB's max EV from evs["SB"][0] of the strats object.
4. After the call to setMaxExplEVs("BB","SB"), I get BB's max EV from evs["BB"][0] of the strats object.
I recently purchased this pack -- awesome pack, by the way -- and have encountered the following error on video 7 (Range Class), trying to display the range:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
<ipython-input-24-ec23ba4cad64> in <module>()
1 bob = Range()
----> 2 bob._repr_svg_()
<ipython-input-23-be2423c192a7> in _repr_svg_(self)
138 for i in range(numRanks):
139 for j in range(numRanks):
--> 140 frac = self.getAmbigFrac(ranks[i], ranks[j], i>j)
141 hexcolor = '#%02x%02x%02x' % (255*(1-frac), 255, 255*(1-frac))
142 result += '<rect x="' + str(i*20) + '" y="' + str(j*20) + '" width="20" height="20" fill="' + hexcolor +'"><rect>'
<ipython-input-23-be2423c192a7> in getAmbigFrac(self, rank1, rank2, suited)
130 continue
131 nHand += 1 # short for nHand + 1
--> 132 nfrac += self.getFrac(pe.string2card([card1, card2]))
133 return nfrac / nHand
134
UnboundLocalError: local variable 'nfrac' referenced before assignment
If someone could help with this that would be great!
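The UnboundLocalError itself is generic Python rather than anything poker-specific: `nfrac += ...` needs nfrac to have been assigned first, so the likely culprit is that this version of getAmbigFrac never initializes its accumulators (nfrac, and presumably nHand) before the loops. A minimal stand-in reproducing the symptom and the fix -- these functions are illustrations, not the pack's actual code:

```python
def broken():
    # Mirrors the traceback: augmented assignment to a name that was never
    # bound raises UnboundLocalError on its first use.
    for x in range(3):
        nfrac += x
    return nfrac

def fixed():
    nfrac = 0  # initialize the accumulator before the loop
    for x in range(3):
        nfrac += x
    return nfrac

print(fixed())  # 3
```

So rather than rereading the whole class line by line, it's worth comparing your getAmbigFrac against the video's version right where its loops begin.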
<ipython-input-293-f7be4a953932> in setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain)
46 totalNumHandsInRange = 0
47 for iChild in tree.children[iDecPt]:
---> 48 comboCounts[iChild] = strats.ranges[iChild].getNumHandsWithoutConflicts([i,j])
49 totalNumHandsInRange += comboCounts[iChild]
50 strats.evs[hero][iDecPt][i][j] = 0
IndexError: list assignment index out of range
setMaxExplEvsAtVillainDP
def setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain):
    for iChild in tree.children[iDecPt]:
        setMaxExplEvsHelper(tree, iChild, strats, hero, villain)
    for i in range(0, numCards):
        for j in range(i+1, numCards):
            comboCounts = []
            totalNumHandsInRange = 0
            for iChild in tree.children[iDecPt]:
                comboCounts[iChild] = strats.ranges[iChild].getNumHandsWithoutConflicts([i,j])
                totalNumHandsInRange += comboCounts[iChild]
            strats.evs[hero][iDecPt][i][j] = 0
            for iChild in tree.children[iDecPt]:
                strats.evs[hero][iDecPt][i][j] += strats.evs[hero][iChild][i][j] * (comboCounts[iChild] / totalHandsInRange)
My code file: https://dropmefiles.com/mvXLP (the password to the file is the full name of the file in the folder (Solving poker), with the .npy extension)
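A note on the IndexError in the traceback above, since it too is generic Python: `comboCounts = []` creates an empty list, and a Python list cannot grow through index assignment, so `comboCounts[iChild] = ...` fails for every iChild. One possible repair is a dict keyed by child index; whether that matches the video's version should be double-checked:

```python
comboCounts = []
try:
    comboCounts[5] = 1.0   # IndexError: lists can't be extended by index
except IndexError:
    print("list assignment index out of range")

comboCounts = {}           # a dict accepts arbitrary new keys,
comboCounts[5] = 1.0       # so indexing by a tree's child index works
print(comboCounts[5])      # 1.0
```

Note also that the posted function defines totalNumHandsInRange but divides by totalHandsInRange on its last line; once the IndexError is fixed, that mismatch would raise a NameError.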
I spent a lot of time trying to get the syntax right in my first work-through of this video series. I did not know much about Python before. After some time I got completely stuck around video 5 or 6.
Then I discovered this: https://www.coursera.org/course/pythonlearn, which is a completely free basic introduction to Python. Since I started that course, I have had much less trouble finding my typos etc. in the code.
Those solutions were found with a C++ program I wrote. It uses essentially the same algorithm described in these videos but has some more optimizations and features. The code isn't available.
thanks, already taken care of
Great, glad you found the issue :)
hi there.
i just tried to add 2 functions to the range class, with the goal of being able to remove certain hands from a range.
first, i wanted to be able to enter the hands as a list of strings (i.e. "AKo, 77, 58s", etc.)
so, to convert the list of strings to a cardslist containing numbers, i added a modified version of the setRangeString function inside the range class, which looks like this:
# inputs: rangeString - string containing comma-separated terms of the form XX, XY, XYs, XYo, XaYb
# outputs: range as a list of numbers
def convRangeToCardslist(self, rangeString):
    result = []  # <--- added
    handStrs = rangeString.replace(' ','').split(',')
    for hand in handStrs:
        if len(hand) == 2:
            rank1 = hand[0]
            rank2 = hand[1]
            for i in suits:
                for j in suits:
                    if rank1 == rank2 and i == j:  # avoid stuff like 2c2c
                        continue
                    result.append(pe.string2card([rank1 + i, rank2 + j]))  # <--- I just changed these lines
        elif len(hand) == 3:
            rank1 = hand[0]
            rank2 = hand[1]
            if hand[2] == 's':  # suited hands
                for s in suits:
                    result.append(pe.string2card([rank1+s, rank2+s]))  # <----
            else:  # offsuit hands
                for i in range(numSuits):
                    for j in range(i+1, numSuits):
                        result.append(pe.string2card([rank1+suits[i], rank2+suits[j]]))  # <----
        elif len(hand) == 4:
            card1 = hand[0:2]
            card2 = hand[2:4]
            result.append(pe.string2card([card1, card2]))  # <----
        else:
            print("ERROR!")
        return result  # <--- added
and then i wrote an additional function inside the range class:
# inputs: rangeString - string containing comma-separated terms of the form XX, XY, XYs, XYo, XaYb
# outputs: n/a
# side-effects: removes hands from range that are given in rangeString
def removeSelectedHands(self, rangeString):
    selHands = self.convRangeToCardslist(rangeString)
    zeroHandsWithConflicts(self.r, selHands)
in which i am making use of the already existing function zeroHandsWithConflicts, which should set the cards contained in the variable selHands to 0 and therefore exclude them from the range.
the positive thing: i am not getting any errors.
but when i consecutively build a range, like:
bbStartingRange = Range()
bbStartingRange.setToTop(0.65, pe.string2card(['__', '__', '__', '__', '__']))
bbStartingRange.removeSelectedHands("KQ")
the KQ hands in this example won't be excluded from the range, which is displayed as if the function weren't there.
thanks for any help
In the above function, for the case XYo, I don't agree with:
else:  # offsuit hands
    for i in range(numSuits):
        for j in range(i+1, numSuits):
as for a hand like 78o, only half of all the combos end up in the range, as indicated by light green. 7h8c will be included but not 7c8h... instead we only want to avoid same-suit combos like 7h8h, so I changed it to:
else:  # offsuit hands
    for i in range(numSuits):
        for j in range(numSuits):
            if i != j:
                ... ...
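The commenter's point is easy to verify by enumeration: for two distinct ranks, XcYd and XdYc are different combos, so an offsuit hand has 12 combos, and the upper-triangle loop (j starting at i+1) only produces 6 of them. A quick standalone check:

```python
suits = 'cdhs'
numSuits = 4

# original loops: j > i only, so each unordered suit pair appears once
upper = [(suits[i], suits[j]) for i in range(numSuits)
                              for j in range(i + 1, numSuits)]

# commenter's fix: all ordered pairs of distinct suits
full = [(i, j) for i in suits for j in suits if i != j]

print(len(upper), len(full))  # 6 12 -- e.g. 7c8d and 7d8c are both needed
```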
Great video -- impressive to code and comment so clearly at the same time, with so few errors!
Thanks, glad you enjoyed the vids. Yea, just a reminder that a couple of issues (such as the one mentioned here) are corrected in the errata section at the top of this page.
Hey, a couple things --
First, in convRangeToCardslist, it looks like the return statement is indented such that it's inside the for loop. So, the body of the for loop will only execute once before the return. I don't think this is what you want.
Second, the zeroHandsWithConflicts function expects a list of cards as input, but you're giving it a list of lists of cards. Also, that function's purpose is to remove all hands that have any individual cards in common with the given cards, which is probably not what you want.
Anyhow, if I understand what you're trying to do correctly, I think the code we wrote already supports it. Our setRangeString(self, rangeString, value) function sets the fractions of all hands referenced by rangeString to value. If you want to remove the hands, use value=0. So you could say,
bbStartingRange.setRangeString("KQ",0)
and that should do it.
thanks, will. setting the value to 0 must have been too obvious for me :)
again: good video series. it's worth every penny.
:)
# hey will,
# just wanna give a quick review (it's very rare that i give reviews)
# today i finished all the material; it took me three days (with some background basics in programming)
# i really liked the series, although it gave me some headaches finding all the mistakes that i had made
# at the end you guys will have a pretty strong tool for finding optimal strategies in every (manageable) subtree
# a very nice side-effect is that you learn python (don't be afraid, everything will be described very well)
# will, could you give some notes on adding another player ('BU')? i think holdem-resources uses FP for 3+-handed trees as well. do you have some ideas for that task?
# p.s. your youtube-promo brought me here, so keep it up!
Hey Fridayy,
I'm glad you enjoyed the videos :). I've never written a 3+ player solver myself, since I'm really just interested in HU play, but I believe you're right about holdemresources. So, doing FP for 3 players seems like a very straightforward generalization of the 2-player case, at least conceptually. You're still just solving for the players' max expl strats in sequence, and you just add another player in the rotation. I don't see any real difficulties extending our approach to equity calcs, decision trees, etc. I'm not completely sure that FP is guaranteed to converge in the 3+ player case, but it probably is.
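The rotation idea can be sketched as a skeleton, hedged heavily: fictitious play with N players is the familiar 2-player loop with more seats in the sequence. The function names and the averaging scheme below are illustrative stand-ins, not the pack's actual code:

```python
def fictitious_play(players, max_expl_strategy, merge, n_iters=100):
    # players: list of seat ids; one current strategy per seat.
    strategies = {p: None for p in players}  # None = not yet initialized
    for t in range(1, n_iters + 1):
        for p in players:  # the rotation: works the same for 2 seats or 3+
            others = {q: s for q, s in strategies.items() if q != p}
            best = max_expl_strategy(p, others)  # best response to the field
            # Average the new best response into the running strategy.
            strategies[p] = (best if strategies[p] is None
                             else merge(strategies[p], best, 1.0 / t))
    return strategies

# Toy instantiation: "strategies" are just numbers and every best response
# is 1.0, so each seat's running average should settle at 1.0.
result = fictitious_play(
    ['SB', 'BB', 'BU'],
    max_expl_strategy=lambda p, others: 1.0,
    merge=lambda old, new, w: old + w * (new - old),
)
print(result)  # {'SB': 1.0, 'BB': 1.0, 'BU': 1.0}
```

In a real solver, a "strategy" would be the ranges at all of a player's decision points, and merge would be the usual weighted-averaging update; as noted, convergence in the 3+ player case isn't guaranteed in general.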
All that said, there are some significant theoretical issues with the idea of "GTO" play in 3+ player games. In heads up, playing an equilibrium strategy gets you a pretty strong guarantee -- non-negative expected profit (on average over both positions and neglecting rake). That's not true in 3+ player games -- you can play equilibrium strategies in all positions and still lose money. Also, in 3+ player games, there could be multiple equilibria with different values -- some could be better for you than others. This is also not true heads up.
Overall, it makes a lot less sense to try to play equilibrium strategies in the 3+ player case, imo, despite the fact that lots of people like to talk about it these days...
Will
hi will, again!
you are right, there are multiple equilibria in 3+ player games, and yes, some are better than others. i read some papers on that topic and started with a simplified poker variant named kuhn poker. the two player case worked pretty well and was no problem after watching your series. however, the three player case seemed harder; actually i had/have problems with equity calculations for the three player case and card removal effects. apart from this, the solution i got is pretty okay and close to the solution i found in the papers. now i want to solve real poker for three players. first i need the equity array, but this time for three hands. you said that even for two players this takes two months to calculate, so i think for three players it will take even longer. do you have any ideas to get over this? has anyone done this before?
regards, fri
You want to calculate 3-player equity arrays? Not something I've done before, but here are a few ideas:
- Python can be kind of slow for various reasons. For our EA generation algorithm, I'd imagine you'd get a speedup of about 10x by just writing essentially the same code in C/C++.
- Take advantage of some symmetries. We were really dumb in our approach (because it doesn't really matter) -- I want to say we calculated the equity of QsJs vs 8h7h separately from the equity of 8h7h vs QsJs. So that's a factor of 2 we could cut, and it's a factor of 6 in the 3-player case. And there's tons of other stuff like that. No need to calculate the equity of 8h7h and 8s7s differently vs anything on a board w/ no hearts or spades flushdraws possible.
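To make the factor-of-2 idea concrete, here is a minimal sketch, assuming equities are stored with ties split evenly so that eq(A vs B) + eq(B vs A) = 1. The hand list and equity function here are hypothetical placeholders, not the pack's actual API:

```python
# Fill a pairwise equity table while doing each matchup only once,
# assuming equities include half of the tie probability, so the
# mirror matchup comes for free as 1 - e.

def fill_equity_table(hands, equity_fn):
    """Compute pairwise equities, evaluating each matchup a single time."""
    n = len(hands)
    eq = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # only i < j -- half the work
            e = equity_fn(hands[i], hands[j])
            eq[i][j] = e
            eq[j][i] = 1.0 - e     # the mirror matchup for free
    return eq
```

In the 3-player case the same trick means evaluating each unordered triple once and filling in the 6 orderings from it.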
I believe PokerStove was recently open sourced, and I believe it did a lot of these sorts of things to get fast multiplayer range-vs-range equity calcs. Check it out --
https://github.com/andrewprock/pokerstove/tree/master/src/lib/pokerstove...
hi will
in the video you said that the preflop array will take several weeks to be generated. how did you solve this? did you have access to a large computer array?
also: for me the next step would be to rewrite the code for omaha. but it seems utterly impossible because of the size of the arrays, since there are 270k possible 4-card holdings (now square that for hand-vs-hand matchups). do you think it could be manageable somehow?
cheers
s.
Hey,
The version of the preflop equity array that I distributed came from just leaving it running for a few weeks. That's how I know how long it took :p.
Yea, everything's going to be a lot bigger/slower in omaha. However, you can gain a lot over the naive approach by taking advantage of some symmetries. In our equity calcs (both in generating the equity arrays and later when we use the equity arrays) we find or lookup the equity of every hand combo vs every other. I.e. we find the equity of AhKs vs Qd8d separately from the equity of AcKs vs Qd8d, etc. If you're smart about it, you can avoid a lot of extra work that way.
Similarly with ranges -- if you take the same simple approach to storing a range, it'll take tons of memory since there are so many hand combos. Well, in holdem, preflop, we could have just stored one number for, e.g., AKo, instead of 12 separate numbers for AhKs, AhKd, etc. And then even postflop, a lot of combos are still equivalent or mostly equivalent. E.g., if the flop is rainbow, then all your AKo combos are the same (except maybe with different blockers to Villain's backdoor flushdraws). So you could probably get away with just using one set of frequencies to describe AKo's play on that flop instead of 12.
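A sketch of the preflop bucketing idea above: map any specific two-card combo to its canonical class ("AKs", "AKo", or "AA"), so a range can store one frequency per class instead of one per combo. The rank-char + suit-char string representation is an assumption, not the pack's actual encoding:

```python
# Map a specific combo to its preflop canonical class. All 12 AKo
# combos land in the same bucket, so one stored frequency covers them.

RANKS = "23456789TJQKA"  # assumed low-to-high rank ordering

def canonical_class(card1, card2):
    """e.g. 'Ah','Ks' -> 'AKo'; 'Kd','Ad' -> 'AKs'; '7c','7h' -> '77'."""
    r1, s1 = card1[0], card1[1]
    r2, s2 = card2[0], card2[1]
    # put the higher rank first
    if RANKS.index(r1) < RANKS.index(r2):
        r1, s1, r2, s2 = r2, s2, r1, s1
    if r1 == r2:
        return r1 + r2  # pocket pair
    return r1 + r2 + ("s" if s1 == s2 else "o")
```

Postflop you would key the bucket on the board texture too, but the principle is the same.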
I think the potential savings from this sort of thing are quite a bit bigger in omaha than holdem, although taking advantage of them definitely makes things more complicated.
Good luck!
Will
Just want to get a clear picture of what is in the videos. I looked at the sample videos already.
What other situations are covered besides the shove/fold game?
What does the code do for you on flop and turn situations, for example?
I am a programmer already, so I'm not concerned about that aspect.
Just what the code does exactly in other situations besides the shove/fold game.
Thank you.
Hey,
I don't solve a wide variety of games on video (just another preflop-only game and then a turn and river situation), but the code lets you set up and solve arbitrary decision trees with arbitrary starting ranges. If you've not read the books or want more info, please lmk, and I can say more.
Cheers,
Will
I am not sure. Let's say someone bets on the turn and I need to decide whether to call, raise, or fold.
It can help me decide this? How long does it take to calculate, approximately? I've read your first book, so I understand the equilibration thing.
Also, how often can you expect to win a heads up SNG using optimal strategy against the typical opponent (not very strong, let's say)? I was told that the best that can be done is 55-60%. Is that correct?
yes, you can build a model to approximate a solution. you have to keep in mind, though, that the more decisions players can potentially make, the more difficult it is to build a model. for example: PF shove/fold is solved. if players only have the choice to either shove/fold and call/fold, the solutions are simple to calculate and they will be precise. if it's a two-street model with plenty of chips behind, it can get complicated. you have to simplify the model to get a result out of it. i just did a two-street solution with 101 nodes in the decision tree, and it took the computer about 30 minutes to calculate the equilibrium (i didn't time it, it's just a wild guess actually). that's an ok time for such a complex simulation, i think.
cheers
s.
Yup, I agree with all of this.
Regarding good winrates, it depends on the HUSNG format -- something like 53% would be very good at hypers. At regspeeds, maybe more like 58% would be possible.
Hi will. I tried to copy the question I asked in the 2+2 forum but got blocked by the spam filter. I'll post the link:
http://forumserver.twoplustwo.com/showpost.php?p=44328439&postcount=195
Regards
I am going away for a month. I can de-register on my PC, then re-register on another computer while I am away to watch the videos. Then I can do the same on the way back. Is that correct?
Hi genher,
Yes, that would work. You can deactivate / "de-register" a given key for the pack on your computer, activate / "re-register" on another computer while you are travelling, then deactivate on that computer before you depart to return home, and activate again on your home computer upon your return. There are a few things you should keep in mind if you choose to pursue this:
• There is only a limited number of deactivations available per key: 10. If you use up the available deactivations, you will be stuck with no way to deactivate. Therefore, our advice is that you exercise prudence in deciding when and where your deactivations are worth it.
• It is important that you don't forget to deactivate on the other computer before you leave for home. There is no way for us to deactivate a given key on a given computer remotely.
• Keep in mind that each key is good for two concurrent activations. If you have only activated your key on one computer so far, you don't need to deactivate on that computer in order to activate it on up to one other computer. This might save you a deactivation. You will still need to deactivate on the other computer when you leave that computer (if that other computer is not one you plan to use again later).
I hope that answers your questions. Please let us know if you have any others.
st
all good, thanks for the answer
Hi guys,
I am really enjoying the videos but is there not a place to download an actual completed notebook?
I have had to start over twice now but have an issue getting the range class to print nicely. Spending days looking for a typo is not going to work for me.
Hey, we made a decision to provide a video series and not a finalized piece of software intentionally. Debugging is a part of every nontrivial software project -- it's part of the experience. If some of your code doesn't work while you're going through this video series, and you have to set out to fix it, please don't think of that as an unfortunate detour. On the contrary, it's an important and valuable part of the process. If you just copy down everything I write, and it works perfectly, you haven't gotten your money's worth :).
Also, simply comparing your code to that in the video is not the most efficient way to debug. I'm going to put some thoughts on debugging methodology here:
http://willtipton.com/coding/2014/10/05/debugging.html
Please give that a shot, and if you're still stuck, use the instructions there about the most useful information to provide when asking for help.
Hi,
the quality of the sound in video 5 (eq array) is pretty bad; can you maybe do something about it?
I have a question too:
About getEquityVsHand() and getEquityVsHandFast(): I almost always get the same results from %timeit.
example:
%timeit getEquityVsHand(hand, villainHand, myEArray)
%timeit getEquityVsHandFast(hand, villainHand, myEArray)
results:
1000000 loops, best of 3: 912 ns per loop
1000000 loops, best of 3: 896 ns per loop
every time, both results are around ~1 microsec; sometimes the second function takes even longer than the first one, but it should do way better, right?
What I don't understand: in your video (at the end of the fifth), the first result runs only 100000 loops, not 1000000 like the second one.
thank you
(sorry for my english)
Edit: solved, my fault obv. :)
-e-
Hi,
I enjoyed this video series very much, thank you.
I've seen some mention of solving poker with linear programming, don't remember if it was here or somewhere else. Could you say a few things on how people are using linear programming to solve poker? I'm quite familiar with linear programming and discrete optimization, but I've never applied it to poker problems specifically. I don't think it's immediately obvious how it would improve the solvers -- I suppose I could see it working to bound the search space in a branch and bound algorithm. But intuitively, FP should be doing reasonably well, and I'd think that equity calculations quickly end up being a major bottleneck anyway. So how would we benefit from LP?
Thank you.
Ok, never mind. I found the answer through other channels.
Hey, great video series. I'm halfway through and will give some more feedback later on.
Short question: is there any "convenient" way of getting the pokereval lib to run on a Mac? I read that somebody else managed it, but he didn't say how he did it.
Thanks!
See http://forumserver.twoplustwo.com/showpost.php?p=44349749&postcount=205
...
Hi,
The purpose of the pack is to code the program yourself, you'll learn a lot along the way. The pack is designed to make it quite doable though, many people have done it already and found the value in it (feel free to chime in about this please buyers).
First of all - THANK YOU SO MUCH, Will!! This is simply a masterpiece of work - and I enjoyed every single bit of it.
I have some programming background (I was pretty busy programming in the old days, with all the "old" stuff like Pascal, Java (beginning with 1.0), and C/C++). I'd never put my hands on Python, though; I didn't know anything about it beyond the name, which I might've heard somewhere...
After Will had introduced the Range class, I wrote a method that can read/interpret range strings like "22+, KTo+, AJs+, 9h8h, 87s" and also transform a given Range object back to that short string representation.
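(Tackleberry didn't post his parser, but for anyone curious, a minimal sketch of expanding the "plus" notation into plain hand classes might look like the following. The interpretation of '+' -- pairs walk up to AA; non-pairs walk the kicker up to just below the first rank -- is an assumption, not his actual code.)

```python
# Expand one "plus" range term into explicit hand classes; terms without
# a '+' pass through unchanged.

RANKS = "23456789TJQKA"  # assumed low-to-high rank ordering

def expand_plus(term):
    """'22+' -> ['22', '33', ..., 'AA']; 'ATs+' -> ['ATs','AJs','AQs','AKs']."""
    if not term.endswith("+"):
        return [term]
    body = term[:-1]
    if body[0] == body[1]:  # pair, e.g. '22+': walk pairs up to AA
        start = RANKS.index(body[0])
        return [r + r for r in RANKS[start:]]
    # non-pair, e.g. 'ATs+': raise the kicker up to one below the top rank
    top, kicker, suffix = body[0], body[1], body[2:]
    return [top + RANKS[k] + suffix
            for k in range(RANKS.index(kicker), RANKS.index(top))]
```

Splitting a full range string on commas and mapping each term through a function like this gives the flat list of hand classes.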
After the 6th video (iirc), I'd already programmed a function that automatically generates decision trees based on given parameters like street, pot/chip sizes, bet sizes, whether calling is possible, and so on, so I can pre-compute any possible game tree and then fine-tune it to get the specific situation I'd like to work with (by removing nodes, copying complete subtrees, and such). The possibility of displaying the tree in different schemes was just the cream topping:
http://i.imgur.com/lDGAZuh.png
http://i.imgur.com/eXSJaSO.png
http://i.imgur.com/sjjJo0F.png
http://i.imgur.com/GIJPObQ.png
Additionally, I programmed a routine that creates EquityArrays on multiple CPUs, which improved the creation time of new EquityArrays by a factor of 4+.
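(He didn't share that code either, but the general shape of such a routine might be the sketch below. The per-matchup `equity` worker is a placeholder for a real evaluator call; `multiprocessing.dummy` uses threads and shares the same API as the process-based `multiprocessing.Pool`, which is what you'd swap in for the real multi-CPU speedup.)

```python
# Distribute hand-vs-hand matchups across a pool of workers.
from itertools import combinations
from multiprocessing.dummy import Pool  # thread pool; same API as process Pool

def equity(pair):
    """Placeholder worker: a real version would run the evaluator here."""
    h1, h2 = pair
    return (h1, h2, 0.5)  # dummy equity result

def build_equity_entries(hands, nworkers=4):
    pairs = list(combinations(hands, 2))  # each unordered matchup once
    with Pool(nworkers) as pool:
        return pool.map(equity, pairs)    # one entry per matchup, in order
```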
Generally I always tried to be at least one video ahead and avoid just getting spoonfed. That said, I solved the shove/fold game like two videos before Will gave the solution. I programmed the fictitious play routine entirely independently and only looked at the video to see if there was something I might have been missing.
Will is just a great explainer; he explains things at just the right speed, doesn't bore you with basic stuff that you can work out from the videos on your own way better, and shows you the absolute cornerstones of what he wants to get into our heads - and as you can see, it worked great. Admittedly, my programming background probably made things way easier, but that should be clear. If you've never programmed before, you'll have a much harder path ahead of you; but still, the simplicity of IPython along with the "simplicity" of Will's explanations will get you on track pretty soon - if you're serious about it.
The form in which the videos are created is just flabbergasting. Will makes it look as if he writes the programs from scratch - which is obviously impossible; he has pre-coded/tested them and then re-wrote them in the videos - but that is just a great way to get us along the path without getting lost in the wide universe of developing code from scratch, with all the basic preparation, dead ends, re-implementations, etc.
Finally, the price - I promise you, it's a steal. Had I known what I was finally getting, I'd easily have paid like 5x as much. If you're on the fence about whether or not to buy this video pack - do it. It's hard to imagine you'll be disappointed if you've made it this far.
Will, thanks again for the great work!
You've earned a big fan, and I'm so happy and excited about the literally unlimited possibilities that await me now! :-)
Yours, Tackleberry
Any suggestions on how to optimize the number of iterations in the doFP function? I've been trying to figure out how to tell when the next iteration is not that different from the previous one, but I don't think that will work, as the changes are gradual from one iteration to the next while the 100th iteration is drastically different from the 50th.
Is there any way to know when it has gone as far as needed?
I'm currently working with "max nash distance". By that I mean I calculate the difference between the EV of the max. expl. strategy against the opponent's strategy and the currently realized EV. Then I take the max absolute difference over both players. If that falls below a defined threshold, I stop the iteration.
Example (pseudocode):
pot = pot size at root decPoint
curStratHero = current strategy of Hero
curEVHero = EV (at root decPoint) of curStratHero vs. curStratVillain
maxExplStratHero = max. expl. strat of Hero vs. curStratVillain
maxEVHero = EV (at root decPoint) of maxExplStratHero vs. curStratVillain
nashDistHero = maxEVHero - curEVHero
<repeat for Villain>
maxNashDist = max(nashDistHero, nashDistVillain)
if ((maxNashDist / pot) < 0.05):
    # Nash distance sufficiently low (below 5% of starting pot)
    break
Is it clear what I mean?
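(The stopping criterion above can be written as a small runnable check. The EV numbers are whatever the solver reports at the root decision point, passed in as plain floats; the 5%-of-pot threshold is just Tackleberry's chosen cutoff, not a canonical value.)

```python
# Max nash distance: the most either player could gain by switching to
# their max-exploitative strategy against the other's current strategy.

def max_nash_dist(cur_ev_hero, max_ev_hero, cur_ev_villain, max_ev_villain):
    return max(max_ev_hero - cur_ev_hero, max_ev_villain - cur_ev_villain)

def converged(pot, cur_ev_hero, max_ev_hero, cur_ev_villain, max_ev_villain,
              threshold=0.05):
    """Stop once neither player can gain more than `threshold` of the pot."""
    dist = max_nash_dist(cur_ev_hero, max_ev_hero,
                         cur_ev_villain, max_ev_villain)
    return dist / pot < threshold
```

Inside doFP, you would evaluate this once per iteration (or every N iterations, since the max-exploitative EV computation isn't free) and break out of the loop when it returns True.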
Is curStratHero the maxExplStratHero from the previous iteration? If so, this is what I was trying until I realized it doesn't do much. There is not much difference between iterations, and the EVs keep changing from one iteration to the next. It takes many iterations for a big change to occur. For example, say EV = 50.55 in the current iteration. In the next iteration it is 50.50, so by your criteria we are done. But how do you know that in 100 more iterations the EV won't be 45.50? How do we know 0.05 is a good threshold?
maxNashDist doesn't seem to be converging to a small number for me. If it is, then it is taking many 1000s of iterations, and I was hoping for a much smaller number than that.
OK, if I look at the root decision point only, I do see some good convergence. After 100 iterations, the EVs differ only by .001. But other nodes are not converging even after 500 iterations. Looking at the dump at 100 iterations and the dump at 200, there is still quite a bit of difference between them. So I am still not clear on what a good threshold is.
If I look at the dumps, it seems some hands are just slowly converging to 0% of the range, but I can't know for sure until I do 10000 iterations.
If a hand is going to be 20% of my range or 0% of my range, I would think that is a big difference in my overall win rate. But doing 1000s of iterations is just not practical to find out.
Mmmh... not sure what you mean. If we define a "strategy" as the sum of all decisions at all nodes, why are you interested in any node other than the root DecPoint? Like, say we are on the turn and you have a range X. You can bet, x/c, x/f, x/r, and depending on what you do, there'll be a river where you have even more options. Your goal should be to maximize your overall EV at the point of decision - namely the turn, your very first decision. It does not matter whether you have a giant +EV spot later on if you buy it with massively -EV moves at other nodes (resulting in lower overall EV). Clear what I mean?
That way, I'd say, the only real information about how far we've got should come from the root DecPoint.
No, you likely got me wrong. The maxExplStrat is the CURRENT max exploitative strategy against Villain's currently chosen strategy (= ranges at all nodes). This is likely not the strategy we're playing, though; we just merged a part of it into our current strategy.
Example: I have {AA, 32s} on a 77766 river. You have {99}. I shove pot-size. The Nash strategy would be {AA, 32s[75%]} for me and {99[50%]} for you.
Now suppose my current strategy (at iteration X) is to shove everything, and your current strategy is to call 60% (of your 99). In that case, our current EVs are:
EV Hero: (0.4 * p) + (0.6 * (0.6 * 3p - p)) = 0.88p
EV Villain: 0.6 * (0.4 * 3p - p) = 0.12p
Those are our current EVs. Now, what's each player's max EV against the other's current strategy?
If you call 60%, I should never bluff, so I should actually only bet AA and fold all my 32s. That means my maxExplEV is:
maxExplEV Hero: 0.6*0.6*3p = 1.08p => My distance between my current strategy's EV (0.88p) and my maxExplEV = 1.08p - 0.88p = 0.2p.
What's your difference? I'm bluffing too much, so your maxExpl strategy is to call 100%:
maxExplEV Villain: 1.0*(0.4*3p - p) = 0.2p => Your distance between maxExpl EV and current EV is 0.2p - 0.12p = 0.08p.
Is it clear now?
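(For anyone who wants to verify the toy example, the arithmetic checks out when the pot p is set to 1; the values below just restate the poster's numbers.)

```python
# Recompute the toy example's EVs and nash distances with p = 1.
p = 1.0

ev_hero = 0.4 * p + 0.6 * (0.6 * 3 * p - p)          # villain folds 40% / calls 60%
ev_villain = 0.6 * (0.4 * 3 * p - p)                  # villain calls 60%, wins 40% of those
max_expl_ev_hero = 0.6 * 0.6 * 3 * p                  # bet AA only, never bluff
max_expl_ev_villain = 1.0 * (0.4 * 3 * p - p)         # call 100% vs. the over-bluffed range

nash_dist_hero = max_expl_ev_hero - ev_hero           # 0.20p
nash_dist_villain = max_expl_ev_villain - ev_villain  # 0.08p
```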
Well, if I am going to understand what you are saying, it is going to take me long study. But it is an interesting point you make that the root node is what counts.
Will, is there anything you can add to this discussion? Do you concur with Tackleberry's conclusions?
How do you obtain "curStratHero" and "curStratVillain" from Will's code? From the "strats" object (before the call to "updateRanges")?
How do you obtain "maxExplStratHero"... from the sbMaxEVStrat and bbMaxEVStrat objects?
OK, I think I understand your pseudocode now (I think) and will try it out. In the meantime, any additional comments are appreciated.
OK, I am only getting 8 iterations (that would be great, if it is correct)...
So I'm not sure if I am doing this right:
1. I take Hero = SB and Villain = BB.
2. I get SB's current EV from evs["SB"][0] of the strats object. Similarly for BB. (I sum over the EVs and average them.)
3. After the call to setMaxExplEVs("SB","BB"), I get SB's max EV from evs["SB"][0] of the strats object.
4. After the call to setMaxExplEVs("BB","SB"), I get BB's max EV from evs["BB"][0] of the strats object.
Is this correct?
Hi all,
I recently purchased this pack - awesome pack, by the way - and have encountered the following error in video 7 (Range class), while trying to display the range:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
<ipython-input-24-ec23ba4cad64> in <module>()
1 bob = Range()
----> 2 bob._repr_svg_()
<ipython-input-23-be2423c192a7> in _repr_svg_(self)
138 for i in range(numRanks):
139 for j in range(numRanks):
--> 140 frac = self.getAmbigFrac(ranks[i], ranks[j], i>j)
141 hexcolor = '#%02x%02x%02x' % (255*(1-frac), 255, 255*(1-frac))
142 result += '<rect x="' + str(i*20) + '" y="' + str(j*20) + '" width="20" height="20" fill="' + hexcolor +'"><rect>'
<ipython-input-23-be2423c192a7> in getAmbigFrac(self, rank1, rank2, suited)
130 continue
131 nHand += 1 # short for nHand + 1
--> 132 nfrac += self.getFrac(pe.string2card([card1, card2]))
133 return nfrac / nHand
134
UnboundLocalError: local variable 'nfrac' referenced before assignment
If someone could help with this that would be great!
Cheers,
Matt
Hi guys
Does anybody know how to solve this problem?
ERROR
IndexError Traceback (most recent call last)
<ipython-input-294-d1f3f692f068> in <module>()
----> 1 soln = doFP(minrShoveTree, 5)
2 soln.dump()
<ipython-input-291-c2dfadcc46bb> in doFP(tree, nIter, sbStartingRange, bbStartingRange)
6 print i
7
----> 8 setMaxExplEvs(tree, strats, "SB", "BB")
9 sbMaxEvStrat = getMaxEvStrat(tree, "SB", strats)
10 strats.updateRanges("SB", sbMaxEvStrat, i)
<ipython-input-293-f7be4a953932> in setMaxExplEvs(tree, strats, hero, villain)
1 def setMaxExplEvs(tree, strats, hero, villain):
----> 2 setMaxExplEvsHelper(tree, 0, strats, hero, villain)
3
4
5 def setMaxExplEvsHelper(tree, iDecPt, strats, hero, villain):
<ipython-input-293-f7be4a953932> in setMaxExplEvsHelper(tree, iDecPt, strats, hero, villain)
8 setMaxExplEvsAtLeaf(tree, iDecPt, strats, hero, villain)
9 elif (currDecPt.player == hero):
---> 10 setMaxExplEvsAtHeroDP(tree, iDecPt, strats, hero, villain)
11 elif (currDecPt.player == villain):
12 setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain)
<ipython-input-293-f7be4a953932> in setMaxExplEvsAtHeroDP(tree, iDecPt, strats, hero, villain)
35 strats.evs[hero][iDecPt] = numpy.zeros_like(strats.evs[hero][iDecPt])
36 for iChild in tree.children[iDecPt]:
---> 37 setMaxExplEvsHelper(tree, iChild, strats, hero, villain)
38 strats.evs[hero][iDecPt] = numpy.maximum(strats.evs[hero][iDecPt], strats.evs[hero][iChild])
39
<ipython-input-293-f7be4a953932> in setMaxExplEvsHelper(tree, iDecPt, strats, hero, villain)
10 setMaxExplEvsAtHeroDP(tree, iDecPt, strats, hero, villain)
11 elif (currDecPt.player == villain):
---> 12 setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain)
13 else:
14 setMaxExplEvsAtNatureDP(tree, iDecPt, strats, hero, villain)
<ipython-input-293-f7be4a953932> in setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain)
46 totalNumHandsInRange = 0
47 for iChild in tree.children[iDecPt]:
---> 48 comboCounts[iChild] = strats.ranges[iChild].getNumHandsWithoutConflicts([i,j])
49 totalNumHandsInRange += comboCounts[iChild]
50 strats.evs[hero][iDecPt][i][j] = 0
IndexError: list assignment index out of range
setMaxExplEvsAtVillainDP:
def setMaxExplEvsAtVillainDP(tree, iDecPt, strats, hero, villain):
    for iChild in tree.children[iDecPt]:
        setMaxExplEvsHelper(tree, iChild, strats, hero, villain)
    for i in range(0, numCards):
        for j in range(i+1, numCards):
            comboCounts = []
            totalNumHandsInRange = 0
            for iChild in tree.children[iDecPt]:
                comboCounts[iChild] = strats.ranges[iChild].getNumHandsWithoutConflicts([i,j])
                totalNumHandsInRange += comboCounts[iChild]
            strats.evs[hero][iDecPt][i][j] = 0
            for iChild in tree.children[iDecPt]:
                strats.evs[hero][iDecPt][i][j] += strats.evs[hero][iChild][i][j] * (comboCounts[iChild] / totalHandsInRange)
My code file: https://dropmefiles.com/mvXLP (the password to the file is the full name of the file in the "Solving poker" folder, with the .npy extension)
Hey, looks like you have a typo. The accumulator variable nfrac should be nFrac -- capitalization matters!
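(To see why that typo surfaces as an UnboundLocalError rather than something more obvious: the accumulator is initialized under one name but incremented under another, and the augmented assignment makes the misspelled name a local that is read before it is ever bound. The function below is a stripped-down stand-in, not the actual getAmbigFrac.)

```python
# Minimal reproduction of the nfrac/nFrac capitalization bug.

def broken_average(xs):
    nFrac = 0.0
    for x in xs:
        nfrac += x  # typo: should be nFrac -- raises UnboundLocalError
    return nFrac / len(xs)
```

Renaming the incremented variable to match the initialization (`nFrac += x`) fixes it.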
Hi all.
I spent a lot of time trying to get the syntax right in my first work-through of this video series. I did not know much about Python before. After some time I got completely stuck around video 5 or 6.
Then I discovered this: https://www.coursera.org/course/pythonlearn, which is a completely free basic introduction to Python. Since I started that course, I have had much less trouble finding my typos etc. in the code.
The course comes with a free Python book: http://www.pythonlearn.com/book.php
I believe that the time spent on the Python course actually saved me time working my way through the video pack - so highly recommended.
Hope this helps someone else.
Thanks for sharing this :).
Figure 15.7, the pseudo-equilibrium preflop ranges at 20 BB deep in Tipton's second book: is this made with Python, and is the code available?
regards
Those solutions were found with a C++ program I wrote. It uses essentially the same algorithm described in these videos but has some more optimizations and features. The code isn't available.