AutoCFR
Xu et al. 2022
- Review of Imperfect Information Games
- Review of CFR
- Non Monte Carlo CFR variants
- Regularized Evolution
- AutoCFR
Imperfect Information Extensive Game Formalism
Extensive Form Games
- Reasoning about action histories
[Figure: rock-paper-scissors game tree. Player 1 chooses R, P, or S, then Player 2 chooses R, P, or S. Leaf payoffs are (0,0), (-1,1), or (1,-1).]
Imperfect Information Extensive Form Games
- Reasoning about information sets
[Figure: rock-paper-scissors game tree for the imperfect-information case. Player 1 and Player 2 each choose R, P, or S. Leaf payoffs are (0,0), (-1,1), or (1,-1).]
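The difference between reasoning about action histories and reasoning about information sets can be sketched in a few lines. This is an illustrative encoding only: the names `payoff` and `player2_infoset` and the string labels are assumptions made for this sketch, not notation from the slides or the paper.

```python
from itertools import product

ACTIONS = ["R", "P", "S"]
BEATS = {"R": "S", "P": "R", "S": "P"}  # each key beats its value


def payoff(a1, a2):
    """Return (u1, u2) for the terminal history (a1, a2)."""
    if a1 == a2:
        return (0, 0)
    return (1, -1) if BEATS[a1] == a2 else (-1, 1)


def player2_infoset(history, perfect_information):
    """Label the decision point Player 2 faces after Player 1's move.

    With perfect information Player 2 sees the history, so each history is
    its own decision point. With imperfect information all three histories
    fall into one information set (hypothetical label "P2").
    """
    return f"P2 after {history[0]}" if perfect_information else "P2"


# Nine terminal histories, one per (Player 1 action, Player 2 action) pair
terminal_payoffs = {h: payoff(*h) for h in product(ACTIONS, repeat=2)}

perfect = {player2_infoset((a,), True) for a in ACTIONS}
imperfect = {player2_infoset((a,), False) for a in ACTIONS}
```

In the perfect-information version Player 2 can pick a different strategy at each of three decision points. In the imperfect-information version a single strategy must serve all three histories.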
Counterfactual Regret Minimization
Subgame utility evaluation
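As a minimal sketch of the idea, regret matching in self-play on rock-paper-scissors is shown below. Because each player has a single information set here, this is the simplest special case of CFR rather than the general tree algorithm. The function names, the non-uniform starting regret, and the use of exact expected values instead of sampling are assumptions of this sketch.

```python
import numpy as np

# Player 1's payoff for (row action, column action), actions ordered R, P, S.
# Player 2 receives the negative (zero-sum).
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)


def regret_matching(cum_regret):
    """Play actions in proportion to positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))


def run_cfr(iterations, init_regret=(1.0, 0.0, 0.0)):
    """Return both players' average strategies after simultaneous updates."""
    # A non-uniform start for Player 1 so the dynamics are not trivially fixed
    regrets = [np.array(init_regret, dtype=float), np.zeros(3)]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        s1 = regret_matching(regrets[0])
        s2 = regret_matching(regrets[1])
        # Expected value of each action against the opponent's current strategy
        v1 = PAYOFF @ s2
        v2 = -(s1 @ PAYOFF)
        # Instantaneous regret: action value minus value of the current strategy
        regrets[0] += v1 - s1 @ v1
        regrets[1] += v2 - s2 @ v2
        strategy_sums[0] += s1
        strategy_sums[1] += s2
    return [s / s.sum() for s in strategy_sums]


average_strategies = run_cfr(10_000)
```

The average strategies, not the final iterates, are what approach the equilibrium, which for this game is uniform play.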
CFR Variants
(non Monte Carlo)
The tightest known convergence bound for CFR+ is 2x worse than that of vanilla CFR, but in practice CFR+ converges significantly faster.
CFR+
DCFR
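The cumulative-regret updates that distinguish these variants can be sketched side by side. This covers the regret update only, not average-strategy weighting or alternating updates. The DCFR defaults `alpha=1.5` and `beta=0.0` are the commonly cited settings, and whether `t` refers to the current or the previous iteration varies between implementations, so treat the indexing here as an assumption.

```python
import numpy as np


def cfr_update(cum_regret, inst_regret, t):
    """Vanilla CFR: accumulate instantaneous regret unchanged."""
    return cum_regret + inst_regret


def cfr_plus_update(cum_regret, inst_regret, t):
    """CFR+: accumulate, then clip negative cumulative regret to zero."""
    return np.maximum(cum_regret + inst_regret, 0.0)


def dcfr_update(cum_regret, inst_regret, t, alpha=1.5, beta=0.0):
    """DCFR: discount positive and negative cumulative regret separately.

    Positive entries are scaled by t^alpha / (t^alpha + 1) and non-positive
    entries by t^beta / (t^beta + 1) before the new regret is added.
    """
    pos_weight = t**alpha / (t**alpha + 1)
    neg_weight = t**beta / (t**beta + 1)
    discounted = np.where(cum_regret > 0,
                          cum_regret * pos_weight,
                          cum_regret * neg_weight)
    return discounted + inst_regret
```

Written this way, the three algorithms differ only in how past regret is transformed before the new regret is added.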
Regularized Evolution
(A solution to graduate student descent)
- Random architecture population initialization
- Sample subset of population
- Mutate best, evaluate, store
- Cull the old
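The loop described above can be sketched generically. The callback names (`random_individual`, `mutate`, `fitness`), the default sizes, and the toy bit-string problem are all assumptions chosen for illustration; they stand in for architectures or algorithms and are not taken from the paper.

```python
import random
from collections import deque


def regularized_evolution(random_individual, mutate, fitness,
                          population_size=20, sample_size=5,
                          cycles=200, rng=None):
    """Aging evolution: the oldest member is removed each cycle, not the worst."""
    rng = rng or random.Random(0)
    population = deque()  # left end holds the oldest individual
    history = []          # every individual ever evaluated

    # Random population initialization
    for _ in range(population_size):
        individual = random_individual(rng)
        entry = (individual, fitness(individual))
        population.append(entry)
        history.append(entry)

    for _ in range(cycles):
        # Sample a subset of the population and take its best member as parent
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda e: e[1])
        # Mutate the parent, evaluate the child, store it
        child = mutate(parent[0], rng)
        entry = (child, fitness(child))
        population.append(entry)
        history.append(entry)
        # Cull the old
        population.popleft()

    return max(history, key=lambda e: e[1])


# Toy problem (hypothetical): maximize the number of ones in a 16-bit list
def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(16)]


def flip_one(bits, rng):
    child = list(bits)
    child[rng.randrange(len(child))] ^= 1
    return child


best, score = regularized_evolution(random_bits, flip_one, sum)
```

Removing by age rather than by fitness means a lucky evaluation cannot keep an individual in the population indefinitely; it survives only through its descendants.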
AutoCFR
Back to simulink...
CFR
CFR+
DCFR
Evaluation
Clipped, normalized exploitability
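For orientation, plain exploitability in a two-player zero-sum matrix game can be computed as below. The clipping and normalization used for the evaluation score are not spelled out here, so this sketch deliberately leaves them out. The function name and the convention of averaging over the two players are assumptions.

```python
import numpy as np

# Player 1's payoff in rock-paper-scissors, actions ordered R, P, S
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)


def exploitability(s1, s2, payoff=PAYOFF):
    """Average gain available to a best-responding opponent.

    In a zero-sum game the game values cancel, so the sum of the two
    best-response values measures how far (s1, s2) is from equilibrium.
    """
    best_response_1 = np.max(payoff @ s2)     # best Player 1 reply to s2
    best_response_2 = np.max(-(s1 @ payoff))  # best Player 2 reply to s1
    return (best_response_1 + best_response_2) / 2.0


uniform = np.full(3, 1.0 / 3.0)
always_rock = np.array([1.0, 0.0, 0.0])

at_equilibrium = exploitability(uniform, uniform)
rock_vs_uniform = exploitability(always_rock, uniform)
```

A value of zero means neither player can gain by deviating; larger values mean the profile is easier to exploit.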
Algorithm
Initialize random (or bootstrapped) population
Sample a subset of the population
Mutate and ensure the candidate is a reasonable contender
Evaluate mutant & store
Algorithm
Intermediate Policy Improvement
Intermediate Policy Evaluation
Final Policy Evaluation
DDCFR
Double-clipped Discounted CFR
Results
Training
Testing
AutoCFR
By Tyler Becker