Algorithms for Unexploitable Hypersonic Weapon Defense
Professor Zachary Sunberg
July 13th, 2022
BLUF
- The maneuverability of hypersonic missiles fundamentally changes their ability to overcome defenses.
- Game theory (GT) is a mathematical framework for reasoning about interactions.
- If there is partial observability, the best strategy may be stochastic (i.e. bluffing).
- We need to outsmart opponents with hypersonic weapons in 3 domains:
- Onboard vehicle control (D1)
- Battlespace management (D2)
- Arsenal design (D3)
- Existing GT approaches have limitations for hypersonic missile defense
- We can develop new mathematical tools and algorithms for hypersonic missile defense
- Double Oracle with POMDPs
- Differential games (for D1)
- Tree search (for D2)
- Scenario-based GT planning (for D2)
Defending against Maneuverable Hypersonic Weapons: the Challenge
Ballistic
Maneuverable Hypersonic
- Sense
- Estimate
- Intercept
Every maneuver involves tradeoffs
- Energy
- Targets
- Intentions
Game Theory
Nash Equilibrium: All players play a best response.
Optimization Problem
\(\text{maximize} \quad f(x)\)
\(\text{subject to} \quad g(x) \geq 0\)
Game
Player 1: \(U_1 (a_1, a_2)\)
Player 2: \(U_2 (a_1, a_2)\)
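Example: Airborne Collision Avoidance
Payoffs are listed as (Player 1, Player 2).
| | Player 2: Up | Player 2: Down |
|---|---|---|
| Player 1: Up | -6, -6 (collision) | -1, 1 |
| Player 1: Down | 1, -1 | -4, -4 (collision) |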
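The best-response definition of a Nash equilibrium can be checked by brute force on a game this small. The sketch below assumes the payoffs are (Player 1, Player 2) with Up/Up = -6, -6, Up/Down = -1, 1, Down/Up = 1, -1, and Down/Down = -4, -4; the placement of the two off-diagonal cells is an assumption about the slide layout, and the function names are illustrative.

```python
from itertools import product

ACTIONS = ["Up", "Down"]

# PAYOFFS[(a1, a2)] = (U_1, U_2). The off-diagonal placement is an assumption.
PAYOFFS = {
    ("Up", "Up"): (-6, -6),
    ("Up", "Down"): (-1, 1),
    ("Down", "Up"): (1, -1),
    ("Down", "Down"): (-4, -4),
}


def is_pure_nash(a1, a2):
    """True if neither player can gain by unilaterally switching actions."""
    u1, u2 = PAYOFFS[(a1, a2)]
    p1_is_best = all(u1 >= PAYOFFS[(d, a2)][0] for d in ACTIONS)
    p2_is_best = all(u2 >= PAYOFFS[(a1, d)][1] for d in ACTIONS)
    return p1_is_best and p2_is_best


def pure_nash_equilibria():
    """Enumerate every pure action profile and keep the equilibria."""
    return [(a1, a2) for a1, a2 in product(ACTIONS, ACTIONS)
            if is_pure_nash(a1, a2)]


if __name__ == "__main__":
    print(pure_nash_equilibria())  # [('Up', 'Down'), ('Down', 'Up')]
```

Under these assumed payoffs the two non-colliding profiles are both pure equilibria, and swapping the two off-diagonal cells does not change that result.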
Mixed Strategies
Nash Equilibrium \(\iff\) Zero Exploitability
Exploitability (zero sum):
\[\sum_i \max_{\pi_i'} U_i(\pi_i', \pi_{-i})\]
The simplified hypersonic missile defense game (below) has no pure Nash equilibrium!
Instead, there is a mixed Nash equilibrium in which each player plays Up or Down with 50% probability.
If either player plays Up or Down more than 50% of the time, their strategy can be exploited.
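Hypersonic Missile Defense (simplified)
Payoffs are listed as (Attacker, Defender).
| | Defender: Up | Defender: Down |
|---|---|---|
| Attacker: Up | -1, 1 (collision) | 1, -1 |
| Attacker: Down | 1, -1 | -1, 1 (collision) |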
Strategy (\(\pi_i\)): probability distribution over actions
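As a minimal sketch, the zero-sum exploitability expression can be evaluated directly for the simplified missile defense game. The code assumes the first payoff in each cell belongs to the attacker and that matching actions (a collision) favor the defender; the function and variable names are illustrative.

```python
ACTIONS = ["Up", "Down"]

# Attacker payoff for (attacker_action, defender_action).
# The game is zero sum, so the defender receives the negative.
U_ATTACKER = {
    ("Up", "Up"): -1, ("Up", "Down"): 1,
    ("Down", "Up"): 1, ("Down", "Down"): -1,
}


def exploitability(pi_att, pi_def):
    """Sum over players of the best-response value against the other's strategy.

    Strategies are dicts mapping each action to its probability. A best
    response can always be found among the pure actions, so it suffices to
    maximize over ACTIONS.
    """
    best_att = max(
        sum(pi_def[d] * U_ATTACKER[(a, d)] for d in ACTIONS) for a in ACTIONS
    )
    best_def = max(
        sum(pi_att[a] * -U_ATTACKER[(a, d)] for a in ACTIONS) for d in ACTIONS
    )
    return best_att + best_def


if __name__ == "__main__":
    uniform = {"Up": 0.5, "Down": 0.5}
    skewed = {"Up": 0.7, "Down": 0.3}
    print(exploitability(uniform, uniform))  # zero at the mixed equilibrium
    print(exploitability(uniform, skewed))   # positive: the defender is exploitable
```

With both players at 50/50 the exploitability is zero. If the defender plays Up 70% of the time, the attacker's best response (always Down) gains about 0.4.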
Three Domains for GT in Hypersonic Defense
Domain 1: Onboard Vehicle Control
Domain 2: Battlespace Management
Domain 3: Arsenal Design
Existing Tools
Differential Games
- High precision control
- Assumes full observability
Deep Reinforcement Learning
Image: DeepMind
- Superhuman Performance
- Solves Biggest Problems
- Slow Offline Training
Incomplete Information Game Theory
- Mixed Strategies/Bluffing
- Discrete Board-game-like Problems
- Computational Load
POMDP (partially observable Markov decision process) Planning
- Partial Observability
- Single Player (just an optimization problem)
Innovation 1: Double Oracle (DO) with POMDPs
- Start from a default strategy profile
- Mix pure strategies
- Compute best response (solve POMDP)
- Add best response to pure strategy set, then repeat from the mixing step
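The loop above can be sketched on a small zero-sum matrix game. This is a simplified illustration under stated assumptions, not the proposed method: the best-response step is an exact search over rows and columns rather than a POMDP solve, the restricted game is solved approximately with Hedge (multiplicative weights) self-play, and the rock-paper-scissors payoff matrix is a stand-in that does not come from the slides.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def hedge_solve(A, rows, cols, iters=4000, eta=0.05):
    """Approximate equilibrium of the game restricted to `rows` x `cols`.

    Both players run Hedge against each other; the time-averaged strategies
    are returned. A[i][j] is the row player's payoff (zero sum).
    """
    cum_r, cum_c = [0.0] * len(rows), [0.0] * len(cols)
    avg_p, avg_q = [0.0] * len(rows), [0.0] * len(cols)
    for _ in range(iters):
        p = softmax([eta * c for c in cum_r])
        q = softmax([eta * c for c in cum_c])
        for i in range(len(rows)):
            avg_p[i] += p[i] / iters
            cum_r[i] += sum(q[j] * A[rows[i]][cols[j]] for j in range(len(cols)))
        for j in range(len(cols)):
            avg_q[j] += q[j] / iters
            cum_c[j] -= sum(p[i] * A[rows[i]][cols[j]] for i in range(len(rows)))
    return avg_p, avg_q


def exploitability(A, p, q):
    """Best-response value of the row player plus that of the column player."""
    n, m = len(A), len(A[0])
    row_best = max(sum(A[i][j] * q[j] for j in range(m)) for i in range(n))
    col_best = min(sum(A[i][j] * p[i] for i in range(n)) for j in range(m))
    return row_best - col_best


def double_oracle(A):
    n, m = len(A), len(A[0])
    rows, cols = [0], [0]  # default strategy profile: first action for each player
    while True:
        # Mix the pure strategies currently in each player's set.
        p_r, q_r = hedge_solve(A, rows, cols)
        p, q = [0.0] * n, [0.0] * m
        for i, r in enumerate(rows):
            p[r] = p_r[i]
        for j, c in enumerate(cols):
            q[c] = q_r[j]
        # Compute each player's best response over ALL pure strategies.
        br_row = max(range(n), key=lambda i: sum(A[i][j] * q[j] for j in range(m)))
        br_col = min(range(m), key=lambda j: sum(A[i][j] * p[i] for i in range(n)))
        if br_row in rows and br_col in cols:
            return p, q  # no new best response: stop
        # Add the new best responses to the pure strategy sets.
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


if __name__ == "__main__":
    # Illustrative game (not from the slides): rock-paper-scissors.
    RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    p, q = double_oracle(RPS)
    print(p, q, exploitability(RPS, p, q))
```

Because the restricted game is solved only approximately, the returned profile has small but not necessarily zero exploitability. An exact linear-programming solver could replace `hedge_solve` if one is available.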
\[\hat{x}', \Sigma' = \text{Filter}(\hat{x}, \Sigma, y, \pi_1)\]
\[u = \tilde{u} -K(\hat{x} - \tilde{x})\]
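A minimal sketch of the feedback law \(u = \tilde{u} - K(\hat{x} - \tilde{x})\) follows, assuming \(\tilde{x}\) and \(\tilde{u}\) are a nominal state and control, \(\hat{x}\) is the filter's state estimate, and \(K\) is a gain matrix stored as a list of rows. The function name and the numbers in the usage example are hypothetical, and the filter step is not shown.

```python
def feedback_control(x_hat, x_nom, u_nom, K):
    """Return u = u_nom - K (x_hat - x_nom).

    x_hat, x_nom: lists of state components (estimate and nominal).
    u_nom: list of nominal control components.
    K: gain matrix as a list of rows, one row per control component.
    """
    error = [xh - xn for xh, xn in zip(x_hat, x_nom)]
    return [
        un - sum(k * e for k, e in zip(row, error))
        for un, row in zip(u_nom, K)
    ]


if __name__ == "__main__":
    # Hypothetical numbers: two states, one control input.
    u = feedback_control(x_hat=[1.0, 0.0], x_nom=[0.0, 0.0],
                         u_nom=[0.5], K=[[2.0, 1.0]])
    print(u)  # [-1.5]
```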
DO Applied to Domain 1: Online Differential Games
[Peters, Sunberg, et al. 2020]
2 ms solve time with warm starting
DO Applied to Domain 1: Online Differential Games
< 3 min offline training, 2 ms online computation
DO Applied to Domain 2: Solving POMDPs Online
100-1000 ms online planning
[Sunberg and Kochenderfer, 2018]
[Sunberg and Kochenderfer, 2022]
[Gupta, Hayes, and Sunberg, 2022]
[Lim, Sunberg, et al. (in prep)]
Innovation 2: Scenario-Based GT Planning
Issue: Current counterfactual regret minimization (CFR) techniques are computationally expensive because random sampling introduces high variance
Solution: Scenario-based GT planning (idea borrowed from POMDP research)
[Somani et al. 2017]
[Garg et al. 2019]
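One way to see why fixing scenarios in advance can reduce variance is to compare two options on the same pre-sampled random events instead of on fresh draws. The sketch below is only an illustration of that idea and not the proposed algorithm: the simulator, the two actions "A" and "B", and their payoff effects are all hypothetical.

```python
import random


def payoff(action, scenario_seed):
    """Hypothetical simulator: the seed fixes every random event in a rollout."""
    rng = random.Random(scenario_seed)
    disturbance = rng.gauss(0.0, 5.0)      # randomness shared by both actions
    effect = {"A": 1.0, "B": 0.5}[action]  # hypothetical action quality
    return effect + disturbance


def independent_estimate(n, rng):
    """Estimate U(A) - U(B) using fresh random draws for every rollout."""
    a = sum(payoff("A", rng.getrandbits(32)) for _ in range(n)) / n
    b = sum(payoff("B", rng.getrandbits(32)) for _ in range(n)) / n
    return a - b


def scenario_estimate(scenarios):
    """Estimate U(A) - U(B) by evaluating both actions on the same scenarios."""
    diffs = [payoff("A", s) - payoff("B", s) for s in scenarios]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    scenarios = list(range(10))  # ten scenarios fixed in advance
    print("fixed scenarios:", scenario_estimate(scenarios))
    print("independent draws:", independent_estimate(10, random.Random(0)))
```

In this toy model the disturbance cancels exactly when both actions see the same scenario, so the scenario-based estimate recovers the true difference of 0.5, while the independent estimate still carries the sampling noise. Real problems will not cancel this cleanly, since actions usually interact with the random events.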
What We Need
- Funding
- Students (~$100k per year incl. overhead)
- Faculty and Staff
- Computing Resources
- Guidance/representative models
- Unclassified for most work
- Pipeline to high-fidelity simulation
Notional timeline (2 Faculty, 3 Students):
Apply existing diff. game approaches
Develop and implement feedback diff. game algorithms
Teach advanced dynamic GT class
Months: 6, 12, 18, 24, 32, 36
Test in high-fidelity simulation
Diff. games for onboard control
POMDP DO for battlespace mgmt
Scenario-based GT Planning
Implement on representative PO Markov game
Develop decentralized composable algorithm
Other
Organize workshop at conference
Develop and implement basic algorithm
Test in realistic simulation
Develop Mathematical Theory
Test in realistic simulation
Deliverables: Reference implementations
Hybrid Games
Student Internship at contractor
Additional Material
Our Expertise
Game Theory for Space Domain Awareness
Our Expertise
Open Source Julia Software
POMDPs.jl - An interface for defining and solving MDPs and POMDPs
- Executes as fast as C
- As easy to write and read as Python/MATLAB
Our Technical Approach
- Partially Observable Markov Games
- Principled Tractable Approximations
- Online Solutions First
- Focus on Composability
- Fast Prototyping in Julia
Research Model
Mathematical Frameworks
Algorithm Development
Realistic Simulation
Physical Tests
Proposed Research
Develop
- Mathematical Frameworks
- Algorithms
to compute unexploitable strategies for hypersonic weapon defense systems.
Strategies: probability distributions describing what all players should do.
Game Theory
Nash Equilibrium: All players play a best response.
Example: Stag Hunt
Optimization Problem
\(\text{maximize} \quad f(x)\)
\(\text{subject to} \quad g(x) \geq 0\)
Game
Player 1: \(U_1 (a_1, a_2)\)
Player 2: \(U_2 (a_1, a_2)\)
Strategy (\(\pi_i\)): probability distribution over actions
Payoffs are listed as (Player 1, Player 2).
| | Player 2: Stag | Player 2: Hare |
|---|---|---|
| Player 1: Stag | 10, 10 | 1, 8 |
| Player 1: Hare | 8, 1 | 5, 5 |
Thank You