Algorithms for Unexploitable Hypersonic Weapon Defense

Professor Zachary Sunberg

July 13th, 2022

BLUF

The maneuverability of hypersonic missiles fundamentally changes their ability to overcome defenses.
Game theory (GT) is a mathematical framework for reasoning about interactions.
- If there is partial observability, the best strategy may be stochastic (i.e. bluffing).
We need to outsmart opponents with hypersonic weapons in 3 domains:
- Onboard vehicle control (D1)
- Battlespace management (D2)
- Arsenal design (D3)
Existing GT approaches have limitations for hypersonic missile defense
We can develop new mathematical tools and algorithms for hypersonic missile defense
- Double Oracle with POMDPs
  - Differential games (for D1)
  - Tree search (for D2)
- Scenario-based GT planning (for D2)

Defending against Maneuverable Hypersonic Weapons: the Challenge

Ballistic

Maneuverable Hypersonic

Sense
Estimate
Intercept

Every maneuver involves tradeoffs

Energy
Targets
Intentions

Game Theory

Nash Equilibrium: All players play a best response.

Optimization Problem

$\text{maximize} \quad f(x)$

$\text{subject to} \quad g(x) \geq 0$

Game

Player 1: $U_1 (a_1, a_2)$

Player 2: $U_2 (a_1, a_2)$
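
The best-response condition can be written out explicitly. As a sketch, assume $\pi_i$ is player $i$'s strategy, $\pi_{-i}$ collects the other players' strategies, and $U_i$ is read as expected utility when strategies are randomized; the starred symbols are notation introduced here. A strategy profile $\pi^*$ is then a Nash equilibrium when

\[U_i(\pi_i^*, \pi_{-i}^*) \geq U_i(\pi_i, \pi_{-i}^*) \quad \text{for every player } i \text{ and every alternative strategy } \pi_i\]

Unlike the single-objective optimization problem, each player's objective here depends on what the other player does, so no player can be optimized in isolation.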

Example: Airborne Collision Avoidance

Player 1 and Player 2 each choose Up or Down. Payoffs (Player 1, Player 2): (-6, -6) and (-4, -4) when the aircraft collide; (-1, 1) and (1, -1) otherwise.

Mixed Strategies

Nash Equilibrium $\iff$ Zero Exploitability

Exploitability (zero sum):

\[\sum_i \max_{\pi_i'} U_i(\pi_i', \pi_{-i})\]

No Pure Nash Equilibrium!

Instead, there is a mixed Nash equilibrium in which each player plays up or down with 50% probability.

If either player plays up or down more than 50% of the time, their strategy can be exploited.

Hypersonic Missile Defense (simplified)

Attacker and Defender each choose Up or Down. Payoffs (Attacker, Defender): (-1, 1) when the choices produce a collision; (1, -1) otherwise.

Strategy ($\pi_i$): probability distribution over actions
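
The zero-sum exploitability sum can be evaluated directly for a small matrix game. The sketch below is illustrative only: the payoff matrix `A` is a hypothetical matching-pennies-style game chosen to have a 50/50 equilibrium, and the function name `exploitability` is introduced here, not taken from the slides.

```python
def exploitability(A, x, y):
    """Zero-sum exploitability of the strategy pair (x, y).

    A[i][j] is the row player's payoff; the column player receives -A[i][j].
    x and y are probability distributions over rows and columns.
    """
    # Best payoff the row player could obtain against y
    row_best = max(sum(a * q for a, q in zip(row, y)) for row in A)
    # Best payoff the column player could obtain against x
    col_best = max(
        -sum(A[i][j] * x[i] for i in range(len(A)))
        for j in range(len(A[0]))
    )
    return row_best + col_best


# Hypothetical attacker payoffs (rows: attacker Up, Down; columns: defender Up, Down)
A = [[-1, 1],
     [1, -1]]

balanced = exploitability(A, [0.5, 0.5], [0.5, 0.5])  # both players mix 50/50
skewed = exploitability(A, [0.7, 0.3], [0.5, 0.5])    # attacker favors Up
```

With these assumed payoffs the 50/50 profile has zero exploitability, while the attacker that plays Up 70% of the time can be exploited by a defender that always plays Up.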

Three Domains for GT in Hypersonic Defense

Domain 1: Onboard Vehicle Control

Domain 2: Battlespace Management

Domain 3: Arsenal Design

Existing Tools

Differential Games

High precision control
Assumes full observability

Deep Reinforcement Learning

Image: DeepMind

Superhuman Performance
Solves Biggest Problems
Slow Offline Training

Incomplete Information Game Theory

Mixed Strategies/Bluffing
Discrete Board-game-like Problems
Computational Load

POMDP* Planning

Partial Observability
Single Player (just an optimization problem)

*Partially observable Markov decision process

Innovation 1: Double Oracle (DO) with POMDPs

Default Strategy Profile

Mix Pure Strategies

Compute Best Response (Solve POMDP)

Add Best Response to Pure Strategy Set

Repeat from Mix Pure Strategies until no new best response is found

[McAleer et al., 2022]
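
The double oracle loop can be sketched on a small zero-sum matrix game. This is a simplified stand-in, not the proposed method: the best-response oracle is a plain argmax over pure actions instead of a POMDP solve, and the restricted game is mixed approximately with Hedge (multiplicative weights) self-play. The names `hedge_solve` and `double_oracle` and the rock-paper-scissors payoffs are assumptions for illustration.

```python
import math


def hedge_solve(A, rows, cols, iters=5000):
    """Approximate equilibrium of the game restricted to rows x cols.

    A[i][j] is the row player's payoff (assumed to lie in [-1, 1]); the
    column player receives -A[i][j]. Both players run Hedge and the
    time-averaged strategies are returned.
    """
    def softmax(gains, eta):
        top = max(gains)
        w = [math.exp(eta * (g - top)) for g in gains]
        total = sum(w)
        return [v / total for v in w]

    eta_r = math.sqrt(8 * math.log(len(rows)) / iters) / 2
    eta_c = math.sqrt(8 * math.log(len(cols)) / iters) / 2
    gain_r, gain_c = [0.0] * len(rows), [0.0] * len(cols)
    avg_x, avg_y = [0.0] * len(rows), [0.0] * len(cols)
    for _ in range(iters):
        x, y = softmax(gain_r, eta_r), softmax(gain_c, eta_c)
        for a, i in enumerate(rows):
            gain_r[a] += sum(A[i][j] * y[b] for b, j in enumerate(cols))
        for b, j in enumerate(cols):
            gain_c[b] -= sum(A[i][j] * x[a] for a, i in enumerate(rows))
        avg_x = [s + p / iters for s, p in zip(avg_x, x)]
        avg_y = [s + p / iters for s, p in zip(avg_y, y)]
    return avg_x, avg_y


def double_oracle(A, start=(0, 0)):
    """Grow each player's pure strategy set until no new best response appears."""
    n, m = len(A), len(A[0])
    rows, cols = [start[0]], [start[1]]       # default strategy profile
    while True:
        x, y = hedge_solve(A, rows, cols)     # mix pure strategies
        # Value of every pure action in the FULL game against the restricted mix
        row_vals = [sum(A[i][j] * y[b] for b, j in enumerate(cols)) for i in range(n)]
        col_vals = [-sum(A[i][j] * x[a] for a, i in enumerate(rows)) for j in range(m)]
        br_row = max(range(n), key=row_vals.__getitem__)   # compute best responses
        br_col = max(range(m), key=col_vals.__getitem__)
        grew = False
        if br_row not in rows:                # add best responses to the sets
            rows.append(br_row)
            grew = True
        if br_col not in cols:
            cols.append(br_col)
            grew = True
        if not grew:
            return rows, cols, x, y, max(row_vals) + max(col_vals)


# Rock-paper-scissors payoffs for the row player (hypothetical test game)
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
rows, cols, x, y, exploit = double_oracle(RPS)
```

Starting from a single pure strategy per player, the loop should add the remaining actions one at a time and end with a near-uniform mix. In the proposed setting the argmax step would be replaced by a POMDP solver.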

\[\hat{x}', \Sigma' = \text{Filter}(\hat{x}, \Sigma, y, \pi_1)\]

\[u = \tilde{u} -K(\hat{x} - \tilde{x})\]
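
The feedback law can be evaluated with a few lines of code. The sketch assumes an interpretation the slide does not spell out: $\tilde{u}$ and $\tilde{x}$ are a nominal control and state, $\hat{x}$ is the filter's state estimate, and $K$ is a gain matrix. The function name and all numbers are hypothetical.

```python
def feedback_control(u_nominal, K, x_estimate, x_nominal):
    """Return u = u_nominal - K (x_estimate - x_nominal) for list-based vectors."""
    # Deviation of the estimated state from the nominal state
    deviation = [xe - xn for xe, xn in zip(x_estimate, x_nominal)]
    # Matrix-vector product K * deviation, one entry per control channel
    correction = [sum(k * d for k, d in zip(row, deviation)) for row in K]
    return [un - c for un, c in zip(u_nominal, correction)]


# Hypothetical 2-D state and 1-D control
u = feedback_control(
    u_nominal=[0.5],
    K=[[2.0, 1.0]],
    x_estimate=[1.5, 0.5],
    x_nominal=[1.0, 0.5],
)
```

Here the estimate is ahead of the nominal state in the first coordinate, so the correction pulls the control below its nominal value.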

DO Applied to Domain 1: Online Differential Games

[Peters, Sunberg, et al. 2020]

2 ms solve time with warm starting

DO Applied to Domain 1: Online Differential Games

[Peters et al. 2022]

< 3 min offline training, 2 ms online computation

DO Applied to Domain 2: Solving POMDPs Online

100-1000 ms online planning

[Sunberg and Kochenderfer, 2018]

[Sunberg and Kochenderfer, 2022]

[Gupta, Hayes, and Sunberg, 2022]

[Lim, Sunberg, et al. (in prep)]

Innovation 2: Scenario-Based GT Planning

Issue: Current counterfactual regret minimization (CFR) techniques require heavy computation because random sampling produces high-variance estimates

Solution: Scenario-based GT planning (idea borrowed from POMDP research)

[Somani et al. 2017]

[Garg et al. 2019]
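
One way to read the scenario idea: draw the random numbers once, then evaluate every candidate strategy on the same fixed scenarios, so that comparisons between strategies are not blurred by fresh sampling noise. The sketch below is a toy illustration of that principle only, not the planning algorithm proposed here or the algorithms in the cited papers. The 1-D tracking task and the names `make_scenarios`, `simulate`, and `evaluate` are all hypothetical.

```python
import random


def make_scenarios(num_scenarios, horizon, seed=0):
    """Pre-sample every random number the simulator will need, once."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(horizon)] for _ in range(num_scenarios)]


def simulate(gain, scenario, x0=1.0):
    """Hypothetical 1-D task: drive the miss distance x toward zero.

    Each scenario entry in [0, 1) fixes that step's disturbance, so the
    rollout is a deterministic function of (gain, scenario).
    """
    x = x0
    total = 0.0
    for sample in scenario:
        disturbance = 0.1 * (sample - 0.5)
        x = x - gain * x + disturbance
        total -= abs(x)  # reward is the negative miss distance
    return total


def evaluate(gain, scenarios):
    """Average return of a proportional-gain strategy over the fixed scenarios."""
    return sum(simulate(gain, s) for s in scenarios) / len(scenarios)


scenarios = make_scenarios(num_scenarios=20, horizon=10)
value_cautious = evaluate(0.2, scenarios)
value_aggressive = evaluate(0.8, scenarios)
```

Because both strategies see identical disturbances, repeated evaluation gives the same number every time, and the gap between the two values reflects the strategies rather than the sampling.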

What We Need

Funding
- Students
  (~$100k per year incl. overhead)
- Faculty and Staff
- Computing Resources

Guidance/representative models
- Unclassified for most work
- Pipeline to high-fidelity simulation

Notional timeline (2 Faculty, 3 Students):

Apply existing diff. game approaches

Develop and implement feedback diff. game algorithms

Teach advanced dynamic GT class

Months:

Test in high fidelity simulation

Diff. games for onboard control

POMDP DO for battlespace mgmt

Scenario-based GT Planning

Implement on representative PO Markov game

Develop decentralized composable algorithm

Other

Organize workshop at conference

Develop and implement basic algorithm

Test in realistic simulation

Develop Mathematical Theory

Test in realistic simulation

Deliverables: Reference implementations

Hybrid Games

Student Internship at contractor

Additional Material

Our Expertise

Game Theory for Space Domain Awareness

Our Expertise

Open Source Julia Software

POMDPs.jl - An interface for defining and solving MDPs and POMDPs

Executes as fast as C
As easy to write and read as Python/MATLAB

Our Technical Approach

Partially Observable Markov Games
Principled Tractable Approximations
Online Solutions First
Focus on Composability
Fast Prototyping in Julia

Research Model

Mathematical Frameworks

Algorithm Development

Realistic Simulation

Physical Tests

Proposed Research

Develop mathematical frameworks and algorithms to compute unexploitable strategies for hypersonic weapon defense systems

Strategies: probability distributions describing what all players should do

Game Theory

Nash Equilibrium: All players play a best response.

Example: Stag Hunt

Optimization Problem

$\text{maximize} \quad f(x)$

$\text{subject to} \quad g(x) \geq 0$

Game

Player 1: $U_1 (a_1, a_2)$

Player 2: $U_2 (a_1, a_2)$

Strategy ($\pi_i$): probability distribution over actions

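Stag Hunt payoffs (Player 1, Player 2):
- Player 1 Stag, Player 2 Stag: 10, 10
- Player 1 Stag, Player 2 Hare: 1, 8
- Player 1 Hare, Player 2 Stag: 8, 1
- Player 1 Hare, Player 2 Hare: 5, 5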
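
The best-response definition can be checked mechanically on the Stag Hunt. The sketch below assumes the standard layout of the payoffs listed on this slide (a lone stag hunter gets 1 while the hare hunter gets 8); the function name `pure_nash_equilibria` is introduced here for illustration.

```python
def pure_nash_equilibria(U1, U2):
    """Return all (row, column) action pairs where neither player gains by deviating alone."""
    n_rows, n_cols = len(U1), len(U1[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            # Player 1 compares against other rows, holding the column fixed
            row_is_best = all(U1[r][c] >= U1[r2][c] for r2 in range(n_rows))
            # Player 2 compares against other columns, holding the row fixed
            col_is_best = all(U2[r][c] >= U2[r][c2] for c2 in range(n_cols))
            if row_is_best and col_is_best:
                equilibria.append((r, c))
    return equilibria


actions = ["Stag", "Hare"]
# Rows are Player 1's action, columns are Player 2's action (assumed layout)
U1 = [[10, 1],
      [8, 5]]
U2 = [[10, 8],
      [1, 5]]
result = [(actions[r], actions[c]) for r, c in pure_nash_equilibria(U1, U2)]
```

Under this layout the check finds two pure equilibria, both players hunting stag and both hunting hare, which is what makes the game a coordination problem.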
