Problem 4
Problem 5
First step for making it fast (in any language, not just Julia):
Find out what is slow (by profiling)!
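A minimal sketch of what profiling might look like in Julia, using the `Profile` standard library. The function `slow_sum` is a hypothetical stand-in for whatever code is being optimized; it is not from the original problem.

```julia
using Profile

# Hypothetical stand-in for the code under investigation.
function slow_sum(n)
    total = 0.0
    for i in 1:n
        total += sqrt(i)
    end
    return total
end

slow_sum(10)             # run once first so compilation time is not profiled
Profile.clear()          # discard any previously collected samples
@profile slow_sum(10^8)  # collect stack samples while the call runs
Profile.print()          # print a call tree with sample counts per line
```

Lines with the most samples are where the time goes, so those are the ones worth optimizing first.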
\(U' = B[U]\)
\(B[U](s) = \max_a \underbrace{\left( R(s, a) + \gamma \sum_{s'} T(s' \mid s, a) U(s')\right)}_{Q(s,a)}\)
\(i\) = index of \(s\); \(j\) = index of \(s'\)
Naive implementation:
\(U'[i] = \max_a \left(R[a][i] + \gamma \sum_j T[a][i, j] U[j] \right)\)
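The naive update translates almost directly into nested loops. The sketch below assumes `R` is a vector of reward vectors (one per action) and `T` is a vector of transition matrices (one per action), matching the `R[a][i]` and `T[a][i, j]` indexing above; the function name `bellman_backup_naive` is made up for illustration.

```julia
# One Bellman backup, written as explicit loops.
# R[a][i]   : reward for taking action a in state i
# T[a][i,j] : probability of moving from state i to state j under action a
# U[j]      : current value estimate for state j
function bellman_backup_naive(R, T, U, γ)
    n = length(U)
    Unew = zeros(n)
    for i in 1:n
        best = -Inf
        for a in eachindex(R)
            q = R[a][i]
            for j in 1:n
                q += γ * T[a][i, j] * U[j]   # accumulate the sum over next states
            end
            best = max(best, q)              # maximize over actions
        end
        Unew[i] = best
    end
    return Unew
end
```

This does the right thing, but the innermost loop over `j` is a hand-written version of an operation that linear algebra libraries already provide.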
\(y = Mx\)
\(y[i] = \sum_j M[i,j] x[j]\)
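The sum over \(j\) in the naive update has exactly the form of a matrix-vector product with \(M = T[a]\) and \(x = U\), so the inner loop can be replaced by `T[a] * U`. A sketch under the same assumed layout (`R` a vector of reward vectors, `T` a vector of transition matrices); the name `bellman_backup_matvec` is hypothetical.

```julia
# One Bellman backup using a matrix-vector product per action.
function bellman_backup_matvec(R, T, U, γ)
    # Q[a] is the vector of Q(s, a) over all states s for action a.
    Q = [R[a] + γ * (T[a] * U) for a in eachindex(R)]
    # Elementwise maximum across actions gives the updated value vector.
    return reduce((x, y) -> max.(x, y), Q)
end

R = [[1.0, 0.0], [0.0, 2.0]]
T = [[1.0 0.0; 0.0 1.0], [0.0 1.0; 1.0 0.0]]
U = [1.0, 3.0]
bellman_backup_matvec(R, T, U, 0.5)   # [1.5, 2.5]
```

For dense floating-point matrices, `T[a] * U` is handled by an optimized library routine rather than an interpreted loop, which is typically where the speedup comes from. Whether it helps in a particular case is something the profiler should confirm.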