Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger
(Microsoft, McGill University)
AAAI 2018
Presented by Zachary Sunberg, August 26, 2021
Why not?
Note: would expect more in a normal presentation
Half Cheetah
Hopper
Algorithms:
Not reported, 2 million timesteps in learning curves
"simply multiplying the rewards generated from an environment by some scalar"
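A minimal sketch of what the quoted reward scaling amounts to. Everything here is an illustrative assumption rather than the paper's code: `ToyEnv` is a hypothetical stand-in environment whose `step` returns `(observation, reward, done)`, and `ScaledRewardEnv` and the scale value 0.5 are made up for the example.

```python
class ToyEnv:
    """Hypothetical stand-in environment: reward 1.0 per step, ends after 3 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3


class ScaledRewardEnv:
    """Wraps an environment and multiplies every reward by a fixed scalar."""

    def __init__(self, env, scale):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done = self.env.step(action)
        # The only change to the task: the learner sees scale * reward.
        return obs, self.scale * reward, done


def episode_return(env):
    """Sum the rewards the learner would see over one episode."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done = env.step(action=None)
        total += reward
    return total


unscaled = episode_return(ToyEnv())
scaled = episode_return(ScaledRewardEnv(ToyEnv(), scale=0.5))
```

The underlying task is unchanged; only the magnitude of the learning signal differs, so the scale acts as one more hyperparameter that should be reported.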
"Unfortunately, in recent reported results, it is not uncommon for the top-N trials to be selected from among several trials (Wu et al. 2017; Mnih et al. 2016)"
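A small sketch of why top-N selection matters. The returns below are synthetic draws from an assumed Gaussian (mean 1000, standard deviation 300), not results from any paper; `simulate_final_returns` and `mean_of_top_n` are hypothetical helper names.

```python
import random
import statistics


def simulate_final_returns(num_trials, seed):
    """Draw synthetic final returns for independent trials (assumed Gaussian)."""
    rng = random.Random(seed)
    return [rng.gauss(1000.0, 300.0) for _ in range(num_trials)]


def mean_of_top_n(returns, n):
    """Average only the n best trials, as in top-N reporting."""
    return statistics.mean(sorted(returns, reverse=True)[:n])


returns = simulate_final_returns(num_trials=10, seed=0)
all_mean = statistics.mean(returns)
top3_mean = mean_of_top_n(returns, n=3)

print(f"mean of all 10 trials: {all_mean:.1f}")
print(f"mean of top 3 trials:  {top3_mean:.1f}")
```

The top-N average can never fall below the average over all trials, so reporting it overstates typical performance whenever trials vary.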
Without the publication of implementations and related details, wasted effort on reproducing state-of-the-art works will plague the community and slow down progress.
Positive:
Negative:
Over 1000 citations, including