Value Function


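The value function is a prediction of total future Reward Signals. It is used to evaluate how good or bad a state in the Environment is. It is defined as the expected discounted return when starting in state \(s\) and following policy \(\pi\), where \(\gamma \in [0, 1]\) is the discount factor: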

\[v_{\pi}(s) = \mathbb{E}_{\pi}[R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots | S_t = s]\]
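
The expectation above can be approximated by averaging sampled returns. The following is a minimal sketch, assuming finite episodes and a simple Monte Carlo average; the function names `discounted_return` and `estimate_value` and the reward sequences are hypothetical and chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Return R_{t+1} + gamma * R_{t+2} + gamma^2 * R_{t+3} + ... for one episode."""
    g = 0.0
    # Work backwards from the last reward: G = r + gamma * G_next.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_value(episodes, gamma):
    """Average the discounted returns of episodes that all start in the same state s."""
    returns = [discounted_return(rewards, gamma) for rewards in episodes]
    return sum(returns) / len(returns)


# Hypothetical reward sequences (R_{t+1}, R_{t+2}, ...) observed after visiting
# state s while following some fixed policy pi.
episodes = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 4.0],
]
v_s = estimate_value(episodes, gamma=0.5)
print(v_s)
```

With these inputs the two returns are 1.5 and 1.0, so the estimate of \(v_{\pi}(s)\) is 1.25. More sampled episodes would bring the average closer to the true expectation.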

Created: 2021-11-13
