Environment Model


An environment model is a model that predicts what an RL environment will do next. Because we work with Markov Decision Processes, we can model the dynamics of the environment by learning its state transition probabilities and its reward function. These are formally defined below.

\[P^a_{ss'} = \mathbb{P}[S_{t+1} = s' | S_t = s, A_t = a]\]

\[R^a_s = \mathbb{E}[R_{t+1} | S_t = s, A_t = a]\]
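
One simple way to learn both quantities, assuming a small discrete state and action space, is to count observed transitions and average observed rewards. The sketch below is illustrative only: the class name `TabularModel`, its method names, and the sample transitions are assumptions made for this example, not part of any particular library.

```python
from collections import defaultdict


class TabularModel:
    """Count-based estimate of P^a_{ss'} and R^a_s for a small discrete MDP."""

    def __init__(self):
        # next_counts[(s, a)][s_next] = times s_next followed (s, a)
        self.next_counts = defaultdict(lambda: defaultdict(int))
        # reward_sum[(s, a)] = total reward observed after taking a in s
        self.reward_sum = defaultdict(float)
        # visits[(s, a)] = times action a was taken in state s
        self.visits = defaultdict(int)

    def update(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s_next)."""
        self.next_counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def transition_prob(self, s, a, s_next):
        """Empirical estimate of P[S_{t+1} = s_next | S_t = s, A_t = a]."""
        n = self.visits.get((s, a), 0)
        if n == 0:
            return 0.0  # assumption: unseen pairs get probability 0
        return self.next_counts[(s, a)].get(s_next, 0) / n

    def expected_reward(self, s, a):
        """Empirical estimate of E[R_{t+1} | S_t = s, A_t = a]."""
        n = self.visits.get((s, a), 0)
        if n == 0:
            return 0.0  # assumption: unseen pairs get reward 0
        return self.reward_sum[(s, a)] / n


# Hypothetical experience: four transitions from the same state-action pair.
model = TabularModel()
experience = [
    ("s0", "right", 0.0, "s1"),
    ("s0", "right", 1.0, "s2"),
    ("s0", "right", 0.0, "s1"),
    ("s0", "right", 1.0, "s1"),
]
for s, a, r, s_next in experience:
    model.update(s, a, r, s_next)

print(model.transition_prob("s0", "right", "s1"))  # 0.75
print(model.expected_reward("s0", "right"))        # 0.5
```

With enough samples, these empirical frequencies and averages approach the true transition probabilities and expected rewards. Larger or continuous problems would typically replace the tables with a function approximator.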

Created: 2021-11-13
