Policy Function


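A policy is a mapping from states to actions that defines the agent's behaviour. A deterministic policy is formally defined as \(a = \pi(s)\). A policy can also be stochastic, in which case it gives the probability of taking each action in a given state: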

\[\pi(a|s) = \mathbb{P}[A_t = a | S_t = s]\]
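
As a rough illustration of the two definitions, the sketch below stores a deterministic policy as a state-to-action lookup and a stochastic policy as one probability distribution over actions per state. The state names, action names, probabilities, and the helper functions `act_deterministic` and `act_stochastic` are all made up for this example and are not part of the original notes.

```python
import random

# Hypothetical toy problem: the states and actions are illustrative only.
STATES = ["low_battery", "high_battery"]
ACTIONS = ["recharge", "search"]

# Deterministic policy a = pi(s): each state maps to exactly one action.
deterministic_policy = {
    "low_battery": "recharge",
    "high_battery": "search",
}

# Stochastic policy pi(a | s) = P[A_t = a | S_t = s]:
# each state maps to a distribution over actions (rows sum to 1).
stochastic_policy = {
    "low_battery": {"recharge": 0.9, "search": 0.1},
    "high_battery": {"recharge": 0.2, "search": 0.8},
}


def act_deterministic(state):
    """Return the single action the deterministic policy prescribes."""
    return deterministic_policy[state]


def act_stochastic(state, rng):
    """Sample an action from pi(. | state) using the given random generator."""
    dist = stochastic_policy[state]
    actions = list(dist)
    weights = [dist[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so repeated runs match
    print(act_deterministic("low_battery"))
    print([act_stochastic("low_battery", rng) for _ in range(5)])
```

A deterministic policy can be seen as the special case of a stochastic one in which a single action has probability 1 in every state.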

Created: 2021-11-13
