Likewise, the environment at timestep \(t\):
- receives action \(a_t\)
- emits observation \(o_t\)
- emits reward \(r_t\)
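
The three bullets above amount to one function call per timestep. The following is only a minimal sketch: `ToyEnvironment`, its `step` method, the constant observation, and the reward rule are hypothetical choices made for illustration, not any particular library's API.

```python
class ToyEnvironment:
    """Hypothetical environment that rewards the agent for choosing action 1."""

    def step(self, action):
        """Receive action a_t, then emit observation o_t and reward r_t."""
        observation = "screen"  # assumed placeholder for whatever the agent gets to see
        reward = 1.0 if action == 1 else 0.0  # assumed reward rule
        return observation, reward


env = ToyEnvironment()
for t, action in enumerate([0, 1, 1]):
    observation, reward = env.step(action)
    print(f"t={t}: a_t={action} -> o_t={observation!r}, r_t={reward}")
```

Returning the observation and reward together from one call keeps the agent-side loop simple: send an action, read back two values, repeat.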
1. Environment State
The environment state \(S^{e}_{t}\) is the environment's private representation. This could be all the internal variables inside the Atari game that the agent cannot see. An RL algorithm cannot depend on these values, because it does not have access to them!
This state is, however, what is used to generate the observation \(o_t\) and reward \(r_t\).
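
One way to picture this is an environment object that keeps its state in a private attribute and exposes only values derived from it. The sketch below is illustrative only: `HiddenStateEnv`, the `position` and `secret_goal` fields, and the reward rule are all made-up assumptions, and the leading underscore is a Python convention for "private", not an enforced restriction.

```python
import random


class HiddenStateEnv:
    """Hypothetical environment whose internal state is hidden from the agent."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Private environment state S^e_t: the agent never reads this directly.
        self._state = {"position": 0, "secret_goal": rng.randint(3, 6)}

    def step(self, action):
        """Update the private state, then derive o_t and r_t from it."""
        self._state["position"] += action
        # The observation exposes only part of the state (the position, not the goal).
        observation = self._state["position"]
        # The reward is computed from the full private state.
        at_goal = self._state["position"] == self._state["secret_goal"]
        reward = 1.0 if at_goal else 0.0
        return observation, reward


env = HiddenStateEnv()
for t in range(6):
    observation, reward = env.step(1)  # the agent always moves one step to the right
    print(f"t={t}: o_t={observation}, r_t={reward}")
    if reward > 0:
        break
```

The agent here only ever handles `observation` and `reward`. It can find the goal by acting and watching the reward, but it never learns `secret_goal` by inspecting the environment's internals.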