
Hi Arald, thank you for reading my article!

I'm not sure I fully understand the question, but I don't think I would call it a bias.

Post-decision states make the deterministic part of the state transition (which depends on the current state and action) explicit. I can see how this might be perceived as a sort of 'pre-positioning' before the stochastic part (the random environment information) is drawn. However, the deterministic step is not really taken before the actual transition; it is an integral part of the step from s_t to s_{t+1}. Most textbooks simply combine both parts implicitly into a single transition process.
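To make the decomposition concrete, here is a minimal sketch on a toy inventory problem. The problem itself, the function names, and the demand range are assumptions chosen for illustration only; they are not taken from the article. The state is the stock level, the action is the order quantity, and the random information is the demand.

```python
import random


def post_decision_state(state, action):
    # Deterministic part: stock after ordering, before demand is revealed.
    return state + action


def exogenous_step(post_state, demand):
    # Stochastic part: apply the realised demand (stock cannot go negative).
    return max(post_state - demand, 0)


def transition(state, action, demand):
    # Textbook-style single transition: the same two parts, composed.
    return exogenous_step(post_decision_state(state, action), demand)


rng = random.Random(0)
state, action = 3, 4
demand = rng.randint(0, 10)  # hypothetical demand distribution

post_state = post_decision_state(state, action)
next_state = exogenous_step(post_state, demand)

# Both routes arrive at the same s_{t+1}; the post-decision state is only
# an intermediate bookkeeping point inside one transition.
assert next_state == transition(state, action, demand)
```

In this sketch the post-decision state is never a separate time step: it is just the point inside the transition where the deterministic effect of the action has been applied and the random information has not yet arrived.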


Wouter van Heeswijk, PhD

Written by Wouter van Heeswijk, PhD

Assistant professor in Financial Engineering and Operations Research. Writing about reinforcement learning, optimization problems, and data science.
