Date: 2008
Type: Working Paper
Learning within a Markovian Environment
Working Paper, EUI ECO, 2008/13
RIVAS, Javier, Learning within a Markovian Environment, EUI ECO, 2008/13 - https://hdl.handle.net/1814/8084
Retrieved from Cadmus, EUI Research Repository
We investigate learning in a setting where each period a population has to choose
between two actions and the payoff of each action is unknown to the players. The population
learns according to reinforcement and the environment is non-stationary, meaning
that there is correlation between the payoff of each action today and its payoff
in the past. We show that when players observe realized and foregone payoffs, a
suboptimal mixed strategy is selected. On the other hand, when players only observe
realized payoffs, a unique action, which is optimal if the actions perform differently enough, is
selected in the long run. When looking for efficient reinforcement learning rules, we find
that it is optimal to disregard the information from foregone payoffs and to learn as if
only realized payoffs were observed.
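To make the setting concrete, the following is a minimal illustrative sketch, not the paper's model: a two-action choice problem whose payoffs are governed by a persistent (Markovian) state, with a simple propensity-based reinforcement learner that either uses only its realized payoff or also the foregone payoff. The state space, payoff values, update rule, and all parameters (TRANSITION, PAYOFFS, N_PERIODS) are assumptions made for illustration only.

```python
# Illustrative sketch only: a simple reinforcement learner in a Markovian
# two-action environment. All payoff values, the update rule, and the
# parameters below are assumptions for illustration, not the paper's model.
import random

N_PERIODS = 10_000
TRANSITION = 0.1          # prob. the environment switches state each period
PAYOFFS = {0: (1.0, 0.0),  # state 0: action A pays 1, action B pays 0
           1: (0.0, 1.0)}  # state 1: action A pays 0, action B pays 1

def simulate(observe_foregone: bool, seed: int = 0) -> float:
    """Run one reinforcement learner; return its average payoff per period."""
    rng = random.Random(seed)
    state = 0
    propensity = [1.0, 1.0]   # initial propensities for actions A and B
    total = 0.0
    for _ in range(N_PERIODS):
        # Choose an action with probability proportional to its propensity.
        p_a = propensity[0] / (propensity[0] + propensity[1])
        action = 0 if rng.random() < p_a else 1
        realized = PAYOFFS[state][action]
        foregone = PAYOFFS[state][1 - action]
        total += realized
        # Reinforcement: the realized payoff reinforces the chosen action;
        # if foregone payoffs are observed, they reinforce the other action.
        propensity[action] += realized
        if observe_foregone:
            propensity[1 - action] += foregone
        # Markovian environment: the state persists with prob. 1 - TRANSITION,
        # so today's payoffs are correlated with past payoffs.
        if rng.random() < TRANSITION:
            state = 1 - state
    return total / N_PERIODS

if __name__ == "__main__":
    print("avg payoff, realized + foregone:", simulate(observe_foregone=True))
    print("avg payoff, realized only:      ", simulate(observe_foregone=False))
```

Running the two variants side by side illustrates the comparison the abstract describes: a rule that also reinforces with foregone payoffs versus one that uses realized payoffs only; the specific numbers produced here depend entirely on the assumed parameters above.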
Cadmus permanent link: https://hdl.handle.net/1814/8084
ISSN: 1725-6704
Series/Number: EUI ECO; 2008/13
Publisher: European University Institute