Reinhard Selten

From Learning Direction Theory to Generalized Impulse Balance

Thursday, 21 August 2014
12:00 - 12:30 hrs CEST

Abstract


This paper describes a line of research on learning by experience in repeated games and dynamic decision situations. The behavior is modelled as an unconsciously performed decision algorithm involving very little conscious deliberation. We consider three types of theoretical concepts:

1) Learning Direction Theory
2) Impulse Equilibrium
3) Generalized Impulse Balance

A player i’s period payoff depends only on the period strategies π0 of a random player and the strategies π1,…,πn of n personal players, but not on the period t. At the beginning of each period t, a personal player i compares the payoff xi he obtained in period t-1 with the maximal payoff yi he could have obtained in t-1 by other strategies ρi, given the strategies actually used by the other players. A positive difference zi=yi-xi indicates an impulse from πi to ρi.
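The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `best_alternative_impulse`, the payoff table `PAYOFFS`, and the strategy labels are hypothetical and do not come from the paper, and the random player is left out for brevity.

```python
def best_alternative_impulse(payoff, own_strategies, played, others):
    """Return (x, y, z, rho) for one personal player in period t-1.

    x   : payoff actually obtained with the played strategy
    y   : maximal payoff obtainable with some OTHER strategy rho,
          holding the other players' strategies fixed
    z   : y - x; a positive value indicates an impulse from played to rho
    Assumes the player has at least two strategies.
    """
    x = payoff(played, others)
    alternatives = [s for s in own_strategies if s != played]
    rho = max(alternatives, key=lambda s: payoff(s, others))
    y = payoff(rho, others)
    return x, y, y - x, rho


# Hypothetical 2x2 payoffs for the row player (illustrative numbers only).
PAYOFFS = {("U", "L"): 3, ("U", "R"): 0, ("D", "L"): 1, ("D", "R"): 2}


def h(own, other):
    return PAYOFFS[(own, other)]


x, y, z, rho = best_alternative_impulse(h, ["U", "D"], played="U", others="R")
has_impulse = z > 0  # True here: switching from U to D would have paid 2 more
```

In this toy example the player obtained 0 with U against R, could have obtained 2 with D, and so receives an impulse of strength 2 (in untransformed payoffs) from U to D.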

Let hi(πi, π-i) be player i’s period payoff if he plays πi, where π-i stands for the combination of the period strategies of all other players. Then

(1) si = max(πi∈Πi) min(π-i∈Π-i) hi(πi, π-i)

is player i’s pure strategy maximin, also referred to as player i’s security level si.
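As a small illustration of equation (1), the security level can be computed from a payoff table by taking the worst case of each own strategy and then the best of those worst cases. The sketch below assumes a finite game given as a matrix whose rows are player i's strategies and whose columns enumerate the strategy combinations of all other players; the function name `security_level` and the numbers are hypothetical.

```python
def security_level(payoff_matrix):
    """Pure-strategy max-min payoff of the row player.

    payoff_matrix[r][c] is the row player's payoff when he plays row r
    and the other players' strategy combination is column c.
    """
    # Inner min: worst payoff each row can yield; outer max: best such row.
    return max(min(row) for row in payoff_matrix)


# Hypothetical 2x2 example: row minima are 0 and 1, so the security level is 1.
example = [
    [3, 0],
    [1, 2],
]
s = security_level(example)
```

Here the second row guarantees at least 1 whatever the others do, while the first row can be pushed down to 0.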

Πi is the set of all possible πi and Π-i is the set of all possible π-i. Player i’s payoff cannot be reduced to a level below si by the behavior of the other players, even if they know which strategy πi player i is going to play. Therefore si is a natural benchmark for the distinction between losses and gains. Any payoff xi with xi < si involves a loss li = si - xi, and any payoff xi > si is connected to a gain gi = xi - si. Define:

|μ-η|+=max[μ-η,0]

|μ-η|-=min[μ-η,0]
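The two truncation operators, and the gain and loss relative to the security level, can be written out as follows. This is a minimal sketch: the function names are hypothetical, and expressing gain and loss through these operators is an assumption made for illustration; it does not reproduce the transformed payoff formula of the paper.

```python
def pos_part(mu, eta):
    """|mu - eta|+ : the difference if it is positive, otherwise 0."""
    return max(mu - eta, 0)


def neg_part(mu, eta):
    """|mu - eta|- : the difference if it is negative, otherwise 0."""
    return min(mu - eta, 0)


def gain(x, s):
    """Gain g = x - s when payoff x exceeds security level s, else 0."""
    return pos_part(x, s)


def loss(x, s):
    """Loss l = s - x when payoff x falls short of security level s, else 0."""
    return -neg_part(x, s)


# With a security level of 1: a payoff of 3 is a gain of 2, a payoff of 0 a loss of 1.
g_high, l_high = gain(3, 1), loss(3, 1)
g_low, l_low = gain(0, 1), loss(0, 1)
```

Note that the two parts always add up to the plain difference, pos_part(μ, η) + neg_part(μ, η) = μ - η.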

The success measure connected to the theories covered by the paper is the transformed payoff:


The strength of an impulse from πi to ρi is measured by the difference between the transformed payoffs of ρi and πi. In the paper by Chmura and myself (2008) the concept of impulse balance was defined for 2x2 games only. In this case the difference zi = yi - xi in terms of the untransformed payoff is always positive. However, in the case of 3 players we cannot avoid a definition based on the transformed payoff.
