# Reinhard Selten

**Reaching Good Decisions without Utility and Probability Judgements - The Approach of Bounded Rationality**

Category: Lectures

Date: 26 August 2011

Duration: 34 min


We give an introduction to the notion of Bounded Rationality. In accordance with experimental evidence, Bounded Rationality is a theory of decision-making not reliant on the decision-maker’s capability to form consistent probability and utility judgments.

I am going to talk not about the great problems of the world, but about a great problem within our ideas about decision making: reaching good decisions without utility and probability judgements. We present some theories and some experimental evidence. Let me first say something about why we need a theory of Bounded Rationality. First of all, Bounded Rationality is understood here in the tradition of H. A. Simon, who began to write his path-breaking papers in 1954. Bayesian Decision Theory, which is now the mainstream idea about decision making, is based on consistency axioms. These consistency axioms cannot be satisfied: checking the consistency of a body of preference judgements or probability judgements is an NP-complete task. So it cannot be done in polynomial time and that means that human beings are unable to be Bayesian decision makers. There are also other objections against Bayesian Decision Theory; it’s an idealised picture of rationality. Robin Pope and myself have written about this, but I don’t want to go into this here. In view of the impossibility of checking consistency assumptions, it is necessary to work towards a new theory of Bounded Rationality, describing the logic of non-optimising behaviour. So in my view, Bounded Rationality is about non-optimal procedures of finding decisions. Our current theoretical and experimental work aims at this goal. The principles of Bounded Rationality are search for alternatives, satisficing and aspiration adaptation. Aspiration adaptation theory is a general model of non-optimising behaviour based on these principles. There are several pieces of literature which refer to that, here are three. The oldest literature on aspiration adaptation theory is a paper by Sauermann and myself in German. Since then some additions have been made to this theory, but the basic structure is still there.
One can read about this in the Journal of Mathematical Psychology, or also in a chapter of the book on Bounded Rationality edited by Gigerenzer and myself. Now I want to go into the basic ideas of aspiration adaptation theory. I will only give an example, I will not give the exact definitions here. First of all, we have two goal variables. In aspiration adaptation theory, people do not follow one goal, there is no utility function, but there are several goal variables which are not really comparable, so you cannot really substitute between them, and here we have two such goals. For example, profit here and maybe market share here. And then, each of these goal variables has a scale. An aspiration scale consists of certain discrete values of the goal variable, at which what I call partial aspiration levels for this goal variable can be fixed. And then we have here this aspiration grid. So what if a decision maker has these goals? He has to fix an aspiration level for the two goal variables, I mean a vector of partial aspiration levels. On this drawing you also see in grey the feasibility set. The feasibility set is the set of those aspiration levels which are feasible, but this is not known in the beginning, it’s not a priori known to the decision maker, but he can find out whether something is feasible and also whether it is improvable in a certain sense, which I will still explain. Now suppose we begin here with a start. An aspiration start which is feasible. And now the decision maker is supposed to have an urgency order. Here it’s very simple: one variable is urgent and the other is less urgent. But if you had more, say five goal variables, then it would have to be an ordering of the goal variables. And here, the green arrow means that this is the urgent direction.
He does not know the feasibility set in advance, but he can find out whether something is feasible. This is assumed here. And so the decision maker goes to this aspiration level, and here this direction becomes urgent. So he finds out that this is feasible and then he comes to this aspiration level. He finds out that this is also feasible, the most urgent direction goes here, he goes to this aspiration level. He finds out it is not feasible, so he tries the second urgent direction, the less urgent direction, and comes to this point. From here, it doesn’t matter what is urgent: he tries first his most urgent goal variable, then his second most urgent, and finds out that in both cases an improvement is not possible, so he has reached a point which is feasible but not improvable. And then the process stops and the plan which is associated with this aspiration level, there is always a plan associated with it, is executed. Here is something like an efficiency frontier, and if somebody had a utility function for these goal variables, he would have to check whether this is better than that. No such attempt is made here: as soon as you have reached something which is feasible but not improvable, you stop there. And also here you begin at a start, which is maybe taken over from the last period. The aspiration start is outside the feasibility set and you have to go down, retreat in the direction of a retreat variable, and to every aspiration level there belongs a retreat variable, this is this direction here, and then you come to an aspiration level which is also not feasible, so you have to go down further. And then you still go down until you reach a feasible one here. And from there you can move upward again, so you move first in the most urgent direction. Here the most urgent direction turns out not to be feasible. And then you go in this direction and here you come again to a point which cannot be improved in any direction, and suppose it stops there.
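The walk across the aspiration grid described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the exact definitions of the theory: aspiration levels are tuples of integer positions on each scale, the urgency order and the retreat variable are held fixed instead of depending on the current aspiration level, and the feasibility check is a made-up inequality standing in for the grey set on the drawing. All names (`adapt_aspiration`, `feasible`, `top`) are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Level = Tuple[int, ...]  # one position per goal variable on its aspiration scale


def step(level: Level, variable: int, delta: int) -> Level:
    """Return a copy of `level` with one partial aspiration level moved by `delta`."""
    moved = list(level)
    moved[variable] += delta
    return tuple(moved)


def adapt_aspiration(
    start: Level,
    is_feasible: Callable[[Level], bool],
    urgency_order: Sequence[int],
    retreat_variable: int,
    top: Level,
) -> Level:
    """Return an aspiration level that is feasible but not improvable.

    Assumption: urgency order and retreat variable are the same at every
    aspiration level, which is a simplification made for this sketch.
    """
    level = tuple(start)

    # Downward adaptation: retreat one scale step at a time until feasible.
    while not is_feasible(level):
        if level[retreat_variable] == 0:
            raise ValueError("no feasible level along the retreat variable")
        level = step(level, retreat_variable, -1)

    # Upward adaptation: raise exactly one partial aspiration level per step,
    # trying the most urgent goal variable first.
    while True:
        for variable in urgency_order:
            if level[variable] < top[variable]:
                candidate = step(level, variable, +1)
                if is_feasible(candidate):
                    level = candidate
                    break
        else:
            # No direction is feasible: stop and execute the associated plan.
            return level


def feasible(level: Level) -> bool:
    """Made-up feasibility set for (profit, market share) scale positions."""
    profit, share = level
    return 2 * profit + share <= 7


top = (5, 5)  # highest position on each scale (assumed)
from_inside = adapt_aspiration((1, 0), feasible, [0, 1], retreat_variable=1, top=top)
from_outside = adapt_aspiration((3, 4), feasible, [0, 1], retreat_variable=0, top=top)
print(from_inside, from_outside)
```

The two calls mirror the two cases in the talk: a start inside the feasibility set, which is only adapted upward, and a start outside it, which first retreats. No comparison along the efficiency frontier is made; the procedure simply stops at the first level that cannot be raised in any direction.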
So that is a reasonable way in which a decision can be reached, but for a long time we could never really show that this is a significant result, because you have to create a situation in which there are several goals and in which you can make these goals explicit. I mean, you have to give people an opportunity to make their goals explicit. And we have now found such a basic experimental design. A participant faces a dynamic decision problem, this is common to our designs. The payoff base is the sum of all period profits; payoff base means that the payoff is a show-up fee plus something which is proportional to the payoff base in money. And then the experimental software, there is software which comes with this experiment, enables the participant to select a goal system, this is a system of goal variables. These are short term feedback variables. In our example I said profit for the current period or maybe market share at the end of the current period. So the goal variables are short term feedback variables, which can be observed after the period is over. And he will select some such variables, or he can also form arithmetic combinations of them. I mean, aggregate functions of several goal variables, and he will select them in such a way that he can control all the variables which are thought to be essential for long run success. He wants to have a high amount of profit over all periods, but he doesn’t know how to go about it because he gets only qualitative information. I will come to what the nature of this information is. And even if he got quantitative information, he would not be able to handle it.
And he can also specify an aspiration level, a vector of satisfactory values, one for each goal variable. He has to do it eventually, but before he does something he can always change his system of goal variables, and if he has an aspiration level, he can change the aspiration level and so on. So the software leaves him much freedom to deviate from the theory of aspiration adaptation. The first experimental context, and I talked about this here three years ago, is a dynamic monopoly task with fifty periods. Only qualitative information, direction of influences and presence of delays, is given, and the basic short term feedback variables which are available are, for example, profit, quality, brand awareness, demand potential, total cost, unit cost and so on. There is quite a number of these variables, and once the subject has selected some goal variables and fixed some aspiration level for this goal system, the computer will give him information about the feasibility. I mean, the computer knows what is feasible, so the idea is that this represents a situation in which the top management asks its departments whether they can do it, but in doing this, they do not really fix the instruments. I mean, obviously in many firms prices are not fixed by the top management. It is lower level managers who fix prices and so on. Top management may just decide that profits in sales should have a certain level, and this would be one of the goal variables, and then they will tell each department what they have to achieve in this way. And there is correct computer information about feasibility and improvability. And this work is already published, we have here a paper by myself, Pittnauer and Hohnisch in 2011 in the Journal of Behavioural Decision Making. And then we have a second experimental context, and this is a decision environment with uncertainty.
The subjects represent a financial institution in an environment with rare but strong crises. In crisis periods default rates of investments are much higher, with a possibility of bankruptcy for the financial institution. Knowledge about the environment is limited. Default rates are unknown a priori. Participants are only told that with a very high probability they will encounter a crisis period at least once. They can get experience about the default rates in normal periods. But they know nothing about the default rates in the crisis before the crisis period happens. And the short term feedback variables here are: profits given a default rate scenario, since of course you cannot have information about profits here because you don’t know what the default rates are. But you can have profits on the condition that the default rates will be at most at certain values. In this experiment there were two kinds of investment, and they could also borrow money and put it into cash in order to avoid insolvency. Then debt-over-equity, and maximum default rates sustaining solvency. Maximum default rates sustaining solvency, it’s the pair of default rates for the two investments, is a risk related variable. Higher aspiration levels imply higher security. No probability judgements need to be made in order to decide upon aspiration levels on these variables, rather one has to categorise the events in two classes: those to be considered and others. And it’s much easier to make a judgement on such a maximum level, with which you have to calculate, than about probability distributions. I now want to talk about what we found out about the validity of aspiration adaptation theory. The percentage of period final goal systems consisting of multiple variables was 92% in the monopoly task and 91% in the investment context.
It means that the number of one-goal systems was very low; we see that mostly several goals are selected. And in these cases, when several goals are selected, these three hypotheses become applicable, which I now show here. The first one is that if the aspiration level is feasible and improvable, then exactly one partial aspiration level is adapted upward. Not several at once, exactly one. And if it is not feasible, then exactly one partial aspiration level is adapted downward, which follows what I have explained. And if the aspiration level is feasible and not improvable, then the transition to the next period takes place. It means that this plan will be executed. And here we have a table which shows the results. We have here the hit rates, which are quite high in both cases, but we have to subtract the mean area. I mean, this is the proportion of predicted possibilities within all possibilities of change; this has to be subtracted, because otherwise one can of course get a very high hit rate. One can get a hit rate of 1 simply by predicting an area comprising all possibilities. Now, if we do this, then we come to the mean predictive success, which is quite high here. Compared to the predictive success in other experimental studies, this is quite high. So this is really a very good confirmation of these three hypotheses, and that shows that some basic hypotheses of aspiration adaptation theory are validated. Here we define something which is an adequate goal system. An adequate goal system allows one to control all the important aspects of the situation.
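As an aside on the measure mentioned above, the three hypotheses and the predictive success (hit rate minus mean area) can be written down as a small Python sketch. The function names and the example numbers are purely illustrative assumptions and are not data from the experiments; each case is taken to be a pair of a hit flag and the area, that is, the share of all possible changes that the prediction allowed.

```python
from statistics import mean
from typing import Sequence, Tuple


def predicted_adaptation(feasible: bool, improvable: bool) -> str:
    """Prediction of the three hypotheses for one aspiration level."""
    if not feasible:
        return "lower exactly one partial aspiration level"
    if improvable:
        return "raise exactly one partial aspiration level"
    return "execute the plan and move to the next period"


def predictive_success(cases: Sequence[Tuple[bool, float]]) -> float:
    """Hit rate minus mean area over (hit, area) pairs.

    `hit` says whether the observed change was among the predicted ones;
    `area` is the proportion of predicted possibilities within all
    possibilities of change for that case (between 0 and 1).
    """
    if not cases:
        raise ValueError("at least one case is needed")
    hit_rate = mean(1.0 if hit else 0.0 for hit, _ in cases)
    mean_area = mean(area for _, area in cases)
    return hit_rate - mean_area


# Made-up cases for illustration only.
illustrative = [(True, 0.25), (True, 0.25), (False, 0.5), (True, 0.2)]
print(predictive_success(illustrative))

# Predicting every possibility always hits, but earns no predictive success.
predict_everything = [(True, 1.0), (True, 1.0), (True, 1.0)]
print(predictive_success(predict_everything))
```

Subtracting the area is what keeps a theory from scoring well merely by predicting everything: the second example reaches a hit rate of 1 and still ends up with zero predictive success.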
For example here, this refers to the monopoly task, there are three things which have to be controlled. One is the profit, of course you have to make a profit in the short run if you want to make one in the long run. But then you also have to control investment in quality, because if you don’t keep up high quality, if quality gets very low, then you may make a high profit in this period because you neglect investment in quality. But it will be very bad later, if your customers experience bad quality. And similarly you have to keep track of the brand awareness, and for this purpose you have to invest in advertising. Advertising, like investment in quality, has an immediate effect but also a long run effect. And here you see that a goal system is adequate if it allows one to control these three variables. This is causally adequate. And here you have the fraction of periods with causal structure induced goal systems, and here you have the relative total profit. And you see that the relative total profit tends to be higher if this fraction of periods with causal structure induced goal systems becomes higher. So the highest profit was more than 94% of the profit which can be achieved in this model if you apply dynamic programming to the exact functions in this model. And the participant who got this had a causal structure induced goal system in all periods. And this worst one here had it in very few periods, as you see. Or in zero periods, so we have here a clear tendency in this direction. Now, here you see that in the financial institution context, those people who ended up solvent, in comparison to those who ended up insolvent, had a higher predictive success with respect to the three hypotheses which I had. And then there is here the last graph I am showing, and this is also for the investment task.
We have two conditions here. One with the goal variables to be fixed and so on, this is the red triangles here, and another condition, which we used as a control, where we just let them fix the instruments. We didn’t give them any facility to define goals, so they just had to fix instruments. And we see that this is the instrument-based condition, and what we have here is the risk exposure, which is defined for an individual as the expected value of its profits if a crisis occurs. What we have here in the graph is the average risk exposure in these conditions. Now, you see that this begins very high and goes down and down, and under the goal fixation it stays more level. And the risk exposure is not increasing as much as in the instrument case. In the instrument case, people learn in an intuitive way. I mean, this is intuitive learning, and then, if nothing happens, they take more risk. They don’t realise that they haven’t learned anything about the risk, but if they have to fix goals, then they realise that they have learned nothing. I mean, they didn’t get any new information about risky periods, and that gives them a better performance here. So I now give a short summary: since consistent utility and probability judgements are impossible to make, Bounded Rationality posits that achieving long term success requires identifying short term goal variables that need to be controlled to attain the long run goal, and giving attention to these short term goal variables in a coherent way over time, as prescribed by aspiration adaptation theory. Decisions based on these principles of bounded rationality can be implemented by human beings and are reasonable. They lead to better results than impulsive decisions based on recent feedback only. So thank you very much.