Do you think you make all your decisions rationally? Imagine, for example, that you have a choice between different fruits. You probably have a pre-established order of preferences and you will make your choice according to this order. If you prefer apple to banana, and banana to cherry, chances are you prefer apple to cherry. But is the same true for your economic choices? Can your experience influence your decisions when it comes to money? Stefano Palminteri, a researcher at the LNC2, and the Human Reinforcement Learning team have been looking into these questions. This research was published on April 2, 2021 in the journal Science Advances.
The researchers developed an experimental protocol in which nearly a thousand participants had to repeatedly choose between abstract symbols implicitly associated with different monetary gains. So that participants could learn the value associated with each symbol, the symbols were arranged in fixed pairs, and these pairs constituted several learning contexts.
In a first context, one of the symbols yielded an average of 7.5 points while the other yielded an average of 2.5 points. In a second context, the monetary values associated with the symbols were ten times smaller, i.e. 0.75 points and 0.25 points. The points accumulated during the experiment were then converted into real money, which gave participants a reason to learn, by trial and error, to choose the symbol associated with the biggest win. As a result, at the end of the learning phase, participants had developed a rational preference for the symbol that yielded the most money in each context.
In a second phase of the experiment, the research team asked the same participants to choose between the symbol yielding 2.5 points and the one yielding 0.75 points. Contrary to the predictions of the standard model of rational choice, the participants showed, on average, a strong preference for the symbol worth 0.75 points, even though it is about 3 times less advantageous. Moreover, the authors showed that this irrational preference was more pronounced the easier the learning phase was. This experiment highlighted a counter-intuitive result: the better we learn in each context, the more wrong we are when we have to generalize.
To explain these irrational decisions, the authors proposed a mathematical model of learning that normalizes economic values to the context in which they are learned. Thus, when we establish our order of preference, it will depend on the other options we have at our disposal. In other words, the economic values of symbols are learned relative to the context in which they are found. This is why a symbol yielding 2.5 points, which was compared to a symbol yielding 7.5 points, is perceived as less desirable than a symbol that yields 3 times less but was previously perceived as the most advantageous in its context. In fact, a symbol becomes perceived as more of a "winner" or more of a "loser" depending on the context in which it was encountered, regardless of its actual value.
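The effect of context-dependent normalization can be illustrated with a minimal sketch. It is not the authors' model: it assumes a simple range normalization applied directly to the average payoffs quoted above, whereas the published model learns values trial by trial. The symbol labels A to D and the function names `range_normalize` and `prefer` are hypothetical, introduced only for this illustration.

```python
def range_normalize(values):
    """Rescale each value to [0, 1] relative to the min and max of its own context."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Degenerate context with a single value: assign an arbitrary midpoint.
        return {name: 0.5 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


def prefer(x, y, table):
    """Return the symbol with the higher value in `table` (y is returned on a tie)."""
    return x if table[x] > table[y] else y


# Average payoffs from the two learning contexts described in the article.
# The labels A-D are hypothetical names for the abstract symbols.
contexts = {
    "high": {"A": 7.5, "B": 2.5},
    "low": {"C": 0.75, "D": 0.25},
}

# Absolute values: what a standard rational chooser would compare.
absolute = {name: v for symbols in contexts.values() for name, v in symbols.items()}

# Relative values: each symbol is rescaled within the context where it was learned.
relative = {}
for symbols in contexts.values():
    relative.update(range_normalize(symbols))

print("absolute choice:", prefer("B", "C", absolute))
print("relative choice:", prefer("B", "C", relative))
```

Under these assumptions, the symbol worth 2.5 points sits at the bottom of its context and the symbol worth 0.75 points sits at the top of its own, so a chooser comparing relative values picks the 0.75-point symbol, while a chooser comparing absolute values picks the 2.5-point symbol.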
So what is the benefit of learning to make the most advantageous choices, if it leads us to make mistakes when decisions are made out of context? To answer this question, the authors emphasize that the brain seeks to perform as well as possible on the task at hand. Thus, taking the shortcut of labeling options as "winners" or "losers" allows us to increase our performance and make better decisions, more quickly. Since the brain does not know what the next task will be, the gain induced by this process outweighs the cost incurred if a generalization task later arises. Is this process also responsible for irrational decisions in everyday situations? Can the psychological processes highlighted by this study be useful for artificial intelligence research? These are the questions that Stefano Palminteri and his team are now trying to answer.
Sophie Bavard, Aldo Rustichini, Stefano Palminteri (2021). Two sides of the same coin: Beneficial and detrimental consequences of range adaptation in human reinforcement learning. Science Advances, 7, 14, eabe0340, 10.1126/sciadv.abe0340.