Journal of the European Economic Association, Volume Advance Article – May 22, 2018

55 pages


- Publisher
- Oxford University Press
- Copyright
- © The Author(s) 2018. Published by Oxford University Press on behalf of European Economic Association.
- ISSN
- 1542-4766
- eISSN
- 1542-4774
- D.O.I.
- 10.1093/jeea/jvy018

Abstract

We uncover heterogeneity in social preferences with a structural model that accounts for outcome-based and reciprocity-based social preferences and assigns individuals to endogenously determined preference types. We find that neither at the aggregate level nor when we allow for several distinct preference types do purely selfish types emerge, suggesting that other-regarding preferences are the rule and not the exception. There are three temporally stable other-regarding types. When ahead, all types value others’ payoffs more than when behind. The first, strongly altruistic type puts a large weight on others’ payoffs even when behind and displays moderate levels of reciprocity. The second, moderately altruistic type also puts positive weight on others’ payoff, yet at a lower level, and displays no positive reciprocity. The third, behindness averse type puts a large negative weight on others’ payoffs when behind and is selfish otherwise. In addition, we show that individual-specific estimates of preferences offer only very modest improvements in out-of-sample predictions compared to our three-type model. Thus, a parsimonious model with three types captures the bulk of the information about subjects’ social preferences.

1. Introduction

A large body of evidence suggests that social preferences can play an important role in economic and social life.1 It is thus key to understand the motivational sources and the distribution of social preferences in a population, and to capture the prevailing preference heterogeneity in a parsimonious way. Parsimony is important because in applied contexts tractability constraints typically impose serious limits on the degree of complexity that theories can afford at the individual level.
At the same time, however, favoring the most extreme form of parsimony—by relying on the assumption of a representative agent—is particularly problematic in the realm of social preferences because even minorities with particular social preferences may play an important role in strategic interactions. The reason is that social preferences are often associated with behaviors that change the incentives even for those who do not have those preferences.2 This means that even if only a minority has social preferences they can play a disproportionately large role for aggregate outcomes. Thus, we need to be able to capture the relevant components of social preference heterogeneity while still maintaining parsimony and tractability. It is our objective in this paper to make an important step in this direction. For this purpose we use a structural model of social preferences that is capable of capturing both preferences for the distribution of payoffs between the players and preferences for reciprocity. These types of social preferences have played a key role in the development of this subject over the last 15–20 years and their relative quantitative importance is still widely debated (Fehr and Schmidt 1999; Bolton and Ockenfels 2000; Charness and Rabin 2002; Falk et al. 2008; Engelmann and Strobel 2010). However, in the absence of an empirically estimated structural model it seems difficult to make progress on such questions. Therefore, we implement an experimental design that enables us to simultaneously estimate distribution-related preference parameters and the parameters related to (positive and negative) reciprocity preferences. Similar to Andreoni and Miller (2002) as well as Fisman et al. (2007), our design involves the choice between different payoff allocations on a budget line. 
The size of the parameters estimated from a sequence of binary choices then informs us about the relative importance of different preference components.3 However, most importantly, our experiment provides a rich data set that allows us to characterize the distribution of social preferences in our study population of 174 Swiss university students at three different levels: (i) the representative agent level, (ii) the intermediate level of a small number of distinct preference types, and (iii) the individual level. From the viewpoint of achieving a compromise between tractability and parsimony, and the goal of capturing the distinct qualitative properties of important minority types the intermediate level is most interesting. We approach this level by applying finite mixture models that endogenously identify different types of preferences in the population without requiring any prespecifying assumptions about the existence and the preference properties of particular types. This means, for example, that we do not have to assume, say, a selfish or a reciprocal type of individuals. Rather, the data themselves “decide” which preference types exist and how preferences for the distribution of payoffs and for reciprocity are combined in the various types. Taken together, our finite mixture approach enables us to simultaneously identify (i) the preference characteristics of each type, (ii) the relative share of each preference type in the population, and (iii) the (probabilistic) classification of each subject to one of the preference types. The third aspect has the nice implication that our finite mixture approach provides us with the opportunity to make out-of-sample predictions at the individual level without the need to estimate each individual's utility function separately. Which preference types do our finite mixture estimates yield? We find that a model with three types best characterizes the distribution of preferences at the intermediate level. 
A model with three types is best in the sense that it produces the most unambiguous classification of subjects into the different preference types.4 Moreover, this classification and the preference properties of the three types are temporally stable. In contrast, models with two or four types produce a more ambiguous classification of subjects into types and are associated with severe instabilities of the preference properties of the various types across time. At the substantive level, what are the preference properties of the different types and how large are their shares in our study population? The preferences of the three types are best described as (i) strongly altruistic, (ii) moderately altruistic, and (iii) behindness averse. Interestingly, all three types show some other-regarding behaviors, that is, purely selfish types do not emerge. This nonexistence of a purely selfish type is not an artefact of our methods because we can show with Monte Carlo simulations (see the Online Supplementary Data) that our finite mixture approach would identify the selfish type if it existed. In addition, all three types weigh the payoff of others significantly more in the domain of advantageous inequality (i.e., when ahead) than in the domain of disadvantageous inequality (i.e., when behind), and for all of them the preference parameters that capture preferences for the distribution of payoffs are generally quantitatively more important than preferences for reciprocity. The strong altruists, which comprise roughly 40% of our subject pool, put a relatively large positive weight on others’ payoffs regardless of whether they are ahead or behind. In terms of willingness to pay to increase the other player's payoff by $1, the strong altruists are on average willing to spend 86 cents when ahead and 19 cents when behind. 
In addition, they also display moderate levels of positive and small levels of negative reciprocity, that is, for them negative reciprocity is the weaker motivational force than positive reciprocity. The moderate altruists, which comprise roughly 50% of our subject pool, put a significantly lower, yet still positive weight on others’ payoffs. They display no positive but a small and significant level of negative reciprocity. A moderate altruist is on average willing to pay 15 cents to increase the other player's payoff by $1 when ahead and 7 cents when behind. It may be tempting to treat this low-cost altruism as unimportant. We believe, however, that this would be a mistake because social life is full of situations in which people can help others at low cost. Many may, for example, be willing to give directions to a stranger and help a colleague, both of which are associated with small time cost, or donate some money to the victims of a hurricane although they may not be willing to engage in high-cost altruism. Finally, the behindness averse type comprises roughly 10% of the subject pool and is characterized by a relatively large willingness to reduce others’ income when behind—spending 78 cents to achieve an income reduction by $1—but no significant willingness to increase others’ income when ahead or when treated kindly. As mentioned previously, one remarkable feature of our finite mixture estimates is that no purely selfish type emerges, suggesting that other-regarding preferences are the rule not the exception. This conclusion is also suggested by the preference estimates for the representative agent, which are characterized by intermediate levels of altruism—in between the strong and the moderately altruistic types. 
The absence of an independent selfish type does of course not mean that there are no circumstances, such as certain kinds of competitive markets, in which the assumption of self-interested behavior may well be justified.5 However, it means that if one makes this assumption in a particular context there is a need to justify it, because many people may behave selfishly in these contexts not because they are selfish but because the institutional environment makes other-regarding behavior impossible or too costly.

Our preference estimates for the representative agent model reinforce the conclusion regarding the relative importance of distributional versus reciprocity preferences. For the representative agent distributional preferences are considerably more important than reciprocity preferences. In the absence of any kindness or hostility between the players the representative agent is, for example, willing to spend 33 cents to increase the other player's payoff by $1 when ahead. If, in addition, the other player has previously been kind, the representative agent's willingness to pay increases to 50 cents at most. Moreover, preferences for negative reciprocity do not seem substantially stronger than preferences for positive reciprocity.6 Relying on the preference estimates of the representative agent may, however, be seriously misleading because according to these estimates behindness averse behaviors can only occur as a random (utility) mistake, whereas in fact a significant minority of the subject pool (the behindness averse type) has clear preferences for income reductions when behind.

An important aspect of our study is the set of out-of-sample predictions based on the type-classification mentioned previously, because such out-of-sample predictions are among the most stringent tests of a model.
To study the extent to which our type-specific social preference estimates are capable of predicting individual behavior in other games, the subjects also participated in several additional games. In the first class of games they participated as second-movers in a series of ten trust games with varying costs of trustworthiness; in the second class of games they participated in two games in which they could reward and punish the previous behavior of another player. We are particularly interested in the question whether individual predictions based on our type-specific preference estimates are as good as individual predictions based on individual preference estimates. If this were the case, our type-based model would not only capture the major qualitative social preference types in a parsimonious way but there would also be no need to further disaggregate the preference estimates for predictive purposes. The results show indeed that our three-type model achieves this goal. If we predict each individual's behavior in the additional games on the basis of their types’ preferences, we substantially increase the predictive power over a model that just uses demographic and psychological personality variables as predictors. Moreover, despite its parsimony, the predictive power of the type-based model is almost as good as the predictions that are based on estimates of each individual's preferences. Thus, taken together the out-of-sample predictions indicate a remarkable ability of the three-type model to predict individual variation in other games. The predictive exercise also enables further insights into the strengths and the weaknesses of the type-based model. On the positive side, we find that the strength of specific behaviors such as rewarding others for a fair act is in line with the type-based model. The strong altruists reward more than the moderate altruists whereas the behindness averse types do not reward at all. 
Likewise, as predicted by the model, the behindness averse types display a considerably higher willingness to punish unfair actions (when behind) than the strong or moderate altruists. However, we also find patterns that cannot be fully reconciled with the type-based model. In particular, the behindness averse types should never reciprocate trust in the trust games because they do not put a positive value on other's payoff, but in fact we observe that they are trustworthy at moderate cost levels. These findings indicate limits in our model's ability to predict individuals’ behavior out-of-sample. However, these limits also provide potentially useful hints about ways to improve our approach. We discuss this in Section 4.5 of the paper. How does our paper relate to the existing literature? Our paper benefits from the insights of the previous literature on the structural estimation of social preferences at the individual level such as Andreoni and Miller (2002), Bellemare et al. (2008, 2011), and Fisman et al. (2007, 2015). However, in contrast to this literature, the purpose of our paper is to provide a parsimonious classification of individuals to—endogenously determined—preference types and a characterization of the distribution of social preferences in terms of individuals’ assignment to a small number of types. The results of the paper show that basically all individuals are unambiguously assigned to one of three mutually exclusive types and that individual preference estimates do not lead to superior out-of-sample predictions relative to the much more parsimonious three-type model.7 There are also several other differences between these papers and our paper. First, our paper simultaneously identifies outcome-based social preferences and preferences for reciprocity whereas the previously mentioned papers—with the exception of Bellemare et al. (2011)—focus exclusively on outcome-based social preferences. 
Second, in contrast to our assumption of piecewise linearity, the structural models of Andreoni and Miller (2002) and Fisman et al. (2007, 2015) are based on a CES utility function with own and others’ payoff as arguments, which rules out behindness aversion, but has the advantage of enabling the identification of potential nonlinearities in indifference curves.8 Third, and relatedly, the experimental design in Fisman et al. (2007) involves budget lines that allow for interior choices, whereas our study does not allow for interior solutions. The piecewise linear model we use is perhaps more tractable than nonlinear models for the finite mixture estimation with data from our binary choice task, but its applicability to situations that are fundamentally different, for example, allocation decisions from convex choice sets, may be limited. Fourth, the structural model in Bellemare et al. (2008) rules out the existence of altruism and pure selfishness. This assumption in the paper by Bellemare et al. (2008) may explain why they find a large amount of inequality aversion whereas we do not find evidence for a separate type that simultaneously dislikes advantageous and disadvantageous inequality. However, Bellemare et al. (2008) showed that young and highly educated subjects in their representative subject pool display significantly less inequality aversion than other socioeconomic groups. Therefore, the absence of a simultaneous dislike of advantageous and disadvantageous inequality in our data set could be due to the fact that our subject pool consists exclusively of university students. A recent paper by Kerschbamer and Muller (2017) supports this conjecture. This paper is based on a large heterogeneous sample of 3,500 individuals in the German Internet Panel and the nonparametric approach to the elicitation of social preferences developed by Kerschbamer (2015). 
Roughly 2/3 of the subjects in this data set exhibit inequality aversion, that is, a simultaneous dislike of advantageous and disadvantageous inequality. Our paper is also related to the literature that characterizes latent heterogeneity in social preferences using finite mixture models. Previous studies in this literature mainly focus on distributional preferences and typically classify subjects into predefined preference types. For instance, Iriberri and Rey-Biel (2011, 2013) elicit distributional preferences with a series of modified three-option dictator games and apply a finite mixture model to classify subjects into four predefined types. Similarly, studies by Conte and Moffatt (2014), Conte and Levati (2014), and Bardsley and Moffatt (2007) use behavior in public good and fairness games, respectively, to classify subjects into predefined types. Such a priori assumptions may or may not be justified. For example, all of these studies assume the existence of a purely selfish type but as our analysis indicates a purely selfish type may not exist if one allows for sufficiently small costs of other-regarding behaviors. Likewise, often behindness aversion is not a feasible type by assumption and therefore the structural model cannot identify such types. The only study we are aware of that identifies types endogenously instead of predefining them is by Breitmoser (2013). This study relies on existing dictator game data from Andreoni and Miller (2002) and Harrison and Johnson (2006) for testing the relative performance of different preference models with varying error specifications. However, this study as well as the others mentioned in this paragraph do not simultaneously estimate distributional and reciprocity-based preferences, nor do they compare the power of the type-specific and individual estimates in making out-of-sample predictions across games. Finally, our paper also contributes to the literature concerned with the stability of social preferences. 
Most studies in this literature analyze behavioral correlations. For example, Volk et al. (2012) and Carlson et al. (2014) report that contributions to public goods appear to be stable over time in the lab as well as in the field. Moreover, there is evidence that behaviors such as trust (Karlan 2005), charitable giving (Benz and Meier 2008), and contributions to public goods (Laury and Taylor 2008; Fehr and Leibbrandt 2011) seem to be correlated between the lab and field settings. Blanco et al. (2011) study the within-subject stability of inequality aversion across several games in order to understand when and why models of inequality aversion are capable of rationalizing aggregate behavior in games. However, most of these studies do not estimate a structural model of social preferences, which would be necessary for making precise quantitative behavioral predictions. As a consequence, they do not characterize the distribution and the overall characteristics of social preferences in the study population.

The remainder of the paper is organized as follows. Section 2 discusses our behavioral model and describes the experimental design. Section 3 covers our econometric strategy for estimating the behavioral model's parameters at different levels of aggregation. Section 4 presents the results and discusses their stability over time and across games. Finally, Section 5 concludes.

2. Behavioral Model and Experimental Design

2.1. Behavioral Model

To characterize the distribution of social preferences at the aggregate, the type-specific, and the individual level and to make out-of-sample predictions across games, we need a structural model of social preferences. To achieve our goals we apply a two-player social preference model inspired by Fehr and Schmidt (1999) and Charness and Rabin (2002), which we extended to make it also capable of capturing preferences for reciprocity.
In the outcome-based part of the model, player A’s utility,

\begin{equation}U^{\rm{A}} = \left( 1 - \alpha s - \beta r \right) \times \Pi^{\rm{A}} + \left( \alpha s + \beta r \right) \times \Pi^{\rm{B}},\tag{1}\end{equation}

is piecewise linear, where $$\Pi^{\rm{A}}$$ represents player A’s payoff, and $$\Pi^{\rm{B}}$$ indicates player B’s payoff. s = 1 if $$\Pi^{\rm{A}} < \Pi^{\rm{B}}$$, and s = 0 otherwise (disadvantageous inequality); r = 1 if $$\Pi^{\rm{A}} > \Pi^{\rm{B}}$$, and r = 0 otherwise (advantageous inequality).

Depending on the values of α and β, subjects belong to different preference types: A subject whose α and β are both zero is a purely selfish type, because she does not put any weight on the other player's payoff. If α < 0 the subject is behindness averse, as she weights the other's payoff negatively whenever her payoff is smaller than the other's. Analogously, if β > 0 the subject is aheadness averse, since she weights the other's payoff positively whenever her payoff is larger than the other's. Consequently, a subject who is both behindness and aheadness averse with α < 0 < β and −α < β is a difference averse type for whom disadvantageous inequality matters less than advantageous inequality. In case α < 0 < β and −α > β, the subject is difference averse too, but disadvantageous inequality matters more than advantageous inequality; this is the case discussed in Fehr and Schmidt (1999). A subject with α > 0 and β > 0 is an altruistic type, as she always weights the other's payoff positively. In contrast, a subject with α < 0 and β < 0 is a spiteful type, since she puts a negative weight on the other's payoff, regardless of whether she is behind or ahead. Finally, a subject with α > 0 > β exhibits quite implausible preferences, since she weights the other's payoff positively when she is behind, and negatively when she is ahead. We do not expect to observe such preferences in our data.
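
The mapping from (α, β) to preference types can be sketched in a few lines of Python. This is only an illustration of equation (1): the function names `utility_a` and `preference_type` are hypothetical, and the exact comparisons with zero are a simplification, since with estimated parameters one would rely on statistical tests rather than exact equality.

```python
def utility_a(pi_a, pi_b, alpha, beta):
    """Player A's utility under the piecewise linear model of equation (1)."""
    s = 1 if pi_a < pi_b else 0  # A is behind (disadvantageous inequality)
    r = 1 if pi_a > pi_b else 0  # A is ahead (advantageous inequality)
    weight_on_b = alpha * s + beta * r
    return (1 - weight_on_b) * pi_a + weight_on_b * pi_b


def preference_type(alpha, beta):
    """Label a parameter pair with the type names used in the text.

    Hypothetical helper: combinations the text does not name are
    returned as 'not named in the text'.
    """
    if alpha == 0 and beta == 0:
        return "purely selfish"
    if alpha > 0 and beta > 0:
        return "altruistic"
    if alpha < 0 and beta < 0:
        return "spiteful"
    if alpha < 0 < beta:
        if -alpha > beta:
            return "difference averse (behindness matters more)"
        return "difference averse (aheadness matters at least as much)"
    if alpha > 0 > beta:
        return "implausible"
    if alpha < 0 and beta == 0:
        return "behindness averse"
    if alpha == 0 and beta > 0:
        return "aheadness averse"
    return "not named in the text"


# A is behind (100 < 200), so only alpha enters the weight on B's payoff.
print(utility_a(100, 200, alpha=-0.5, beta=0.25))  # 50.0
print(preference_type(-0.5, 0.25))
```

With equal payoffs both indicators are zero, so the utility reduces to A's own payoff regardless of α and β.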

Because we are also interested in the subjects’ willingness to reciprocate kind or unkind acts, we extend model (1) to account for positive and negative reciprocity. The extension is similar to Charness and Rabin (2002), who take only negative reciprocity into account, and Bellemare et al. (2011), who consider both positive and negative reciprocity. Player A’s utility in the extended model is

\begin{equation}U^{\rm{A}} = \left( 1 - \alpha s - \beta r - \gamma q - \delta v \right) \times \Pi^{\rm{A}} + \left( \alpha s + \beta r + \gamma q + \delta v \right) \times \Pi^{\rm{B}},\tag{2}\end{equation}

where q and $$v$$ indicate whether positive or negative reciprocity plays a role. More formally, q = 1 if player B behaved kindly toward A, and q = 0 otherwise (positive reciprocity); $$v$$ = 1 if player B behaved unkindly toward A, and $$v$$ = 0 otherwise (negative reciprocity). A positive value of γ in equation (2) means that player A exhibits a preference for positive reciprocity, that is, a preference for rewarding a kind act of player B by increasing B’s payoff. A negative value of δ represents a preference for negative reciprocity, that is, a preference for punishing an unkind act of player B by decreasing B’s payoff. In sum, the piecewise linear model not only nests major distributional preferences but also quantifies the effects of positive and negative reciprocity.

2.2. Experimental Design

This section describes the experimental design. The experiment consists of two sessions per subject that took place three months apart, one in February and one in May 2010. To test for temporal stability, both sessions included the same set of binary decision situations that allow us to estimate the subjects’ preference parameters. In each binary decision situation, the subjects had to choose one of two payoff allocations between themselves and an anonymous player B.
We implemented two types of such binary decision situations: (i) dictator games for identifying the parameters α and β, and (ii) reciprocity games for identifying γ and δ. In addition to these two types of binary decision situations, the second session in May 2010 comprised a series of trust games plus two reward and punishment games for checking the stability of the estimated preferences across games.

2.2.1. Dictator Games

In each dictator game, a subject in player A’s role can either increase or decrease player B’s payoff by choosing one of two possible payoff allocations, $$X = ( \Pi_X^{\rm{A}}, \Pi_X^{\rm{B}} )$$ or $$Y = ( \Pi_Y^{\rm{A}}, \Pi_Y^{\rm{B}} )$$. To identify the subject's distributional preferences, governed by α and β, we varied the cost of changing the other player's payoff systematically across the dictator games. Figure 1 illustrates the dictator games’ design.9 Each of the three circles represents a set of 13 dictator games in the payoff space. In each of these dictator games, a line connects the two possible payoff allocations, X and Y. The slope of the line therefore represents A’s cost of altering B’s payoff. For example, consider the decision between the two options marked in black: The slope of the line is −1, implying that player A has to give up one point of her own payoff for each point she wants to increase player B’s payoff. Hence, if A chooses the upper-left of these two allocations, we know that A’s α is greater than 0.5, since the marginal utility from increasing B’s payoff, α, needs to exceed the marginal disutility of doing so, 1 − α. If, in contrast, A opts for the lower-right allocation, then A’s α is lower than 0.5. Thus, by systematically varying the costs of changing the other player's payoff across all dictator games (that is, the slope of the line), we can infer A’s marginal rate of substitution between her own and the other player's payoff.
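
The identification argument can be made concrete with a short sketch. Assume both allocations lie on the same side of the 45° line, so that a single weight w (α when behind, β when ahead) applies. Equation (1) can then be written as U = Π^A + w(Π^B − Π^A), and the weight at which A is indifferent follows from setting U(X) = U(Y). The function names and the payoff numbers below are hypothetical illustrations, not the games used in the experiment.

```python
def indifference_weight(x, y):
    """Weight w on B's payoff at which A is indifferent between x and y.

    x and y are (payoff_A, payoff_B) pairs on the same side of the
    45-degree line, so the same weight applies to both allocations.
    Solves x_a + w * (x_b - x_a) == y_a + w * (y_b - y_a) for w.
    """
    (x_a, x_b), (y_a, y_b) = x, y
    denom = (x_b - x_a) - (y_b - y_a)
    if denom == 0:
        raise ValueError("allocations have the same payoff gap; w is not identified")
    return (y_a - x_a) / denom


def weight_from_cost(cost):
    """Weight at which A is just willing to give up `cost` units of her own
    payoff to raise B's payoff by one unit: w = cost / (1 + cost)."""
    if cost < 0:
        raise ValueError("cost must be non-negative")
    return cost / (1.0 + cost)


# Hypothetical game with slope -1 in which A is behind in both allocations.
upper_left = (400, 800)   # A gives up 100 points, B gains 100 points
lower_right = (500, 700)
print(indifference_weight(upper_left, lower_right))  # 0.5
print(weight_from_cost(1.0))                         # 0.5
```

Both calls reproduce the threshold of 0.5 from the slope −1 example: choosing the upper-left allocation reveals a weight above it, choosing the lower-right allocation a weight below it. Under the same formula, a willingness to pay of 86 cents per dollar would correspond to a weight of roughly 0.46, assuming that figure is read as a marginal rate of substitution of this kind.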
Inferring this marginal rate of substitution allows us to directly identify the corresponding parameters of the subjects’ distributional preferences, α and β. The 45° line separates the dictator games in which A’s payoff is always smaller than player B’s from the ones in which A’s payoff is always larger than B’s. Thus, the observed choices in the upper (lower) circle allow us to estimate the value of α (β) in a situation of disadvantageous (advantageous) inequality. The choices in the middle circle contribute to the identification of both α and β, as each of them involves an allocation with disadvantageous inequality as well as an allocation with advantageous inequality. We constructed the dictator games such that the identifiable range of the parameters is between −3 and 1. The bunching of the lines ensures that the estimated preferences yield the highest resolution around parameter values of zero that separate the different preference types. The high resolution around parameter values of zero also implies that our experimental design is particularly well suited for discriminating between purely selfish subjects and subjects who exhibit only moderately strong social preferences. Monte Carlo simulations, summarized in Sections B.1 and B.2 of the Online Supplementary Data, confirm that our experimental design indeed reliably discriminates between purely selfish preferences and moderately strong social preferences.

2.2.2. Reciprocity Games

In addition to the dictator games, each subject played 39 positive and 39 negative reciprocity games. The reciprocity games simply add a kind or unkind prior move by player B to the otherwise unchanged dictator games. In this prior move, B can either implement the allocation $$Z = ( \Pi_Z^{\rm{A}}, \Pi_Z^{\rm{B}} )$$ or let the subject choose between the two allocations $$X = ( \Pi_X^{\rm{A}}, \Pi_X^{\rm{B}} )$$ and $$Y = ( \Pi_Y^{\rm{A}}, \Pi_Y^{\rm{B}} )$$.
Letting the subject choose between X and Y instead of implementing Z is either a kind or an unkind act from the subject's point of view. Hence, if player B decides not to implement Z, the subject may reward or punish B in her subsequent choice between X and Y. In the positive reciprocity games, player A is strictly better off in both allocations X and Y than in allocation Z, whereas B is worse off in at least one of the two allocations X and Y than in allocation Z. Consider the example with X = (1050, 270), Y = (690, 390), and Z = (550, 530). If player B forgoes allocation Z and lets A choose between the allocations X and Y, she acts kindly toward A as she sacrifices some of her own payoff to increase A’s payoff. Thus, if player A has a sufficiently strong preference for positive reciprocity, that is, a positive and sufficiently large γ, she rewards B by choosing allocation Y instead of allocation X. In the negative reciprocity games, player A is strictly worse off in both allocations X and Y than in allocation Z, whereas B is better off in at least one of the two allocations X and Y than in allocation Z. For example, consider the case where X = (450, 1020), Y = (210, 720), and Z = (590, 880). If B does not implement Z and forces A to choose between the allocations X and Y, she acts unkindly toward A as she decreases A’s payoff for sure in exchange for the possibility of increasing her payoff from 880 to 1,020. Hence, if A has a sufficiently strong preference for negative reciprocity, that is, a negative and sufficiently small δ, she punishes B by opting for allocation Y instead of allocation X. We applied the strategy method (Selten 1967) in the reciprocity games to ask the subject how she would behave if player B gives up allocation Z, and forces her to choose between the allocations X and Y. Consequently, any behavioral differences in the choices among X and Y between the dictator games and the corresponding reciprocity games have to be due to reciprocity. 
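
The two numerical examples above can be worked through with equation (2). The sketch below is an illustration under that equation as written; the function names are hypothetical. In the positive reciprocity example A is ahead in both X and Y, so the total weight on B's payoff is β + γ; in the negative reciprocity example A is behind in both, so the total weight is α + δ.

```python
def utility_a(pi_a, pi_b, alpha, beta, gamma, delta, kind=False, unkind=False):
    """Player A's utility under the extended model of equation (2)."""
    s = 1 if pi_a < pi_b else 0  # A is behind
    r = 1 if pi_a > pi_b else 0  # A is ahead
    q = 1 if kind else 0         # B behaved kindly toward A
    v = 1 if unkind else 0       # B behaved unkindly toward A
    weight_on_b = alpha * s + beta * r + gamma * q + delta * v
    return (1 - weight_on_b) * pi_a + weight_on_b * pi_b


def weight_threshold(x, y):
    """Total weight on B's payoff at which U(X) = U(Y), assuming the same
    total weight applies to both allocations."""
    (x_a, x_b), (y_a, y_b) = x, y
    return (y_a - x_a) / ((x_b - x_a) - (y_b - y_a))


# Allocations taken from the two examples in the text.
pos_x, pos_y = (1050, 270), (690, 390)   # positive reciprocity game
neg_x, neg_y = (450, 1020), (210, 720)   # negative reciprocity game

print(weight_threshold(pos_x, pos_y))  # 0.75
print(weight_threshold(neg_x, neg_y))  # -4.0
```

Read this way, A rewards B by choosing Y in the first example only if β + γ is at least 0.75, and punishes B by choosing Y in the second example only if α + δ is at most −4. The same allocation pair played as a dictator game has q = v = 0, so a difference in choices between the two versions isolates γ or δ.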
Based on behavioral differences between the dictator games and the corresponding reciprocity games, we can identify the parameters γ and δ that reflect the subjects’ preferences for positive and negative reciprocity.10

Taken together, we developed a design based on binary decision situations that are cognitively easy to grasp. We systematically vary the payoffs such that we are able to identify the parameters for the subjects’ distributional preferences, α and β. Only small changes are necessary to extend the design such that we are additionally able to identify the reciprocity parameters, γ and δ.

2.3. Implementation in the Lab

As already mentioned, we conducted two experimental sessions per subject that were three months apart. All subjects were recruited at the University of Zurich and the Swiss Federal Institute of Technology Zurich. Two hundred subjects participated in the first session in February 2010 (henceforth denoted session 1) and were exposed to 117 binary decision situations involving a block of 39 dictator games (see Section 2.2.1) and a block of 78 reciprocity games (see Section 2.2.2) as well as a questionnaire eliciting cognitive ability, demographic data, and personality variables (i.e., the Big Five personality dimensions). Out of these 200 subjects, 174 subjects (87%) showed up in the subsequent session that took place in May 2010 (henceforth denoted session 2). In session 2, the subjects again completed the 117 binary decision situations mentioned previously. In addition, they played ten trust games plus the two reward and punishment games that are described in more detail in Section 4.5. We will use the preferences estimated from the dictator and reciprocity games to predict the behavior in the trust games and the reward and punishment games.

The dictator and reciprocity games were presented in blocks and appeared in random order across subjects. In the dictator games, the subjects faced a decision screen on which they had to choose between the two allocations X and Y.
In the reciprocity games, the subjects initially saw allocation Z for a random interval of 3–5 s before they had to indicate their choice between the allocations X and Y.11 In session 1, after the subjects completed all dictator and reciprocity games, we additionally assessed the potential of the reciprocity games for triggering the sensation of having been treated kindly or unkindly by player B. To do so, we asked the subjects to indicate on a 5-point scale how kind or unkind they perceived player B’s action of forgoing allocation Z to be in a sample of 18 reciprocity games. The subjects’ answers, available in Table A.1 in the Appendix, show that the reciprocity games have indeed succeeded in triggering the perception of having been treated kindly or unkindly by the other player B. As payment, each subject received a show-up fee as well as an additional fixed payment for filling out the questionnaire on her personal data. After finishing the session, three of the subject's decisions as player A were randomly drawn for payment, and each of them was randomly matched to the decision of a partner who acted as player B. Both the subject and her randomly matched partner received a payment according to their decisions. The experimental exchange rate was 1 CHF per 100 points displayed on the screen.12 The average payoff was 52.50 CHF in session 1 (std.dev. 7.47 CHF; minimum 33.30 CHF; maximum 74.10 CHF) and 55.74 CHF in session 2 (std.dev. 7.50 CHF; minimum 28.60 CHF; maximum 75.60 CHF). Both sessions lasted roughly 90 min. In session 1 (2), the fraction of female subjects was 52% (53%) and the average age was 21.70 (21.75) years. The subjects received detailed instructions. We examined and ensured their comprehension of the instructions with a control questionnaire. In particular, we individually looked at each subject's answers to the control questionnaire and handed it back in the (very rare) case of miscomprehension. 
Finally, all subjects answered the control questions correctly. They also knew that they played for real money with anonymous human interaction partners and that their decisions were treated in an anonymous way. 3. Econometric Strategy In this section, we first describe the random utility model in general, which we apply for estimating the parameters of the behavioral model. Subsequently, we present three versions of the random utility model that vary in their flexibility in accounting for heterogeneity. 3.1. Random Utility Model To estimate the parameters of the behavioral model, θ = (α, β, γ, δ), we apply McFadden's (1981) random utility model for discrete choices. We assume that player A’s utility from choosing allocation $${X_g} = ( {{{\Pi }}_{Xg}^{\rm{A}}, {{\Pi }}_{Xg}^{\rm{B}}, {r_{Xg}}, {s_{Xg}}, {q_{Xg}}, {v_{Xg}}} )$$ in game g = 1, …, G is given by \begin{equation} {\mathcal{U}^{\rm{A}}} \big( {{X_g};\theta ,\sigma }\big) = {U^{\rm{A}}} \big( {{X_g};\theta } \big) + {\varepsilon _{Xg}}, \end{equation} (3) where UA(Xg; θ) is the deterministic utility of allocation Xg, and ϵXg is a random component representing noise in the utility evaluation. The random component ϵXg follows a type 1 extreme value distribution with scale parameter 1/σ. According to this model player A chooses allocation Xg over allocation Yg if $${\mathcal{U}^{\rm{A}}}( {{X_g}; \theta , \sigma } ) \ge {\mathcal{U}^{\rm{A}}}( {{Y_g}; \theta , \sigma } )$$. 
Since utility has a random component, the probability that player A’s choice in game g, Cg, equals Xg is given by \begin{eqnarray} &=& \Pr \big( {{C_g} = {X_g};\theta ,\sigma ,{X_g},{Y_g}} \big)\nonumber\\ &=& \Pr \big( {{U^{\rm{A}}}\big( {{X_g};\theta } \big) - {U^{\rm{A}}}\big( {{Y_g};\theta } \big) \ge {\varepsilon _{Yg}} - {\varepsilon _{Xg}}} \big)\nonumber\\ &=& \frac{{\exp \big( {\sigma {U^{\rm{A}}}\big( {{X_g};\theta } \big)} \big)}}{{\exp \big( {\sigma {U^{\rm{A}}}\big( {{X_g};\theta } \big)} \big) + \exp \big( {\sigma {U^{\rm{A}}}\big( {{Y_g};\theta } \big)} \big)}}. \end{eqnarray} (4) Note that the parameter σ governs the choice sensitivity toward differences in deterministic utility. If σ is 0 player A chooses each option with the same probability of 50% regardless of its deterministic utility. If σ is arbitrarily large the probability of choosing the option with the higher deterministic utility approaches 1. A subject i’s individual contribution to the conditional density of the model follows directly from the product of the previous probabilities over all G games: \begin{eqnarray} f\big( {\theta ,\sigma ;X,Y,{C_i}} \big) &=& \mathop \prod \limits_{g = 1}^G \Pr {\big( {{C_{ig}} = {X_g};\theta ,\sigma ,{X_g},{Y_g}} \big)^{I( {{C_{ig}} = {X_g}} )}}\nonumber\\ &&\times \,\Pr {\big( {{C_{ig}} = {Y_g};\theta ,\sigma ,{X_g},{Y_g}} \big)^{1 - I( {{C_{ig}} = {X_g}} )}},\end{eqnarray} (5) where the indicator I(Cig = Xg) equals 1 if the subject chooses allocation Xg and 0 otherwise.13 3.2. Aggregate Estimation The first version of the random utility model pools the data and estimates aggregate parameters, (θ, σ), that are representative for all subjects. These aggregate estimates represent the most parsimonious characterization of social preferences. They are useful mainly for comparisons with the existing literature, such as Charness and Rabin (2002) or Engelmann and Strobel (2004). 
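Equations (4) and (5) translate almost directly into code. The sketch below is illustrative only: it takes the deterministic utilities of X and Y as given inputs (the functional form of the utility is defined elsewhere in the paper and is not reproduced here), and the function names are assumptions made for this example. The logit probability is computed in log space, which is algebraically identical to equation (4) but avoids numerical overflow for large σ.

```python
import math
from typing import Sequence


def _softplus(t: float) -> float:
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def log_prob_choice(u_x: float, u_y: float, sigma: float, chose_x: bool) -> float:
    """Log of the choice probability in equation (4) for the observed choice."""
    d = sigma * (u_x - u_y)
    # log P(X) = -log(1 + exp(-d)); log P(Y) = -log(1 + exp(d)).
    return -_softplus(-d) if chose_x else -_softplus(d)


def prob_choose_x(u_x: float, u_y: float, sigma: float) -> float:
    """Equation (4): probability that player A chooses X over Y."""
    return math.exp(log_prob_choice(u_x, u_y, sigma, True))


def log_density(u_x: Sequence[float], u_y: Sequence[float],
                chose_x: Sequence[bool], sigma: float) -> float:
    """Log of equation (5): sum of log choice probabilities over all G games."""
    return sum(log_prob_choice(ux, uy, sigma, cx)
               for ux, uy, cx in zip(u_x, u_y, chose_x))


# With sigma = 0 choices are pure noise; a larger sigma sharpens them.
print(prob_choose_x(700.0, 600.0, 0.0))
print(prob_choose_x(700.0, 600.0, 0.016))
```

In the aggregate model, summing `log_density` over all subjects and maximizing over (θ, σ) would give the pooled estimates; the optimizer itself is omitted here.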
However, since the aggregate estimates completely neglect heterogeneity, they may fit the data only poorly and miss important behavioral regularities that characterize nonnegligible minorities among the subjects. 3.3. Finite Mixture Estimation The second version takes individual heterogeneity into account and estimates finite mixture models. Finite mixture models are flexible enough to take the most important aspects of heterogeneity into account, namely the existence of distinct preference types. At the same time, they remain relatively parsimonious, as they require far fewer parameters than estimations at the individual level. Finite mixture models assume that the population is made up of a finite number of K distinct preference types, each characterized by its own set of parameters, (θk, σk). This assumption of distinctly different preference types implies latent heterogeneity in the data, since each subject belongs to one of the K types, but individual type-membership is not directly observable. Consequently, a given subject i’s likelihood contribution depends on the whole parameter vector of the finite mixture model, Ψ = (θ1, …, θK, σ1, …, σK, π1, …, πK − 1), and corresponds to \begin{equation} \ell\big( {{ {\Psi }};X,Y,{C_i}} \big) = \mathop \sum \limits_{k = 1}^K {\pi _k}f\big( {{\theta _k},{\sigma _k};X,Y,{C_i}} \big).\end{equation} (6) It equals the sum of all type-specific conditional densities, f(θk, σk; X, Y, Ci), weighted by the ex ante probability, πk, that subject i belongs to the corresponding preference type k. Since individual type-membership cannot be observed directly, the unknown probabilities πk are ex ante the same for all subjects and equal to the preference types’ shares in the population. The parameter vector Ψ thus consists of K type-specific sets of parameters reflecting the types’ preferences and choice sensitivities as well as K − 1 parameters reflecting the types’ shares in the population. 
Thus, estimating a finite mixture model results in a parsimonious characterization of the K types by their type-specific preference parameters and their shares in the population.14 Once we have estimated the parameters of the finite mixture model, we can endogenously classify each subject into the preference type that best describes her behavior. Given the fitted parameters, $${\widehat{\Psi }}$$, any subject i’s ex post probabilities of individual type-membership, \begin{equation}{\tau _{ik}} = \frac{{{{\hat{\pi }}_k}f\big( {{{\hat{\theta }}_k},{{\hat{\sigma }}_k};X,Y,{C_i}} \big)}}{{\mathop \sum \nolimits_{m = 1}^K {{\hat{\pi }}_m}f\big( {{{\hat{\theta }}_m},{{\hat{\sigma }}_m};X,Y,{C_i}} \big)}} ,\end{equation} (7) follow from Bayes’ rule. These ex post probabilities of individual type-membership directly yield the preference type the subject most likely stems from. An important aspect of estimating a finite mixture model is to find the appropriate number of preference types K that represents a compromise between flexibility and parsimony. If K is too small, the model lacks the flexibility to cope with the heterogeneity in the data and may disregard minority types. If K is too large, on the other hand, the model is overspecified and tries to capture types that do not exist. Such an overspecified model results in considerable overlap between the estimated preference types and an ambiguous classification of subjects into types. In either case, the stability and predictive power of the model's estimates are likely compromised. Unfortunately, there is no general single best strategy for determining the optimal number of types in a finite mixture model. 
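The mixture likelihood in equation (6) and the posterior type probabilities in equation (7) can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimation code: the type-specific log densities are passed in as inputs, the function names are invented for the example, and the computation uses the log-sum-exp trick because the product in equation (5) over 117 games is far too small to handle safely in levels.

```python
import math
from typing import List, Sequence


def _log_terms(shares: Sequence[float], log_densities: Sequence[float]) -> List[float]:
    """log(pi_k) + log f(theta_k, sigma_k; X, Y, C_i) for each type k."""
    return [math.log(p) + lf for p, lf in zip(shares, log_densities)]


def mixture_log_likelihood(shares: Sequence[float],
                           log_densities: Sequence[float]) -> float:
    """Log of equation (6) for one subject, via log-sum-exp."""
    terms = _log_terms(shares, log_densities)
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def posterior_type_probs(shares: Sequence[float],
                         log_densities: Sequence[float]) -> List[float]:
    """Equation (7): ex post probabilities tau_ik of type-membership."""
    terms = _log_terms(shares, log_densities)
    m = max(terms)
    weights = [math.exp(t - m) for t in terms]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical subject: the shares and log densities are made-up inputs.
shares = [0.4, 0.5, 0.1]
log_f = [-60.0, -75.0, -90.0]
tau = posterior_type_probs(shares, log_f)
assigned_type = max(range(len(tau)), key=lambda k: tau[k])
print(tau, assigned_type)
```

The subject is assigned to the type with the largest posterior probability, which mirrors the classification rule described in the text.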
Due to the nonlinearity of any finite mixture model's likelihood function there exists no statistical test for determining K that exhibits a test statistic with a known distribution (McLachlan and Peel 2000).15 Furthermore, classical model selection criteria, such as the Akaike information criterion (AIC) or the Bayesian information criterion (BIC), are known to perform badly in the context of finite mixture models. The AIC is order inconsistent and therefore tends to overestimate the optimal number of types (Atkinson 1981; Geweke and Meese 1981; Celeux and Soromenho 1996). The BIC is consistent under suitable regularity conditions, but still shows weak performance in simulations when being applied as a tool for determining K (Biernacki et al. 2000). But in any case, the classification of subjects into preference types should be unambiguous in the sense that $$\tau_{ik}$$ is either close to zero or close to 1, and the estimated type-specific parameters should be stable over time. We apply the normalized entropy criterion (NEC) to summarize the ambiguity in the individual classification of subjects into preference types (Celeux and Soromenho 1996; Biernacki et al. 1999). The NEC allows us to select the finite mixture model with K > 1 types that yields the cleanest possible classification of subjects into types relative to its fit. The NEC for K preference types, \begin{equation}{\rm{NEC}} ( K ) = \frac{{E( K )}}{{L( K ) - L( 1 )}} ,\end{equation} (8) is based on the entropy, \begin{equation}E ( K ) = - \mathop \sum \limits_{k = 1}^K \mathop \sum \limits_{i = 1}^N {\tau _{ik}}\ln {\tau _{ik}} \ge 0,\end{equation} (9) normalized by the difference in the log likelihood between the finite mixture model with K types, L(K), and the aggregate model, L(1). The entropy, E(K), quantifies the ambiguity in the ex post probabilities of type-membership, $$\tau_{ik}$$. 
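Equations (8) and (9) are simple enough to compute by hand from a matrix of posterior probabilities. The following sketch is illustrative: the function names and the toy posterior matrices are assumptions made for this example, and the convention 0 · ln 0 = 0 is applied so that perfectly clean classifications yield zero entropy.

```python
import math
from typing import Sequence


def entropy(tau: Sequence[Sequence[float]]) -> float:
    """Equation (9): E(K) = -sum_i sum_k tau_ik * ln(tau_ik), with 0 * ln 0 := 0."""
    return -sum(t * math.log(t) for row in tau for t in row if t > 0.0)


def nec(tau: Sequence[Sequence[float]], loglik_k: float, loglik_1: float) -> float:
    """Equation (8): entropy normalized by the gain in log likelihood over K = 1."""
    gain = loglik_k - loglik_1
    if gain <= 0.0:
        raise ValueError("NEC requires L(K) > L(1); it is undefined for K = 1.")
    return entropy(tau) / gain


# Toy posteriors (rows = subjects, columns = types); values are made up.
clean = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fuzzy = [[1 / 3, 1 / 3, 1 / 3]] * 3

print(entropy(clean))  # unambiguous classification
print(entropy(fuzzy))  # maximally ambiguous classification
print(nec(fuzzy, loglik_k=-100.0, loglik_1=-110.0))
```

For a given improvement in fit, the cleaner classification produces the smaller NEC, which is why the criterion is minimized over K.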
If all $$\tau_{ik}$$ are either close to 1 or close to 0, meaning that each subject is classified unambiguously into exactly one behavioral type, E(K) is close to 0. But if many $$\tau_{ik}$$ are close to 1/K, indicating that many subjects cannot be cleanly assigned to one type, E(K) is large. One disadvantage of the NEC is that it is not defined for K = 1. Hence, the NEC cannot be used to discriminate between the aggregate model with K = 1 and the best performing finite mixture model with K > 1 types. Consequently, we apply the following strategy to determine the optimal number of types in our estimations. First, we begin with the aggregate model and closely inspect its fit to the data. If we find major behavioral regularities that the aggregate model cannot explain, we treat this as an indication of potential heterogeneity and estimate finite mixture models with a varying number of types. An example of such a major behavioral regularity would be if the estimates of the representative agent's preferences imply that she is altruistic, and hence will never reduce other subjects’ payoff, but we nevertheless observe a substantial share of subjects who in fact reduce the other players’ payoff. This would suggest a heterogeneous population with a majority of subjects motivated by altruism and a minority of subjects motivated by, for example, behindness aversion or negative reciprocity. When estimating finite mixture models to take such heterogeneity into account, we opt for the number of preference types K that minimizes the NEC and yields the cleanest segregation of subjects into types relative to the fit of the model. Finally, we examine whether the type-specific estimates of the behavioral parameters θk are stable over time. 3.4. Individual Estimations Finally, the third version of the random utility model estimates the parameters, (θi, σi), separately for each subject. The resulting individual estimates reveal the full extent of behavioral heterogeneity in the data. 
However, they lack parsimony and likely suffer from small sample bias. Furthermore, Monte Carlo simulations indicate that the individual estimates tend to be strongly biased if the errors in subjects’ utilities are serially correlated (see Section B.1 in the Online Supplementary Data). Thus, we expect them to be less stable over time than the aggregate estimates and the finite mixture models’ type-specific estimates. Moreover, a researcher interested in developing a parsimonious theoretical model with different social preference types may find it hard to infer the general behavioral patterns from a plethora of individual estimates. 4. Results A key purpose of our study is to provide a characterization of the distribution of social preferences that is (i) parsimonious, (ii) captures the major qualitative regularities of the data, (iii) displays reasonable levels of stability over time, and (iv) is capable of predicting behavior out-of-sample in other games. To achieve this purpose we proceed as follows. First, we estimate the preference parameters of a representative agent and examine how well these parameters capture the various aspects of our data. Clearly, the representative agent model is the most parsimonious one but—as we will see in what follows—it misses important behavioral regularities that are likely driven by a minority of subjects. Second, we estimate the parameters of a finite mixture model that allows for a small number of types without imposing ex ante restrictions on the qualitative properties of the types. Third, we estimate the preference parameters for each individual separately thus allowing that each individual is its own preference type. We had to exclude 14 of the 174 subjects from the sample because they behaved very inconsistently. These 14 subjects switched several times between the allocations X and Y within a given circle of the experimental design. 
In other words, they reversed their preferences for the other player's payoff several times when the cost of doing so rose monotonically. Consequently, it is not possible to estimate the individual preferences of these 14 subjects. Their estimated choice sensitivity $$\hat{\sigma }$$ is close to 0, indicating an abysmal fit of the empirical model. With $$\hat{\sigma }$$ almost 0, the preference parameters are no longer identified and at least one of their estimates lies outside the identifiable range of − 3 to 1. Hence, we dropped these 14 subjects and report all following results for the remaining 160 subjects.16 4.1. Preferences of the Representative Agent Table 1 presents the parameter estimates $$( {\hat {\theta},\,\hat{\sigma }} )$$ of the aggregate model that are representative for all subjects. The estimates indicate that the distributional preference parameters, α and β, are important for aggregate behavior. In both sessions 1 and 2 the representative agent values the payoff of others positively ($$\hat{\alpha } > 0$$ and $$\hat{\beta } > 0$$) regardless of whether the other player is better or worse off. However, the valuation of the other player's payoff is much higher when ahead than when behind, implying that the representative agent displays asymmetric altruism. More specifically, in session 1 (2), the estimate of α equals 0.083 (0.098) whereas the estimate of β is much bigger and amounts to 0.261 (0.245). Thus, the weight of the other player's payoff is almost three times as high in situations of advantageous inequality as in situations of disadvantageous inequality (z-tests with H0: α = β yield a p-value < 0.001 in both sessions). In terms of the willingness to pay, these numbers imply that the representative agent is willing to pay approximately 33 cents to increase the other player's payoff by $1 when ahead, whereas when behind she is only willing to pay approximately 10.5 cents.17 Table 1. 
Estimated preferences of the representative agent (K = 1) in sessions 1 and 2.

| | Estimates of session 1 | Estimates of session 2 | p-value of z-test with H0: session 1 = session 2 |
|---|---|---|---|
| α: Weight on other's payoff when behind | 0.083*** (0.015) | 0.098*** (0.013) | 0.468 |
| β: Weight on other's payoff when ahead | 0.261*** (0.019) | 0.245*** (0.019) | 0.551 |
| γ: Measure of positive reciprocity | 0.072*** (0.014) | 0.029*** (0.010) | 0.010 |
| δ: Measure of negative reciprocity | –0.042*** (0.011) | –0.043*** (0.008) | 0.918 |
| σ: Choice sensitivity | 0.016*** (0.001) | 0.019*** (0.001) | 0.006 |
| Number of observations | 18,720 | 18,720 | |
| Number of subjects | 160 | 160 | |
| Log likelihood | –5472.31 | –4540.74 | |

Notes: Individual cluster robust standard errors in parentheses. ***Significant at 1%.

The estimates of the reciprocity parameters γ and δ imply that the subjects’ preferences are on average somewhat reciprocal. Kind acts increase the weight of the other player's payoff ($$\hat{\gamma }$$ > 0), whereas unkind acts decrease the weight of the other player's payoff ($$\hat{\delta }$$ < 0). However, the magnitude of the estimated reciprocity parameters is small, suggesting that both positive and negative reciprocity play a less important role than distributional preferences. Moreover, although there seems to be a consensus in the literature that negative reciprocity is more important than positive reciprocity,18 the preference estimates of the representative agent model do not support this. 
In fact, the parameter for positive reciprocity is even higher than the one for negative reciprocity (z-tests with H0: γ = δ yield a p-value < 0.001 in both sessions). In sum, as in Charness and Rabin (2002), the estimates of the aggregate model clearly reject the hypothesis that in our sample the representative agent is exclusively motivated by selfishness. In fact, the representative agent shows a substantial concern for others’ payoff. The last column of Table 1 shows that aggregate behavior is also rather stable over time. The parameter estimates of α, β, and δ are clearly not significantly different between the sessions 1 and 2. Only the estimates for positive reciprocity, $$\hat{\gamma }$$, and the choice sensitivity, $$\hat{\sigma }$$, differ significantly between the two sessions. The significant decline in $$\hat{\gamma }$$ across sessions suggests that positive reciprocity is a more fragile preference component compared to the other components. Note that this instability of the reciprocity parameter cannot be attributed to attrition bias because the estimates in Table 1 are based on the behavior of the same subjects in the two sessions. In addition, we find no evidence for attrition bias. The session 1 estimates in the sample of all subjects who participated in that session are statistically indistinguishable from the session 1 estimates in the subsample of the 160 subjects who participated in both sessions (see Table A.2 in the Appendix). How well do the preference parameters of the representative agent fit the aggregate data, and are there any unexplained behavioral regularities? In Figure 2, the solid lines represent the subjects’ empirical willingness to change the other player's payoff at a given cost in session 1, whereas the dashed lines correspond to their predicted willingness to change the other player's payoff.19 The predictions are based on the random utility model discussed in Section 3.1 and use all dictator and reciprocity games. 
The panels on the left and right show the share of subjects willing to increase and decrease the other player's payoff, respectively. The upper panels describe situations of disadvantageous inequality, whereas the lower panels describe situations of advantageous inequality. At first glance the aggregate model fits the data well, as the empirical and predicted shares of subjects willing to change the other's payoff almost coincide. In particular, the lower-left and upper-left panels show that the share of subjects increasing the other's payoff is higher in situations of advantageous than disadvantageous inequality. For example, at a cost of 0.39, more than 40% of the subjects are willing to increase the other player's payoff when ahead but less than 20% are willing to do so when behind. This is in line with the estimates $$\hat{\beta } > \hat{\alpha }$$ of the aggregate model. There are, however, important behavioral regularities that the representative agent model fails to explain. The right panels of Figure 2 indicate that there exists a minority of subjects who decrease the other player's payoff even at a cost, especially when they are behind. If all individuals had preferences qualitatively similar to those of the representative agent, that is, if all of them had a positive valuation of others’ payoff (α > 0 and β > 0), nobody should decrease the other player's payoff. In fact, however, the right panels show that up to 20% of the subjects decrease the other's payoff. The aggregate model “neglects” these subjects in the sense that it assigns a positive α and a positive β to the representative agent because the share of subjects who increase the other's payoff at a given cost level is larger than the share of subjects who decrease the other's payoff (compare right to left panels). 
However, understanding the behavior of subjects that decrease the other player's payoff can be crucial for predicting aggregate outcomes even if these subjects constitute only a minority. For example, in ultimatum games or public goods games with punishment, even a minority of subjects who are willing to reject unfair offers or punish freeriding can discipline a majority of selfish players and entirely determine the aggregate outcome (Fehr and Schmidt 1999). But the aggregate model absorbs the behavior of these subjects in the random utility component, as it is not flexible enough to take minorities of subjects into account whose preference parameters systematically differ from those of the majority. 4.2. A Parsimonious Model of Preference Types In view of the relevance of the existence of minority types for aggregate outcomes it is important to be able to characterize the heterogeneity of preferences of suitably defined subpopulations. In addition, we need to be able to characterize the preferences of these subgroups because this provides insights into their potential role in social interactions. For example, it is important to know whether a subgroup values the payoffs of others generally negatively—which would define them as spiteful types—or whether they only value the payoffs of others negatively when behind or treated unkindly.20 However, the a priori definition of subgroups or preference types is always associated with some arbitrariness and the danger that the predefined groups or preferences characteristics of the group do not do justice to the data. Therefore, we apply an approach that simultaneously identifies (i) the preference characteristics of each type, (ii) the relative share of each type in the population, and (iii) the assignment of each individual to one of the preference types. The finite mixture approach we use in this section fits this bill. 
To apply this approach we need to specify a priori the number of distinct preference types we consider. To obtain a compromise between flexibility and parsimony, we choose the number of preference types, K, based on the NEC (see Section 3.3). Figure 3 shows the NEC’s value for K = 2, K = 3, and K = 4 preference types. In both sessions, the NEC favors a finite mixture model with K = 3 preference types, which provides the cleanest assignment of subjects to types. This clean assignment of subjects to types is also reflected by the distribution of the individual posterior probabilities of type-membership $$\tau_{ik}$$: almost all of them are either very close to 1 or 0, suggesting that almost all subjects are unambiguously assigned to one of the three preference types (for further details see Section A.7 and Figure A.4 in the Appendix).

Figure 1. The dictator games. Each of the three circles contains 13 binary dictator games. Each game is represented by the two payoff allocations connected by a line. Player A can choose one of the extreme points on the line. For every game, the slope of the line indicates A’s cost of altering player B’s payoff. Allocations above (below) the dashed 45° line help identify the weight player A puts on B’s payoff under disadvantageous (advantageous) inequality.

Figure 2. Representative agent's empirical and predicted willingness to change the other player's payoff across cost levels in session 1. The empirical willingness corresponds to the fraction of subjects that chose to change the other player's payoff in the indicated direction. The predicted willingness corresponds to the predicted probability that the representative agent changes the other player's payoff in the indicated direction. It is based on the random utility model presented in Section 3.1 and uses the estimated aggregate parameters of session 1 on all dictator and reciprocity games.

Figure 3. Normalized entropy criterion (NEC) for different numbers of preference types in sessions 1 and 2. The NEC summarizes the ambiguity in the subjects’ classification into types relative to the finite mixture model's improvement in fit compared to the representative agent model with K = 1 (see equations (8) and (9)). By minimizing the NEC, we can determine the optimal number of preference types K the finite mixture model should take into account.

Furthermore, when judging the appropriateness of the assumed number of types, we also examine in what follows whether qualitatively new types emerge if one increases K or whether an increase in K is just associated with splitting up a given type while maintaining the sign of the various preference parameters. Finally, a further desirable feature when judging the appropriateness of the assumed number of types is that the preference characteristics of the different types should be relatively stable across time. Table 2 reports the results of our finite mixture estimates for both sessions. As in the case of the representative agent, the estimates of the parameters that capture outcome-based distributional preferences, $$\hat{\alpha }$$ and $$\hat{\beta }$$, are generally much higher than the reciprocity parameters, $$\hat{\gamma }$$ and $$\hat{\delta }$$. For this reason, we characterize the different types according to their distributional preference parameters. The table shows the existence of (i) a moderately altruistic (MA) type, (ii) a strongly altruistic (SA) type, and (iii) a behindness averse (BA) type. A remarkable feature of all three types is that they value the payoff of others much more when they are ahead than when behind. For this reason, one may also speak of moderate (asymmetric) altruists and strong (asymmetric) altruists. Another remarkable feature of Table 2 is that a purely selfish type that puts zero value on others’ payoffs does not exist. All types display positive or negative valuations of others’ payoffs.21 Table 2. Finite mixture estimations (K = 3) in sessions 1 and 2. 
| | Strongly altruistic type | Moderately altruistic type | Behindness averse type |
|---|---|---|---|
| Session 1 | | | |
| π: Types’ shares in the population | 0.405*** (0.047) | 0.474*** (0.042) | 0.121*** (0.039) |
| α: Weight on other's payoff when behind | 0.159*** (0.036) | 0.065*** (0.013) | −0.437*** (0.130) |
| β: Weight on other's payoff when ahead | 0.463*** (0.028) | 0.130*** (0.017) | −0.147 (0.147) |
| γ: Measure of positive reciprocity | 0.151*** (0.026) | −0.001 (0.012) | 0.170 (0.119) |
| δ: Measure of negative reciprocity | −0.053** (0.025) | −0.027** (0.012) | −0.077 (0.162) |
| σ: Choice sensitivity | 0.018*** (0.001) | 0.032*** (0.002) | 0.008*** (0.002) |
| Session 2 | | | |
| π: Types’ shares in the population | 0.356*** (0.039) | 0.544*** (0.041) | 0.100*** (0.024) |
| α: Weight on other's payoff when behind | 0.193*** (0.019) | 0.061*** (0.009) | −0.328*** (0.073) |
| β: Weight on other's payoff when ahead | 0.494*** (0.020) | 0.095*** (0.012) | −0.048 (0.053) |
| γ: Effect of positive reciprocity | 0.099*** (0.024) | −0.005 (0.006) | −0.028 (0.030) |
| δ: Effect of negative reciprocity | −0.082*** (0.018) | −0.019*** (0.007) | −0.015 (0.035) |
| σ: Choice sensitivity | 0.019*** (0.001) | 0.049*** (0.004) | 0.015*** (0.002) |

Number of observations (both sessions): 18,720
Number of subjects (both sessions): 160
Log likelihood in session 1: −4202.17
Log likelihood in session 2: −3166.32

Notes: Individual cluster robust standard errors in parentheses. **Significant at 5%; ***significant at 1%.

The MA-type makes up roughly 50% of the population and puts positive but modest weight on the other player's payoff, regardless of whether they are ahead or behind. This type also displays basically no positive reciprocity but moderate levels of negative reciprocity. The distributional preferences of the MA-type (inferred from session 1) imply that members of this group are on average willing to spend 15 cents to increase the other player's payoff by $1 when ahead and 7 cents when behind. Thus, the MA-types are willing to behave altruistically when the cost is relatively low.
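The cents-per-dollar figures quoted for the altruistic types can be reproduced from the session 1 estimates in Table 2. The sketch below is illustrative only. It assumes a piecewise-linear utility in which a subject's own payoff carries weight 1 − w and the other player's payoff carries weight w, so that a subject with 0 ≤ w < 1 is indifferent when paying w / (1 − w) dollars per dollar gained by the other player. The function name `wtp_cents` is hypothetical, and the sketch deliberately covers only non-negative weights.

```python
def wtp_cents(weight: float) -> int:
    """Cents a subject would give up to raise the other's payoff by $1.

    Assumes utility (1 - weight) * own + weight * other, so indifference
    implies a maximum cost of weight / (1 - weight) dollars per dollar given.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("this sketch only covers weights in [0, 1)")
    return round(100 * weight / (1 - weight))


# Session 1 point estimates taken from Table 2.
ma = {"alpha": 0.065, "beta": 0.130}
sa = {"alpha": 0.159, "beta": 0.463, "gamma": 0.151}

# Moderately altruistic type: ahead (beta) and behind (alpha).
print("MA ahead/behind:", wtp_cents(ma["beta"]), wtp_cents(ma["alpha"]))

# Strongly altruistic type without reciprocity.
print("SA ahead/behind:", wtp_cents(sa["beta"]), wtp_cents(sa["alpha"]))

# Strongly altruistic type after kind treatment: the positive reciprocity
# parameter gamma is added to the distributional weight.
print(
    "SA ahead/behind, kind treatment:",
    wtp_cents(sa["beta"] + sa["gamma"]),
    wtp_cents(sa["alpha"] + sa["gamma"]),
)
```

Under these assumptions the printed values can be compared directly with the cents figures reported in the text for the MA- and SA-types.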
Note that the identification of the MA-type crucially relies on our experimental design's power to reliably discriminate between purely selfish preferences and moderately strong social preferences.

The SA-type comprises roughly 35% to 40% of the population. Subjects in this group display a valuation of the other player's payoff that is two to three times larger than that of the MA-type. The strong altruists also show relatively high levels of positive reciprocity and somewhat lower levels of negative reciprocity. Based on their distributional preferences (in session 1), the SA-type is willing to spend 86 cents to increase the other player's payoff by $1 when ahead and 19 cents when behind. Moreover, if a strong altruist has been treated kindly, such that the positive reciprocity parameter becomes relevant, the willingness to increase the other's payoff rises to 159 cents when ahead and 45 cents when behind.22 Thus, this group indeed displays rather strong social preferences.

Finally, the BA-type comprises roughly 10% of the population and weighs the other player's payoff negatively in situations of disadvantageous inequality. Interestingly, this type also tends to value others’ payoffs negatively when ahead, but the relevant preference parameter β is not significantly different from zero. This type also displays no significant preferences for positive or negative reciprocity. However, the behindness averse component of the BA-type is rather strong: they are on average willing to spend 78 cents to decrease the other player's payoff by $1 when behind.

How do our results relate to the existing literature? Fisman et al. (2007), for example, use step-shaped budget sets to identify the relative proportions of four different predefined types: they find a lexself type (∼49% of the subject pool) and a difference averse type (∼17%).
Their social welfare type (∼13%) is most similar to our strongly altruistic type, but they find a lower population proportion of this type than we do, and their selfish and competitive type (∼19%) somewhat resembles our behindness averse type. Kerschbamer (2015) relies on a piecewise linear model and a design that uses a geometric delineation of preferences in the context of a two-person dictator game. His model allows for a translation of his type classification into ours. In doing so, we find that out of the 92 subjects in his sample of Austrian students, roughly 30% are moderately altruistic, 32% are strongly altruistic, and 5% are behindness averse, indicating that the relative population proportions of these three types are remarkably similar to the proportions we found. Kerschbamer's classification is much more flexible than ours, but this comes at the cost of lower parsimony and a classification that is not unique in all cases, such that the reported population proportions can add up to more than 100%, depending on what specific classification is used.23

Another study in this vein is Iriberri and Rey-Biel (2011), who elicit distributional preferences with a series of modified three-option dictator games with and without role uncertainty and who also use a finite mixture method to assign subjects to four predefined types. Based on their design without role uncertainty, which is closer to ours, they find 25% of their subjects to be of the inequity averse type. Twenty-two percent are social welfare types (who resemble our SA-type) and 10% are called competitive (those subjects’ behavior corresponds to our BA-type). Forty-four percent of their subjects are assigned to the predefined selfish type, which, in terms of model parameter distance, is most similar to our moderately altruistic type. A similar picture emerges in Iriberri and Rey-Biel (2013), where a similar procedure is used.
In this paper, they compare behavior in situations with social information about others’ behavior to situations without social information. In the latter, which is most similar to our design, they find 15% inequity averse types, 14% social welfare types (strongly altruistic in our design), 17% competitive types (behindness averse in our design) as well as 54% selfish types. Finally, the study by Andreoni and Miller (2002) distinguishes between selfish, Leontief, and perfect substitutes preferences. They find 47% selfish types.

Overall, these comparisons illustrate that one important aspect of our endogenous classification procedure is the emergence of MA-types. Many previous studies relied on predefined types and applied experimental designs less focused on discriminating between purely selfish preferences and moderately strong social preferences. This may be the reason why they could not identify the low-cost altruism that characterizes our MA-types and labeled these subjects as purely selfish types instead. However, these studies also provided evidence for a substantial fraction of SA-types and a relatively low share of BA-types, because these types were contained in the set of predefined types and their identification does not rely on an experimental design with specifically high power to discriminate between purely selfish preferences and moderately strong social preferences.

4.3. Stability and Fit of the Preference Types

One desirable characteristic of a parsimonious distribution of types is that the preferences of the different types as well as their shares in the population remain stable over time. We can address this issue by comparing the relevant parameter estimates between sessions 1 and 2. Table 3 depicts the results of such comparisons by showing the p-values of various Wald tests for the finite mixture models with K = 2, K = 3, and K = 4 preference types.
The first six rows test parameter by parameter whether the corresponding estimates remain stable over time for all K types. The last two rows test jointly whether the corresponding set of parameter estimates is stable over time for all K types.

Table 3. Wald tests for the stability of parameter estimates over time, that is, over sessions 1 and 2.

| Null hypothesis H0 | Stable under H0 | p-value, K = 2 | p-value, K = 3 | p-value, K = 4 |
|---|---|---|---|---|
| Types’ shares are stable | πk | 0.067 | 0.498 | <0.001 |
| Weights on other's payoff when behind are stable | αk | <0.001 | 0.762 | <0.001 |
| Weights on other's payoff when ahead are stable | βk | <0.001 | 0.208 | <0.001 |
| Positive reciprocity parameters are stable | γk | <0.001 | 0.089 | <0.001 |
| Negative reciprocity parameters are stable | δk | 0.010 | 0.765 | 0.810 |
| Choice sensitivity parameters are stable | σk | <0.001 | 0.001 | <0.001 |
| All preference parameters and types’ shares are stable | πk, αk, βk, γk, δk | <0.001 | 0.009 | <0.001 |
| All pref. pars. and types’ rel. sizes excl. positive reciprocity are stable | πk, αk, βk, δk | <0.001 | 0.490 | <0.001 |

The preference estimates of the finite mixture model with K = 3 types are remarkably stable over time. The first five rows of Table 3 reveal that the differences between sessions 1 and 2 are statistically insignificant at the 5% level for the types’ relative shares and all preference parameters when tested individually. As in the aggregate model, the estimates for positive reciprocity are the least stable, since their difference across sessions exhibits a p-value of 8.9%. By comparison, the differences across sessions of the types’ relative shares and all other preference parameters exhibit p-values above 20%.
Furthermore, the results of the joint tests in the last two rows of Table 3 show that, once we exclude the estimates for positive reciprocity, the types’ relative shares and the other preference parameters remain jointly stable over time. In contrast, the parameter estimates of the models with K = 2 and K = 4 types vary significantly over time, both individually and jointly, indicating that these models are misspecified. In addition, the model with K = 2 types lacks the flexibility to capture the minority of BA-types (see also Table A.3 in the Appendix), whereas the model with K = 4 types overfits the data as it tries to isolate a second moderately altruistic type that is not stable over time (see also Table A.4 in the Appendix). Hence, the finite mixture model with K = 3 preference types not only represents the best compromise between flexibility and parsimony but also yields the most temporally stable characterization of social preferences.

Figure 4 illustrates the temporal stability of the type-specific preference parameters in the (α, β)-space. The MA-type's parameter estimates are represented by squares, the SA-type's by diamonds, and the BA-type's by triangles. Note that for all three types, even for the very precisely estimated MA- and SA-types, the 95% confidence intervals for the estimates of sessions 1 and 2 overlap, indicating preference stability at the type level.

Figure 4. Temporal stability of the type-specific parameter estimates of the finite mixture models with K = 3 preference types. The type-specific parameter estimates are stable over time as their 95% confidence intervals overlap between sessions 1 and 2.
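As a rough numerical companion to Figure 4, the sketch below builds normal-approximation 95% intervals (estimate ± 1.96 × standard error) from the Table 2 estimates and checks whether the session 1 and session 2 intervals overlap for each type's α and β. Treating each parameter separately and using a 1.96 multiplier are assumptions made here for illustration, and the intervals drawn in the figure may be constructed differently. The helper names `interval` and `intervals_overlap` are hypothetical.

```python
Z_95 = 1.96  # two-sided 95% normal quantile (assumed interval construction)

# (estimate, standard error) pairs from Table 2: (session 1, session 2).
table2 = {
    "MA": {
        "alpha": ((0.065, 0.013), (0.061, 0.009)),
        "beta": ((0.130, 0.017), (0.095, 0.012)),
    },
    "SA": {
        "alpha": ((0.159, 0.036), (0.193, 0.019)),
        "beta": ((0.463, 0.028), (0.494, 0.020)),
    },
    "BA": {
        "alpha": ((-0.437, 0.130), (-0.328, 0.073)),
        "beta": ((-0.147, 0.147), (-0.048, 0.053)),
    },
}


def interval(estimate, std_error, z=Z_95):
    """Return the (lower, upper) bounds of a symmetric normal interval."""
    return estimate - z * std_error, estimate + z * std_error


def intervals_overlap(first, second):
    """True if two closed intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Compare session 1 and session 2 intervals for every type and parameter.
overlap = {
    (type_name, param): intervals_overlap(interval(*s1), interval(*s2))
    for type_name, params in table2.items()
    for param, (s1, s2) in params.items()
}

for (type_name, param), result in overlap.items():
    print(f"{type_name} {param}: overlap = {result}")
```

Overlapping marginal intervals are only an informal indication of stability; the Wald tests in Table 3 remain the formal evidence.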
Another way to look at the temporal stability of preference types is to analyze how the individual classification of subjects into types evolves over time. The finite mixture models we estimate provide not only a type-specific characterization of preferences but also posterior probabilities of individual type-membership, $$\tau_{ik}$$, for each subject (see equation (7)). Based on these individual posterior probabilities of type-membership, we can classify each subject into the type she most likely belongs to. The transition matrix shown in Table 4 represents the resulting individual classification of subjects into types in sessions 1 and 2, respectively. It reveals that the three identified preference types are also fairly stable at the individual level: 121/160 = 76% of all subjects are located on the main diagonal and, thus, classified into the same preference type in both sessions. The assignment to the preference types is particularly stable for the MA- and the SA-types, where 84% and 74%, respectively, are assigned the same type in session 2 as in session 1.

Table 4. Individual type-membership in sessions 1 and 2.

| Session 1 (rows) / Session 2 (columns) | Moderately altruistic | Strongly altruistic | Behindness averse |
|---|---|---|---|
| Moderately altruistic (N = 76) | 64 (84%) | 7 (9%) | 5 (7%) |
| Strongly altruistic (N = 65) | 15 (23%) | 48 (74%) | 2 (3%) |
| Behindness averse (N = 19) | 8 (42%) | 2 (11%) | 9 (47%) |

Notes: The numbers in parentheses indicate the subjects’ transitions from session 1 to session 2 as a percentage of the original preference type in session 1. Subjects are classified into preference types according to the individual probabilities of type-membership (see Section 3.3) that are derived from the finite mixture estimations with K = 3 preference types.
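The summary statistics quoted for Table 4 follow directly from the transition counts. The short sketch below recomputes the share of subjects on the main diagonal and the row percentages. The variable names are chosen here for illustration and do not come from the paper.

```python
types = ["MA", "SA", "BA"]

# Rows: type in session 1; columns: type in session 2 (counts from Table 4).
transitions = [
    [64, 7, 5],   # moderately altruistic in session 1
    [15, 48, 2],  # strongly altruistic in session 1
    [8, 2, 9],    # behindness averse in session 1
]

n_subjects = sum(sum(row) for row in transitions)
n_same_type = sum(transitions[i][i] for i in range(len(types)))
share_same_type = n_same_type / n_subjects

# Row percentages: transitions as a share of the session 1 type.
row_pct = [[round(100 * count / sum(row)) for count in row] for row in transitions]

print(f"{n_same_type}/{n_subjects} = {share_same_type:.0%} keep their type")
for name, row in zip(types, row_pct):
    print(name, row)
```

The diagonal entries of `row_pct` are the type-specific retention rates, which makes it easy to see that the behindness averse classification is the least persistent of the three.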
To what extent do the preference parameters estimated in the K = 3 model fit the empirical behavior of each of the three types? Figure 5 provides the answer to this question. It displays the empirical and predicted t