Mortgage Pricing and Race: Evidence from the Northeast

Abstract

The putative existence of race-based discrimination in mortgage pricing is both a scholarly and societal concern. Efforts to assess discrimination empirically, however, are typically plagued by omitted variables, which leave any evidence of discrimination open to interpretation. We take a two-pronged approach to the problem. First, we analyze a dataset comprising discretionary mortgage fees collected by brokers working for a brokerage company. Mortgage brokers are intermediaries between lenders and borrowers; they neither approve loans nor share in the risk of default. Variables that measure risk should therefore have no effect on these discretionary fees, and indeed, we show that default risk as measured by credit scores has no effect on discretionary pricing. Second, we perform a formal sensitivity analysis that quantifies the impact of potentially omitted variables. Our results suggest that minority borrowers pay more on average for mortgages than non-minorities, and that this effect persists even in the presence of unmeasured confounders.

1. Introduction

In July of 2012, Wells Fargo Bank agreed to a $175 million fine on the grounds that it had discriminated against African-American and Hispanic borrowers between 2004 and 2009 in violation of fair lending laws. The bank, while acquiescing to the penalty, nonetheless denied the government’s charges, claiming that it settled due to the prohibitive litigation costs involved. The bank later asserted that a corporate decision to cease providing mortgages through independent brokers was a completely independent choice. The Wells Fargo payment closely followed a $335 million fine agreed to by Bank of America in late 2011.
The Department of Justice complaint in this case alleged that more than 200,000 black and Hispanic borrowers paid more for loans than white borrowers because of race, not borrower risk.¹ Despite these troubling charges, statistical evidence for racial discrimination in the mortgage industry is both scarce and of questionable quality.²

Determining the existence of discrimination in mortgage lending would be straightforward if the data comprised identically credit-worthy and sophisticated individuals applying for comparable loans to the same, or randomly assigned, brokers. Instead, scholars must rely on observational data and statistical controls, and the threat from omitted variable bias is ever present. Banks rarely make sensitive borrower information, such as credit scores, available to researchers, and scholars must make do with data that are far from ideal.³ Kau et al. (2012), for example, rely on neighborhood racial and ethnic compositions because they lack actual data on the borrowers themselves. Neither of the existing studies of overages, Black et al. (2003) and Courchane and Nickerson (1997), includes credit scores as a control. Even the well-known dataset collected by the Federal Reserve of Boston lacks clean measures of applicant creditworthiness (for the best-known analysis of these data, see Munnell et al. (1996); for recent reanalyses, see Goenner (2010) and Han (2011)).

We attack the omitted confounders problem by combining a new dataset with a formal sensitivity analysis. The dataset is unusual in two ways. First, the dataset concerns discretionary mortgage pricing by brokers who are intermediaries between lenders and borrowers. The brokers do not approve loans, a decision that is made by the lender, nor do they share in the risk, which is shouldered by the lender. Mortgage brokers are not compensated based on loan performance (whether a borrower repays a loan or defaults), and the fees they charge do not adjust for risk.
Theoretically, we do not need to control for variables that are associated with default risk and routinely unobserved by researchers, such as the debt-to-loan ratio. That being said, risk adjustment remains a compelling alternative explanation whenever differential loan pricing is observed. We address the issue using the second unusual feature of the dataset: three credit scores for each borrower, one from each of the three major credit bureaus. As noted above, these data are quite rare.

Despite the dataset’s unique nature, there remain variables that we would like to have, but do not have. Prominent among these are loan size, the financial sophistication of the borrower, and the borrower’s ability to negotiate. Our response is two-fold. First, we present evidence that borrowers rarely shop around for mortgages, a fact which severely limits the ability of a borrower to negotiate with her broker. Second, we employ a sophisticated sensitivity analysis that allows us to assess whether any additional unobserved variables exist that could change our results significantly. The results of the sensitivity analysis suggest that our findings are not sensitive to omitted confounders.

Our empirical strategy unfolds in three stages. First, we use regression (overage is a continuous variable) to show that minority status is associated on average with larger overages, while credit scores (a measure of default risk) are unrelated to overages. Second, we report a number of robustness checks and demonstrate that our findings are consistent across many different specifications. These alternative specifications include quantile regression to address outliers, generalized additive models to check for nonlinearities, and matching estimators to assess model dependence. Third, we introduce and describe the results from the sensitivity analysis. Our conclusion is that a nontrivial amount of race-based discrimination exists in these data.

The article proceeds as follows.
In Section 2, we describe the problem of unobserved heterogeneity in discrimination research and discuss some recent attempts at solutions. Section 3 defines the roles of mortgage brokers, introduces the concept of an overage, and reviews past research. Section 4 describes our dataset. The empirical analysis is discussed in Section 5, and the sensitivity analysis in Section 6. Section 7 concludes with a brief discussion of discrimination law, and how the law relates to our results.

2. Discrimination and Unobserved Heterogeneity

Omitted variables bedevil the study of discrimination in mortgage lending. As Ross and Yinger (2002, p. 108) note, the Boston Federal Reserve Study has more control variables than any previous study (an additional thirty-eight), but was still widely attacked for leaving out important explanatory variables. Researchers studying discrimination in other areas experience similar problems and have developed two ways of dealing with missing variables. The first is known as audit or matched pair testing where two volunteers (or confederates) of different races attempt to get help at a retail outlet, obtain a loan, or be hired for a job. The key to this design is that the two volunteers must be alike in every way except race (or gender) to the store clerk, mortgage broker, or employer. Any difference in service or outcome may then be attributed to discrimination.

Heckman (1998) criticizes audit tests for fragility in the face of assumptions regarding unobservable variables. No two people are alike in every discernible way except for race (or gender). Researchers have attempted to overcome this flaw by using correspondence tests, where the employer, for example, does not observe the matched applicants (Neumark, 2012). Heckman and Siegelman (1993), however, argue that discrimination is unidentified even in correspondence tests when the variance of unobserved productivity differs across groups.
The experimental approach is necessarily limited in scope even if this problem could be addressed (Neumark (2012) proposes a test).

A second method widely used by researchers is the outcome test first suggested by Becker (1993). Outcome tests consider the difference in average outcomes between two groups that have had a common experience (Glaser, 2014). If the difference is significant, we can infer that the two groups have been treated unequally. In Becker’s original formulation, if minorities default on loans at lower rates than non-minorities, there is evidence that minorities faced more stringent requirements than non-minorities to secure a loan (Glaser, 2014). Outcome tests are not susceptible to omitted variables bias because the variables that are unobserved by the researcher are observed by the loan officer or police officer, and under the hypothesis of no discrimination, are taken into account.

Ayres (2001) criticizes outcome tests for what he terms the inframarginality problem. Discrimination is most likely to occur in marginal cases; well-qualified borrowers receive loans, and nonqualified borrowers do not (Glaser, 2014). In the mortgage context, marginal default rates are unobservable, and comparing averages confuses these marginal cases with the nonmarginal (inframarginal) cases (Ayres, 2001, p. 408). An exception exists when we expect an equilibrium to hold under the null hypothesis of no discrimination. In that case, no difference exists between marginal and nonmarginal cases.⁴ An outcome test applied to mortgage decisions would suffer from the inframarginality problem as there is no reason to assume equilibrium behavior. We do not, for example, expect to observe mortgage lenders targeting minorities to the exclusion of nonminorities with the expectation of higher profits.

Neither approach discussed above has been widely used to study mortgage lending discrimination.
Correspondence studies require online transactions, and nearly 50% of first-time homebuyers in 2014 met with a loan representative in person (Garrison, 2014).⁵ To our knowledge, the only correspondence study relating to mortgages is Hanson et al. (2016), who look at mortgage loan originators (MLOs).⁶ They find an effect of being African-American on MLO response roughly equivalent to a credit score that is 50 points lower. MLOs also offer more details to whites and use friendlier language in email correspondence. These results are roughly consistent with our findings.

Our data dictate our approach, which concerns fees charged by mortgage brokers who are not involved in the decision to approve or deny loans. In addition, they do not share in the risk involved in making loans. We therefore do not need to control for many of the variables associated with loan approval and default risk. On the other hand, we do need to control for variables related to loan size and bargaining. As we argue in the following sections, we have proxies for some of these variables, and we employ a sensitivity analysis to address the others.

3. The Nature of Overages

Mortgage brokers are intermediaries between lenders, usually banks, and borrowers. They neither underwrite loans nor approve loans nor share in any risk associated with the loans they originate. Brokers act as liaisons between borrowers and lenders. Brokers compile documentation (employment, income, assets, credit reports), help the borrower to settle on a loan amount and loan type, and then submit the loan for approval to a lender with whom the broker works. The broker knows which loans to apply for because he or she has the lender’s, or lenders’, rate sheet for the day that lists the various loan programs offered by the lender, the interest rates for those programs, the corresponding rebates (a.k.a. yield spread premium; see below), as well as price adjustments for loan amount, loan-to-value percentage tiers, credit score, geographic location, and lock extensions (Day 7, 10, 15, etc.). Mortgage brokers in 2000 had no fiduciary responsibility to act in the best interest of their clients.⁷

Compensation for brokers originating loans generally comes in two ways. First, most brokers charge an origination fee, which is generally paid by the borrower and usually amounts to 1–2% of the loan amount.⁸ The second major source of compensation for a broker is the overage. An overage is the extra amount that a borrower may pay beyond the posted price of a mortgage (the combination of interest rate and points). Thus, we can think of an overage as the total loan points minus any fees (such as origination fees) minus the posted rate. Mortgage brokers generally share in the monies from overages, suggesting that in an efficient market we would see no discrimination based on race as brokers have an incentive to extract the maximum from all concerned. However, brokers typically have a great deal of flexibility with respect to setting overages. Furthermore, most borrowers are unaware of the existence of overages (Black et al., 2003, p. 1141).⁹ Discretionary mortgage pricing is therefore ripe for potential discrimination.

To understand how overages work, consider a situation where the posted price (the price on the rate sheet given by the lender to the broker) on a particular loan is 6% and 0 points. Now imagine that the broker induces the borrower (who is not shown the rate sheet provided by the lender) to pay 6% and 1 point. That point is an overage. On a loan of $300,000, a 1 point overage is worth $3,000. Typically, brokers pocketed half of the money generated by the overage.¹⁰ Thus, the broker could increase his or her commission by $1,500.

Brokers can also collect overages through the use of rebates, which are also known as the yield spread premium.
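The point arithmetic in the direct-overage example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 50% broker share is the typical split mentioned in the text, not a fixed rule.

```python
def overage_dollars(loan_amount, overage_points):
    """Dollar value of an overage; one point equals 1% of the loan amount."""
    return loan_amount * overage_points / 100.0


def broker_share(loan_amount, overage_points, share=0.5):
    """Broker's cut of the overage (share=0.5 is the typical split, an assumption)."""
    return overage_dollars(loan_amount, overage_points) * share


# Example from the text: posted price of 6% and 0 points, borrower pays 6% and 1 point.
loan = 300_000
posted_points = 0.0
charged_points = 1.0
overage = charged_points - posted_points  # 1 point of overage

print(overage_dollars(loan, overage))  # 3000.0
print(broker_share(loan, overage))     # 1500.0
```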
The rate sheet mentioned in the last paragraph might have on it the 6% loan with 0 points and a 6.75% loan with −1.75 points. That means that the lender will pay 1.75 points on a 6.75% loan. Now imagine that the broker gets the borrower to take the 6.75% loan with −0.75 points; the remaining point becomes the overage. Brokers like this approach because borrowers do not have to pay the fee up front, but they pay for it in higher interest rates.¹¹

Not all borrowers pay positive overages. Some pay no overage, and others pay negative overages (called underages). Underages are usually the result of competition between lending institutions over a particular type of loan or a particularly desirable customer. Financially sophisticated borrowers can rate shop among competing lenders and drive down the price of their mortgage (Harney, 1993). Underages are rare because shopping and negotiating for mortgages are rare (we present evidence in Section 5), and our dependent variable is correspondingly skewed to the right.

Only two major studies of overages exist.¹² Courchane and Nickerson (1997) analyze data from three banks following an investigation by the Office of the Comptroller of the Currency. The loans were made between 1992 and 1994. They provide regression results for two of the banks, Bank A and Bank B, and although small differences are found, they conclude that the banks were profit-maximizing, but not discriminatory.¹³ Black et al. (2003) study 1996 data from a major bank. They find no evidence of discrimination once they control for other differences in the borrower pool, such as bargaining ability.¹⁴

Our data and analysis differ significantly from these previous studies. Courchane and Nickerson (1997) and Black et al. (2003) consider data obtained from banks in the early to mid-1990s, whereas our data come from a mortgage brokerage company in 2000. The difference is meaningful in two ways.
First, loan officers, as opposed to mortgage brokers, receive a set salary, while mortgage brokers earn most of their compensation through commissions. Loan officers and brokers therefore have different incentives. Second, banks making mortgage loans in the mid-1990s were under heightened scrutiny following the 1992 Federal Reserve mortgage lending study. Mortgage brokerages in 2000 were under no such scrutiny, as the subprime mortgage crisis would soon attest. For instance, the bank studied by Black et al. (2003) limited overages to 2%. This restriction explains why, in a sample of 2,002, only 17.9% have positive overages. In our data (introduced in the next section), 54.5% of the borrowers in our sample have positive overages, and nearly 8% of the overages in the sample are larger than 2 points (the largest overage is 6; see Table 1).

Table 1. Variable Names, Definitions, and Descriptive Statistics

Variable name  Definition                       Min.    Mean    Max.      SD
BRANCH         Branch that originated the loan  NA      NA      NA        NA
OFFICER        Officer that sold the loan       NA      NA      NA        NA
DATE           Date of the transaction          8/1/00  NA      10/31/00  NA
GOVERNMENT     1=FHA loans                      0.000   0.1327  1.000     0.339
OVERAGE        Overage in points                −5.000  0.4776  6.000     0.856
AGE            Borrower age, in years           19      39.81   80        11.03
BLACK          1=BLACK; 0=otherwise             0.000   0.050   1.000     0.218
HISPANIC       1=HISPANIC; 0=otherwise          0.000   0.058   1.000     0.234
WHITE          1=WHITE; 0=otherwise             0.000   0.850   1.000     0.357
EQUIFAX        Equifax credit score             495     706.9   829       60.30
TRANSUNION     TransUnion credit score          492     710.8   829       60.77
EXPERIAN       Experian credit score            476     708.9   831       60.12

Note: This table lists the major variables used in the analysis, the definitions of those variables, and descriptive statistics where appropriate. NA stands for not applicable.

Courchane and Nickerson’s (1997) sample from Bank A contains over 33,000 loans with 38.7% having positive overages. In their regression, however, the authors appear to have selected on the dependent variable and included only those borrowers with positive overages. The sample from Bank B has a larger percentage of borrowers with overages, but the regression results are questionable as the authors include interest rate as a control variable. The resulting endogeneity bias likely explains their strikingly counterintuitive finding that “minorities are likely to be charged smaller amounts of overage points” (Courchane and Nickerson, 1997, p. 147).

Finally, Black et al. (2003) persuasively argue that credit risk should not explain much variation in mortgage pricing because risk is included in the base interest rate of the loan. Neither of the two previous studies, however, has direct access to credit scores with which to assess the claim. We can show that overage size is orthogonal to credit score.

4. Data and Descriptive Analysis

The data comprise 2,129 observations on mortgage overages (measured in points) assessed by brokers for a large national mortgage company during the second quarter of 2000 (the loans were made between August 1, 2000 and October 31, 2000). The overages in the dataset are from loans actually taken, not just negotiated, and the observations comprise the universe of cases for the quarter. Although the firm had branch offices and brokered loans in all quadrants of the country, the majority of loans in the data were made in and around the Northeast.

Providing adequate information to assess the representativeness of this mortgage brokerage is difficult as the data were provided to the authors by a principal of the firm on the condition of anonymity for the firm and the brokers. The fact that our data are from a single firm (as are the data in Black et al. (2003)) is concomitant with their uniqueness. In addition, the firm, which no longer exists as an independent entity, bore an instantly recognizable household name. The differences between our data and the data from Black et al. (2003) and Courchane and Nickerson (1997) are due to institution type (banks vs. mortgage broker) and temporal proximity to federal scrutiny (1992).¹⁵

Any mortgage data from the 2000s raise questions concerning subprime lending. The major increases in subprime mortgages, however, occurred between 2003 and 2006 (JCHS, 2008, p. 4). Furthermore, through 2001, the ratio of home prices to income had remained steady between 2.9 and 3.1 for two decades. The ratio rose to 4.0 only in 2004, and 4.6 in 2006 (Steverman and Bogoslaw, 2008). There is little reason to assume that our data are either wildly unrepresentative or the product of the subprime crisis.

We list the major variables used in our analysis, their definitions, and descriptive statistics in Table 1.
The Home Mortgage Disclosure Act (HMDA) does not require mortgage companies to report credit scores for individual borrowers, the branch of the company that sold the loan, or the name of the lending officer. Researchers analyzing lending discrimination commonly make do without such measures. Our dataset, on the other hand, not only includes three credit scores for each borrower, but also the branch office that originated the loan and the name of the broker. These variables allow us to use fixed effects and clustering in the following analyses.

Additionally, we have a suite of variables that mortgage companies must report to the federal government under the HMDA. These variables include the date of application, the race of the borrower (Hispanic, black, and white), and whether the loan is part of a program administered by the Federal Housing Administration. Data on other ethnicities and product types are available, but as we describe below, they collectively comprise an insignificant portion of the observations. Other demographic variables such as age and sex are available, but are not used because of missing data concerns. Age is missing for over 10% of the observations, and sex is missing for nearly 30%. Saturated models run using these variables do not change our conclusions (see the Appendix for regression results from an example regression including age and interactions of age and credit score).

We describe the data in greater detail below to accomplish three goals. The first is to justify the use of the covariates for which we control. The second is to demonstrate that the positive relationship between overages and minority status is not the result of boosting an effect through selective covariate choice. The third is to give the reader a better feel for the data and the relationships that we highlight.
In our sample, 54.2% of the borrowers have positive overages, which means that over half of the sample paid more for their loan than the price quoted to the broker by the bank.¹⁶ Of those with positive overages, the mean overage is 0.962 with an SD of 0.85. White borrowers comprise 85% of the sample, while black and Hispanic borrowers collectively comprise 10.8% of the sample (each about 5%). Asian and South Asian Indians make up less than 4% of the sample.

A cross-tab of overages by race (trichotomized into “under,” “none,” and “over”) is in Table 2. Slightly over half of white borrowers have positive overages (53%) whereas over three-quarters of black (79%) and Hispanic (78%) borrowers have positive overages. A $$\chi^{2}$$-test returns a $$P$$-value of essentially 0, indicating that overage and race may not be independent of one another.

Table 2. Overages by Race

Overage  White  Black  Hispanic  Asian
Under    0.135  0.100  0.000     0.075
None     0.334  0.114  0.225     0.453
Over     0.531  0.786  0.775     0.472

Note: This table comprises a cross-tab of overages (trichotomized into “under,” “none,” and “over”) by race. A $$\chi^{2}$$-test of independence returns a $$P$$-value of 0.

Loan type is related to overages.
The relationship, however, is driven completely by one particular kind of mortgage: those administered under various programs run by the Federal Housing Administration (FHA), which we label as government-type loans. Such loans are easier to qualify for, have lower down payment requirements, and comprise 13.3% of the sample. Agency or conformable loans meet guidelines set forth by Fannie Mae and Freddie Mac and comprise 65% of the sample. (The dataset includes nine categories of loans such as adjustable rate mortgages, government mortgages, second mortgages, construction mortgages, etc. The categories other than agency and government contain fewer than 100 observations each.) The government category is of interest because 85% of these loans carry positive overages, as opposed to 53% of agency loans.¹⁷

Moreover, a strong relationship exists between government-type loans and race. Overall, 47% of blacks and 50% of Hispanics have government-type loans while only 16% of whites do. We can see these patterns clearly in a three-dimensional table of overage by race by loan type (see Table 3). Overall, 83% of whites with government loans have positive overages compared with 91% of blacks and 100% of Hispanics ($$n=40$$). Among agency loans, 51% of whites have positive overages compared with 79% of blacks and 75% of Hispanics.¹⁸

Table 3. Overages by Race and Loan Type

Race      Overage  Agency  Government
White     Under    0.160   0.095
White     None     0.333   0.079
White     Over     0.507   0.826
Black     Under    0.083   0.000
Black     None     0.125   0.091
Black     Over     0.792   0.909
Hispanic  Under    0.000   0.000
Hispanic  None     0.250   0.000
Hispanic  Over     0.750   1.000

Note: This table comprises a three-way cross-tab of overages by loan type (agency or government) by race. Minorities, on average, pay higher overages than whites for both agency and government-type loans.

The relationship between government-type mortgages and overages and race is not the result of the differences between mortgage types but, rather, of the characteristics of borrowers who take out such mortgages. FHA mortgages, for example, are insured by the FHA, generally require lower down payments (which can also take the form of gifts), and a portion of the closing costs can be included in the loan amount.
Furthermore, borrowers pay an insurance premium, as conventional loan borrowers do, but this premium is not canceled when equity grows.19 In the analyses to follow, we condition on government-type loans but, in reality, we use this binary variable as a crude proxy for loan size, a variable to which we do not have access. It is unclear which way loan size cuts: borrowers with more wealth, who take out larger loans, are less likely to use brokers, but are more attractive customers and have more money that can be extracted by brokers in the form of overages. The strong correlation between government-type mortgages and overages, however, suggests that brokers extract greater overages from borrowers seeking smaller loans. 5. Empirical Analysis Here, we report results from our empirical analysis. We show estimated regression coefficients, evidence of robustness across a variety of specifications, and a discussion of inferential threats including omitted variables and selection issues. Two major findings emerge. The first is that credit scores are unrelated to overage size. This finding is less than surprising as we do not expect risk of default to be related to overages. We expand on this point in the discussion. The second is that race is related to overage size across all specifications. The results from five different multiple regressions are in Table 4.20 The consistency of the effects across the regressions is remarkable. In all five regressions, the effects of being black, being Hispanic, or having a government loan are positive, substantive, and highly statistically significant. The effects of the three credit scores, on the other hand, are essentially zero and do not even approach statistical significance.21 Table 4. 
Regression Results

| | (1) OLS, robust SEs | (2) OLS, robust SEs | (3) OLS, SEs clustered by branch | (4) Fixed by branch | (5) Quantile |
|---|---|---|---|---|---|
| Intercept | 0.4538 | 0.4546 | 0.4546 | | 1.6430 |
| | (0.050) | (0.050) | (0.097) | | (0.338) |
| Black | 0.4337 | 0.4285 | 0.4285 | 0.4065 | 0.3512 |
| | (0.143) | (0.143) | (0.116) | (0.109) | (0.148) |
| Hispanic | 0.5143 | 0.5110 | 0.5110 | 0.5085 | 0.3485 |
| | (0.162) | (0.161) | (0.258) | (0.107) | (0.224) |
| Government | 0.6425 | 0.6392 | 0.6392 | 0.6635 | 1.0542 |
| | (0.070) | (0.071) | (0.162) | (0.653) | (0.108) |
| Equifax | −0.0142 | −0.0145 | −0.0142 | −0.0110 | 0.0099 |
| | (0.018) | (0.022) | (0.021) | (0.018) | (0.075) |
| TRW | | −0.0093 | −0.0083 | −0.0079 | −0.0923 |
| | | (0.024) | (0.013) | (0.017) | (0.063) |
| Experian | | 0.0103 | 0.0086 | 0.0154 | 0.1291 |
| | | (0.018) | (0.019) | (0.017) | (0.073) |
| $$N$$ | 1,905 | 1,905 | 1,905 | 1,905 | 1,905 |
| Missing | 235 | 235 | 235 | 235 | 235 |
| SEE | 0.8268 | 0.8271 | 0.8271 | 0.7985 | $$\tau=0.6$$ |
| $$R^{2}$$ | 0.095 | 0.096 | 0.096 | 0.368 | |

Note: Overage is the dependent variable in each case. The specifications vary by controls and standard errors (SEs) (robust vs. clustered by branch office). The fourth regression includes fixed effects (by branch), and the fifth is quantile regression. Credit scores are divided by 100. SEE is the standard error of the estimate.

Table 5. Matching Estimates

| Method | Treatment | Estimate | AI SE | # Treated Obs | Matched # Obs |
|---|---|---|---|---|---|
| One-to-one | Minority | 0.3463 | 0.1273 | 114 | 1,655 |
| One-to-one | Hispanic | 0.5596 | 0.2017 | 54 | 1,655 |
| One-to-one | Black | 0.3172 | 0.1695 | 60 | 1,655 |
| Genetic | Minority | 0.3388 | 0.1304 | 114 | 1,655 |
| Genetic | Hispanic | 0.5444 | 0.1943 | 54 | 1,655 |
| Genetic | Black | 0.3349 | 0.1815 | 60 | 1,655 |
| Propensity score | Minority | 0.3443 | 0.1313 | 114 | 1,655 |
| Propensity score | Hispanic | 0.4267 | 0.2076 | 54 | 1,655 |
| Propensity score | Black | 0.2648 | 0.1846 | 60 | 1,655 |

Note: This table comprises treatment effect estimates from matching done three ways (one-to-one, genetic, propensity score) with Abadie–Imbens SEs. The variable minority is black plus Hispanic.

The regressions in Table 4 are all versions of the following basic specification, which we use throughout the article:

\begin{eqnarray*} \mbox{Overage}_{i}&=&\beta_{0}+\beta_{1}\mbox{Black}_{i}+\beta_{2}\mbox{Hispanic}_{i}+\beta_{3}\mbox{Government}_{i}\\ && +\beta_{4}\mbox{Equifax}_{i}+\beta_{5}\mbox{TRW}_{i}+\beta_{6}\mbox{Experian}_{i}+\epsilon_{i} \end{eqnarray*}

Columns 1 and 2 of Table 4 contain estimates from ordinary least squares (OLS) linear regressions with heteroskedasticity-consistent standard errors (SEs).
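The estimator described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data are synthetic, the "true" coefficients are hypothetical round numbers, and the sandwich variance is the basic HC0 form (the article does not say which heteroskedasticity-consistent variant was used). The helper names `inverse`, `matmul`, and `ols_hc0` are introduced here for illustration and are not the authors' software.

```python
import math
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols_hc0(X, y):
    """Return (coefficients, HC0 robust SEs) for the linear model y = X b + e."""
    Xt = transpose(X)
    bread = inverse(matmul(Xt, X))                       # (X'X)^{-1}
    Xty = [sum(x * yi for x, yi in zip(col, y)) for col in Xt]
    beta = [sum(b * v for b, v in zip(row, Xty)) for row in bread]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    k = len(beta)
    # "Meat" of the sandwich: sum over i of e_i^2 * x_i x_i'
    meat = [[sum(e * e * row[i] * row[j] for row, e in zip(X, resid))
             for j in range(k)] for i in range(k)]
    cov = matmul(matmul(bread, meat), bread)
    ses = [math.sqrt(max(cov[i][i], 0.0)) for i in range(k)]
    return beta, ses

# Synthetic data; column order: intercept, Black, Hispanic, Government, Equifax/100.
TRUE = [0.45, 0.43, 0.51, 0.64, 0.0]   # hypothetical values, not the article's estimates
rng = random.Random(0)
X, y = [], []
for i in range(400):
    row = [1.0,
           1.0 if i % 10 == 0 else 0.0,
           1.0 if i % 10 == 1 else 0.0,
           1.0 if i % 4 == 0 else 0.0,
           rng.uniform(5.0, 8.0)]
    X.append(row)
    y.append(sum(b * x for b, x in zip(TRUE, row)) + rng.gauss(0.0, 0.8))

beta, ses = ols_hc0(X, y)
for name, b, s in zip(["Intercept", "Black", "Hispanic", "Government", "Equifax"], beta, ses):
    print(f"{name:<11}{b:8.4f}  ({s:.3f})")
```

Clustering the SEs by branch, as in column 3, would replace the observation-level sum in `meat` with a sum over within-branch score totals; that extension is not shown here.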
The regression in column 1 contains only a single measure of credit ($$\beta_{5}=\beta_{6}=0$$), while column 2 contains three such measures to demonstrate that no multicollinearity problem exists among the three measures.22 Column 3 contains regression estimates with the SEs clustered by branch. The loans in the sample originated from forty-six different branches, with some producing almost 200 loans during the third quarter (Branch 230: 196) and other branches producing far fewer loans. Many of the observations are therefore not independent, and clustered SEs are appropriate.23 While some SEs have increased in size, no estimate has changed in significance. Similarly, we might imagine that some branches, by virtue of being in different areas, might routinely assess overages differently. Column 4 contains the results of a regression with branch fixed effects. The results are substantively indistinguishable from the previous results.

Across the first four regressions, black borrowers pay on average 40% of a point more than white borrowers. Hispanics pay 50% of a point more, and FHA borrowers pay 60% of a point more.24 In terms of the example given in Section 3, blacks would pay on average $\$$1,200 more, Hispanics $\$$1,500 more, and FHA borrowers $\$$1,800 more. Black borrowers with FHA loans could well have an overage that is a full point larger than that of white borrowers with conventional loans.

Contrast these effects with those of the various credit scores. A 100-point increase in an Equifax credit score is associated with a decrease in overage size of 1–2% of a point. In terms of our example, a borrower with a credit score of 700 would pay on average $\$$30–60 less than a borrower with a credit score of 600. The estimated coefficient on Experian credit scores is positive but, as noted earlier, the SEs are relatively large, indicating that these effects are just bouncing around zero. The risk of default is unrelated to overage size.
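The dollar figures above follow from the convention that one point equals 1% of the loan amount. The short sketch below reproduces the conversion. It assumes a loan of $\$$300,000, a value inferred here from the quoted dollar amounts (40% of a point corresponding to $\$$1,200) rather than taken from Section 3; the function name `overage_in_dollars` is introduced only for this illustration.

```python
# Assumption: the loan amount is inferred from the dollar figures in the text
# (0.4 of a point <-> $1,200); it is not quoted from Section 3.
LOAN_AMOUNT = 300_000

def overage_in_dollars(points, loan_amount=LOAN_AMOUNT):
    """Convert an overage in points to dollars (1 point = 1% of the loan)."""
    return points / 100 * loan_amount

# Rounded effects discussed in the text, in points.
effects = {"Black": 0.4, "Hispanic": 0.5, "FHA": 0.6}
for group, points in effects.items():
    print(f"{group}: about ${overage_in_dollars(points):,.0f} more")

# A 100-point higher credit score: 0.01 to 0.02 of a point less.
low, high = overage_in_dollars(0.01), overage_in_dollars(0.02)
print(f"Credit score effect: about ${low:,.0f} to ${high:,.0f} less")
```

Set side by side, the race and loan-type effects are one to two orders of magnitude larger than the credit score effect, which is the contrast the paragraph draws.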
Regression diagnostics, such as Cook’s D and residual analysis, indicate that influential observations and nonlinearity may pose a threat to our estimates and inferences. We therefore turn to robust estimators and nonlinear estimators. Column 5 of Table 4 contains results from a quantile regression (Koenker, 2005). We ran nine separate quantile regressions with quantiles set from 0.1 to 0.9. Setting the quantile below 0.6 resulted in hundreds of nonpositive fits; setting the quantile at 0.9 reproduced our results nearly identically. We report results with the quantile set at 0.6, which produces only 1 nonpositive fit and is close to the median. The results vary in minor ways from the other regressions, but no estimate changes direction or significance. The estimated coefficients on being Hispanic and being black shrink, and the estimated coefficient on government increases.

We might imagine that credit scores are nonlinearly related to overages. Figure 1 contains the results of a generalized additive model run on the same set of covariates as the previous regressions.25 The three lower plots are graphs of the effects of the various credit scores. None of them appears to be related substantively to overages. In each case, there is very little movement as credit scores increase (note the small range of the $$y$$-axis), and even where the confidence bands are tightest, zero is included. On the other hand, the confidence intervals for black, Hispanic, and government, which can only take the values 0 and 1, do not include zero (the three upper plots). The effect of being black appears to be somewhat smaller than what we saw in the linear regressions, and this result is echoed in one of the matching analyses to follow. The effects of being Hispanic and having a government loan remain unchanged.26

Figure 1. Generalized Additive Model Results. Note: The dependent variable is overages. The effects of the six right-hand side variables are displayed in graphical form with confidence bands (2 SEs) and rug plots. Note that the variables in the first row can only take the values 0 and 1.

As a final robustness check, we assess the effect of being a minority in a less model-dependent manner by performing a matching analysis using three different methods: one-to-one, genetic (Sekhon, 2011), and propensity score (Rosenbaum and Rubin, 1983b).27 We also use three different treatments: being Hispanic, being black, and being a minority (the sum of Hispanic and black).28 For each treatment and method, we match on age and the three credit scores.29 The results from all three estimators are quite similar to each other and remarkably similar to the previous analyses. The effects of being a minority and being Hispanic are both of the expected size and significance. The effect of being black just misses significance at conventional levels.30

5.1. Potentially Omitted Variables

We demonstrated above that credit scores are unrelated to overages. We also demonstrated that being black, being Hispanic, or having a government-backed loan does seem to be related positively to overages. We still need to address two serious threats to our inferences: potentially omitted variables and selection bias. We would like to have a measure of financial sophistication, as savvy consumers can reduce their chances of paying an overage. Some of the effect of financial sophistication is no doubt captured by credit score.
Financially unsophisticated consumers are unlikely to have sound credit, given that credit scores reflect not only the availability of credit, but also the judicious use of past credit.31 That being said, another aspect of financial sophistication is the willingness and ability to shop around and negotiate. Our results are consistent with a world where whites are more likely to shop around and negotiate than minorities. It appears, however, that few borrowers do either. Information on the Internet has exploded regarding how to navigate the mortgage market in the wake of the subprime crisis. A search of “how to get a mortgage loan” brings up over 10 million hits. A scan of the top results demonstrates that every site includes advice regarding visiting multiple lenders. This information, however, has failed to change the behavior of consumers. Results from the 2014 National Survey of Mortgage Borrowers from the Consumer Financial Protection Bureau indicate that nearly 50% of consumers in 2014 who took out a home purchase mortgage considered only one lender or broker (Consumer Financial Protection Bureau, 2015). Nearly 90% of consumers considered only two lenders or brokers. Furthermore, 77% of consumers applied to only one lender or broker (nearly 100% applied only to two). Perhaps more importantly for our purposes, there are no significant differences between non-Hispanic whites and Hispanics and non-whites when it comes to the number of lenders considered or applied to. In fact, a slightly higher percentage of Hispanics and non-whites applied to two lenders than non-Hispanic whites (National Mortgage Database, 2017). The reasons for this lack of interest in shopping around are not that surprising. Borrowers tend to believe that mortgage lenders, unlike, say, car dealers, treat them well (and these results are from after the mortgage crisis). In a separate survey, Fannie Mae asked borrowers to select reasons for only getting one quote. 
The two main reasons were satisfaction with the first quote and comfort with the lender (Prevost, 2015).32 It strikes us as extremely unlikely that consumers, white or not, behaved differently in 2000 before the mortgage crisis. Nor is there reason to suspect that a gap in mortgage shopping between whites and minorities existed in 2000 that closed by 2013.

Another inferential threat is that selection bias may mask the effect of credit scores, which would in turn diminish the effects of race. Prospective borrowers with relatively low credit scores are likely to have higher interest rates on average. An overage assessed atop an already high interest rate may make the loan unaffordable. If such borrowers select out of the sample (i.e., decide against taking the loan), we may underestimate the effect of credit scores on overages. To put it another way, the distribution of overages for relatively low-credit borrowers would be truncated at the point where the loans become prohibitively expensive. Fortunately, such selection bias is unlikely to be affecting our estimates. Overages are pure profit for the mortgage broker: it is not in his or her interest to deter potential borrowers with restrictive pricing. The broker should maximize the overage without driving the customer away. After all, it is the mortgage broker’s job to try to match each customer with a lender if at all possible. In fact, evidence exists that brokers push lenders to make riskier loans and that brokers send lenders mortgages of decreasing quality over time (Garmaise, 2009). Such borrowers, then, are likely to appear in the dataset.

The arguments just outlined may not convince all readers, some of whom may prefer that we include the omitted variables or run the selection model. We would if we could. We do not have to rely, however, exclusively on arguments. The sensitivity analysis to follow addresses both threats.
For example, it is now commonplace to understand selection bias as an omitted variables problem, where the inverse Mills ratio is the variable missing from the second-stage equation. If either the missing inverse Mills ratio or any other omitted variable would change our results, evidence will show up in the sensitivity analysis.

6. Sensitivity Analysis

The collective results from our analysis suggest that being a minority borrower means paying on average somewhere between 40% and 60% of a point in overages more than white borrowers. The mortgage brokers in our sample, however, are privy to information not contained in our dataset, and omitted variable bias remains a possibility. The goal of our sensitivity analysis is to understand how much omitted variable bias is required to nullify the effect of being Hispanic or black on overages. Formal sensitivity analysis goes back to Cornfield et al. (1959) and was developed by Rosenbaum and Rubin (1983a) and Rosenbaum (1988). The method we use comes from Oster (2017), who builds on Altonji et al. (2005). Consider a regression model   \[Y=\beta X+W_{1} +W_{2}+\epsilon,\] where $$X$$ is the scalar treatment, $$W_{1}=\Gamma\omega$$, where $$\omega$$ is a vector of observed controls, and $$W_{2}$$ is unobserved. Let $$R_{\max}$$ be the $$R^{2}$$ from a hypothetical regression that includes the omitted variable(s), $$W_{2}$$. The proportional selection relationship is33  \[ \delta\frac{\mbox{cov}(W_{1},X)}{\mbox{var}(W_{1})}=\frac{\mbox{cov}(W_{2},X)}{\mbox{var}(W_{2})}.\] Oster (2017, pp. 9–10) notes two types of robustness claims that we can make using her method. The first is to assume a value for $$R_{\max}$$ and then calculate the value of $$\delta$$ for which $$\beta=0$$.
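To make the roles of $$\delta$$ and $$R_{\max}$$ concrete, the sketch below uses the simplified linear approximation to the bias-adjusted coefficient that accompanies Oster's method: the controlled estimate is shifted by $$\delta$$ times the movement from the uncontrolled to the controlled estimate, scaled by the ratio of the remaining $$R^{2}$$ gap to the observed $$R^{2}$$ gain. This is an illustration only. The exact estimator is more involved, so numbers produced by the approximation should be expected to land near, not on, the values reported in Table 6, and the approximate $$\delta$$ in particular can differ noticeably. The inputs are the baseline and controlled estimates and $$R^{2}$$ values from Table 6; the function names are hypothetical.

```python
def bias_adjusted_beta(beta_base, r2_base, beta_ctrl, r2_ctrl, r_max, delta=1.0):
    """Linear approximation to the bias-adjusted treatment coefficient.

    beta_base, r2_base: estimate and R^2 without controls
    beta_ctrl, r2_ctrl: estimate and R^2 with observed controls
    r_max: assumed R^2 if the unobservables were also included
    delta: relative importance of unobservables versus observables
    """
    movement = beta_base - beta_ctrl
    return beta_ctrl - delta * movement * (r_max - r2_ctrl) / (r2_ctrl - r2_base)

def approx_delta_for_zero(beta_base, r2_base, beta_ctrl, r2_ctrl, r_max):
    """Value of delta at which the linear approximation above equals zero."""
    return beta_ctrl * (r2_ctrl - r2_base) / ((beta_base - beta_ctrl) * (r_max - r2_ctrl))

R_MAX = 0.125  # value used in Table 6

# (baseline effect, baseline R^2, controlled effect, controlled R^2) from Table 6
table6 = {
    "Black": (0.656, 0.019, 0.432, 0.095),
    "Hispanic": (0.692, 0.021, 0.514, 0.095),
}

for name, (b0, r0, b1, r1) in table6.items():
    lower = bias_adjusted_beta(b0, r0, b1, r1, R_MAX, delta=1.0)
    d = approx_delta_for_zero(b0, r0, b1, r1, R_MAX)
    print(f"{name}: approximate set [{lower:.3f}, {b1:.3f}], approximate delta {d:.2f}")
```

With $$\delta=1$$ the approximation gives lower bounds of roughly 0.34 for black and 0.44 for Hispanic borrowers, neither of which is close to zero.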
Oster argues that a value of “$$\delta=2$$, for example, would suggest that the unobservables would need to be twice as important as the observables to produce a treatment effect of zero.” The second approach is to use bounds on $$R_{\max}$$ and $$\delta$$ to develop a set of bounds for $$\beta$$, and then consider whether zero or some other value of interest falls in the bounds. We report results from both approaches in Table 6.

Table 6. Selection on Unobservables ($$R_{\max}=0.125$$)

| Treatment variables | Baseline effect [$$R^{2}$$] | Controlled effect [$$R^{2}$$] | Identified set, $$R_{\max}=0.125$$ | $$\delta$$ for $$\beta=0$$ given $$R_{\max}$$ |
|---|---|---|---|---|
| Black | 0.656 [0.019] | 0.432 [0.095] | [0.346, 0.432] | 4.053 |
| Hispanic | 0.692 [0.021] | 0.514 [0.095] | [0.444, 0.514] | 5.370 |

Note: This table shows the sensitivity results for the analysis of the impact of minority status on overages. Controlled effects include government-type loan and credit scores.

Column 1 of Table 6 lists our two treatment variables, black and Hispanic. Column 2 lists what Oster refers to as the uncontrolled estimates along with the $$R^{2}$$s associated with the regressions. Column 3 lists the controlled estimates along with their associated $$R^{2}$$s. Column 4 lists the identified sets, which are bounded by the controlled effect and the bias-adjusted effect based on $$R_{\max}$$ given in the table and $$\delta=1$$. Column 5 lists the values of $$\delta$$ that would drive the respective effects to 0 given $$R_{\max}$$. Following Oster (2017, p. 3), we set $$R_{\max}=0.125$$, which is 1.3 times the observed $$R^{2}$$ from a regression of overage on the full set of covariates as in column 2 of Table 4.34 Neither identified set includes 0, and both sets demonstrate a remarkable consistency with our previous results. The analysis indicates that being Hispanic means paying on average somewhere between 44% and 51% of a point in overages more than white borrowers. For blacks, this value is between 35% and 43% of a point on average. Turning to $$\delta$$ under the assumption that $$R_{\max}=0.125$$, the analysis indicates that for the Hispanic estimate, the unobservables would have to be 5 times more important than the observed variables for the effect to go to 0. For the coefficient on black, the unobservables would have to be 4 times more important than the observed variables. Altonji et al. (2005) suggest that $$\delta=1$$, which means that the observables are at least as important as the unobservables, may be an appropriate cutoff.

7. Discussion

One means of interpreting these findings is through the lens of the law. There are two ways that a lending institution can run afoul of antidiscrimination laws: disparate treatment and disparate impact.
The Federal Deposit Insurance Corporation’s Side by Side: A Guide to Fair Lending defines disparate treatment as a lender treating an applicant differently based on a prohibited category such as age, sex, race, or religion. Disparate impact occurs when a policy or practice applied equally to all applicants has a disproportionate adverse impact on applicants in a protected group. The difference lies in motive: a disparate treatment finding means that lenders discriminated intentionally, and disparate impact can be the result of purely neutral policies. Whether our results show disparate treatment or disparate impact depends on how convincing the reader finds the sensitivity analysis. If the reader is reasonably sure that no omitted variable exists that would significantly alter our results, then we have shown disparate treatment. We recognize that even readers who find our results troubling might hesitate before arriving at that conclusion.35 We note, however, that in January 2017, J.P. Morgan Chase agreed to pay $\$$55 million to settle a federal lawsuit just hours after it was filed (White, 2017). Government prosecutors claimed that the bank is responsible for mortgage brokers who charged Hispanic borrowers an additional $\$$968 and charged black borrowers an additional $\$$1,126 over the first 5 years of the loan. J.P. Morgan Chase argued that the brokers were independent subcontractors and admitted no wrongdoing. If the reader does not find the sensitivity analysis sufficiently convincing, then we cannot claim to have shown disparate treatment because we will not have demonstrated that minority borrowers paid larger overages as a result of intentional racial discrimination. We can, however, make the case that the brokerage was guilty of disparate impact, which refers to the impact, intended or not, of a facially neutral policy that has the effect of discriminating. 
Often used in cases of suspected unconscious bias, a disparate impact case would hinge on whether there is a legitimate business practice that necessitates the use of the discriminatory policy. It is difficult to imagine what the legitimate business practice would be in the case of brokers charging overages that do not adjust for risk. The fact that some lenders have done away with overages suggests that the practice is not integral to the mortgage lending business.

8. Conclusion

Race-based discrimination in mortgage pricing is of concern to both scholars and policymakers. Finding clear evidence, however, has proven problematic due to omitted variable bias. Empirical results indicating discrimination are routinely met with claims that the findings could be compromised by unmeasured confounders. The resulting ambiguity has hampered, though not completely derailed, attempts at effecting change. Certainly, the fear of omitted variable bias has haunted the study of mortgage discrimination. We mitigate the issues raised by potential confounders by considering mortgage pricing by brokers (overages) and employing a formal sensitivity analysis. In doing so, we show that credit scores and, by extension, default risk, are unrelated to overages charged by brokers. Conversely, we show that being black or Hispanic is strongly associated with positive overages. In short, we find the kind of evidence of discrimination that many scholars believe exists, but have difficulty in demonstrating unequivocally. Our analysis covers one lender during one quarter of 2000, but the results are nonetheless troubling. Furthermore, the J.P. Morgan Chase settlement suggests that the brokerage our data came from is not unique. Although Bank of America banned the use of overages, there remain calls, out of concern for continued racial discrimination, for mortgage brokering that is not incentivized by selling overages (van den Brand, 2015).
We hope that our results contribute meaningfully to the discussion.

We thank Michael Peress, Jake Bowers, Anderson Frey, Catherine J. Carroll, the editor, and two anonymous reviewers for comments and suggestions. Bradley Smith provided research assistance. Errors are our own.

Appendix

Table A1. A more saturated model including age and interactions with age and credit score. The inclusion of age has no effect on our results.

| | Estimate | Std. Error | $$t$$-value | $$P(>|t|)$$ |
|---|---|---|---|---|
| (Intercept) | 0.0094 | 0.4542 | 0.02 | 0.9835 |
| Black | 0.3767 | 0.1142 | 3.30 | 0.0010 |
| Hispanic | 0.5403 | 0.1194 | 4.52 | 0.0000 |
| Government | 0.7399 | 0.0788 | 9.39 | 0.0000 |
| Agency | 0.1096 | 0.0514 | 2.13 | 0.0330 |
| Age | 0.0135 | 0.0110 | 1.23 | 0.2203 |
| Equifax | −0.0475 | 0.0735 | −0.65 | 0.5182 |
| Experian | −0.0287 | 0.0669 | −0.43 | 0.6673 |
| TRW | 0.0934 | 0.0764 | 1.22 | 0.2216 |
| Age*Exp | 0.0008 | 0.0017 | 0.46 | 0.6458 |
| Age*Equi | 0.0007 | 0.0019 | 0.39 | 0.6979 |
| Age*TRW | −0.0029 | 0.0019 | −1.49 | 0.1368 |
| $$N$$ | 1623 | | | |
| SEE | 0.836 | | | |
| $$R^{2}$$ | 0.109 | | | |

Table A2. A semiparametric model using the same specification as the linear models in the main part of the article.

| Linear | Estimate | SE | $$t$$-value | $$P(>|t|)$$ |
|---|---|---|---|---|
| (Intercept) | 1.7430 | 0.7297 | 2.39 | 0.0170 |
| Black | 0.2695 | 0.1233 | 2.19 | 0.0289 |
| Hispanic | 0.4497 | 0.1194 | 3.77 | 0.0002 |
| Government | 0.5732 | 0.0747 | 7.67 | 0.0000 |

| Nonlinear (also see plots) | DF | Knots |
|---|---|---|
| f(Equifax) | 1.0 | 34 |
| f(TRW) | 1.0 | 35 |
| f(Experian) | 2.5 | 36 |

Table A3. Balance Statistics from Genetic Matching: Treatment=Hispanic

| Variable | Statistic | Before | After |
|---|---|---|---|
| Age | Mean treatment | 35.148 | 35.148 |
| Age | Mean control | 39.905 | 35.204 |
| Age | Std mean diff | −47.053 | −0.5495 |
| Age | t-test $$P$$-value | 0.002 | 0.920 |
| Government | Mean treatment | 0.4259 | 0.4259 |
| Government | Mean control | 0.1106 | 0.4259 |
| Government | Std mean diff | 63.194 | 0.000 |
| Government | t-test $$P$$-value | 0.0003 | 1.000 |
| Equifax | Mean treatment | 665.04 | 665.04 |
| Equifax | Mean control | 685.59 | 666.19 |
| Equifax | Std mean diff | −26.289 | −1.469 |
| Equifax | t-test $$P$$-value | 0.0706 | 0.6451 |
| TRW | Mean treatment | 617.41 | 617.41 |
| TRW | Mean control | 689.23 | 618.13 |
| TRW | Std mean diff | −38.24 | −0.3855 |
| TRW | t-test $$P$$-value | 0.0072 | 0.5351 |
| Experian | Mean treatment | 667.19 | 667.19 |
| Experian | Mean control | 683.12 | 666.19 |
| Experian | Std mean diff | −24.427 | 1.5334 |
| Experian | t-test $$P$$-value | 0.1016 | 0.7758 |

Notes: The matching improves the balance on all five variables. In each case, the standardized mean difference shrinks substantially in magnitude, and the t-tests indicate that we fail to reject the null hypotheses of no difference after matching.

Footnotes

1. There have also been a number of far smaller recent settlements, such as a $\$$700,000 fine agreed to by the Texas Champion Bank of Alice, TX, USA for discriminating against Latinos in February of 2013, a $\$$33 million settlement by New Jersey’s Hudson City Savings Bank for racial discrimination in September of 2015, and a $\$$10.6 million settlement for Mississippi-based Bancorp South for discrimination against African-Americans and other minorities in June of the following year.

2. Fines do not provide definitive proof of bias as lenders have an incentive to settle even if they are non-discriminators.

3. The U.S. government requires the collection of race data under law; for a discussion, see Taylor (2012).
The data nearly always fall short of the information available to lenders. Researchers studying peer-to-peer lending where some individuals request and others provide loans (notably from the website Prosper.com) have access to far better data. Both Pope and Sydnor (2011) and Ravina (2012) find similar evidence of black borrowers facing some additional costs, but also defaulting more. Regardless, this type of situation is qualitatively different from the choices made by for-profit companies and arguably less important from a public policy perspective.
4. Knowles et al. (2001) argue, for example, that motorists must in equilibrium carry illegal substances with equal probability regardless of race (Anwar and Fang, 2006). The reasoning is that if one subgroup carries contraband more often than another, the police would focus their searches on those motorists. The subgroup would respond by carrying contraband less often.
5. This number would have been much higher in 2000.
6. MLOs are licensed mortgage sales workers who assist customers with loan applications. They have discretion over how they respond to customer inquiries (Hanson et al., 2016, p. 2).
7. California was the only exception in 2000, and none of our data come from California.
8. Other types of fees that might be charged include processing fees or administration fees.
9. Guttentag (2010) draws an analogy with the marketing of carpets in Middle Eastern bazaars. He writes that everyone knows that bargaining is the rule in the bazaar, but few borrowers realize that they are in a mortgage bazaar.
10. Bank of America banned the practice in 2010 (Guttentag, 2010).
11. Brokers were required by law to disclose the fee in the Good Faith Estimate, but it was difficult for the borrower to discover it. The yield spread premium was only one of its names; it was also called the service release fee, par-plus pricing, or the rate participation fee. The yield spread premium was banned by the Federal Reserve in 2011.
12. We were unable to obtain the data from either study. The Black et al. (2003) data are proprietary, and Courchane and Nickerson (1997) were unable to locate their data.
13. For Bank A, overage is regressed on minority, type of loan, loan purpose, year, loan amount, loan to value, and an intercept. For Bank B, the authors provide both logit (overage or not) and OLS results. The covariates are loan amount, income amount, interest rate, loan type, loan purpose, occupancy, gender, Hispanic, black, and a regional dummy.
14. Black et al. (2003) present results from a Heckman-style selection model. Stage 1 is whether an overage is charged, and Stage 2 is the overage amount, if charged. The model is weakly identified by omitting whether the loan type is conventional from Stage 2 on the basis of insignificance. They control for thirty-three covariates including their measure of bargaining skill, which is the mean overage the loan officer extracts from white applicants.
15. We have 2,129 loans in a 3-month period. Black et al. (2003) consider loans “made during 1996 for a major mortgage lending institution at its loan offices nationwide” (p. 1143). Their dataset comprises 2,002 loans for the entire year. Courchane and Nickerson (1997) have data from three institutions. The first, Huntington Mortgage Company, had 788 loans in a 6-month period. The second, Bank A, averaged 3,700 loans a quarter, and the third, Bank B, averaged 2,750 loans a quarter. Our sample size therefore is not particularly low. We also note that the mortgage market is seasonal (Zillow.com data indicate that more homes are sold during the Spring months), and that our data come from the Fall quarter.
16. In our sample, 36.4% have an overage of zero, and 9.4% have an underage.
17. A $$\chi^{2}$$-test returns a $$P$$-value of 0, indicating that loan type and overage may not be independent of one another. The other loan type categories behave as expected. Only 28.4% of second loans have overages, as second mortgages are taken out by those who have been through the process at least once before.
18. While it is clear that loan type plays a significant role, it is unclear whether product type is a pre- or post-treatment variable. The “treatment” here is the broker observing the minority status (yes or no) of the borrower. Whereas variables such as age and credit scores are fixed prior to treatment, product type is most likely not fixed before the “treatment” occurs. In the analyses to follow, we deal with loan type both ways. We include product type in regressions, but we do not match on it. All analyses, however, have been done both ways, and the presence or absence of product type among the control variables does not change any of our conclusions.
19. These are the rules that were in place as of 2002. Some rules may have changed in the ensuing years.
20. We also estimated specifications that included interaction terms (see the Appendix). No estimates changed direction or significance.
21. The credit scores have been divided by 100 to increase interpretability.
22. The correlations between the three credit scores are roughly 0.87, which is generally not high enough to produce serious multicollinearity with this many observations. The condition number of the full design matrix is 1, well below problematic levels (Belsley et al., 1980).
23. There are forty-six branches, so clustering by branch does not generate problems associated with a small number of clusters.
24. Conditioning on conventional loans made no difference to the analysis, and the variable itself was nonsignificant in every specification.
25. $$E[\mbox{Overage}|\boldsymbol{x}]=g_{1}(\mbox{Black})+g_{2}(\mbox{Hispanic})+g_{3}(\mbox{Government})+g_{4}(\mbox{Equifax})+g_{5}(\mbox{TRW})+g_{6}(\mbox{Experian})$$.
26. We also ran a suite of semiparametric models. See the Appendix for an example. Our results remain unchanged.
27. There is no reason to prefer one method to another.
28. Minority is used as a treatment to increase the number of treated observations.
29. Including the possibly post-treatment variable government-type loan makes no substantive difference.
30. Table A3 in the Appendix contains balance statistics from the genetic matching using Hispanic as the treatment. (Results from using black or minority as treatments are similar.) The matching improves the balance on all five variables. In each case, the standard mean difference decreases significantly, and the $$t$$-tests indicate that we fail to reject the null hypotheses of no difference after matching. These results suggest that the matching is credible.
31. A 2003 report by the Federal Reserve based on a large, nationally representative sample found that credit scores are predictive of credit risk for the population as a whole and for all major demographic groups. That is, the higher the credit score, the lower the observed incidence of default (Board of Governors of the Federal Reserve System, 2007).
32. One of the claims made by mortgage brokers is that they do the “shopping around” for the borrower.
33. “Omitted variable bias is proportional to coefficient movements, but only if such movements are scaled by movements in $$R$$-squared” (Oster, 2017, p. 3).
34. Oster suggests 1.3 because it is the value at which 90% of results from randomized trials are robust.
35. We attempted to compare black and Hispanic brokers to white brokers directly. Following Anwar and Fang (2006), who compare the behavior of police officers of different races, if white and minority brokers behave in similar ways, then they are most likely profit-maximizing. We used the list of surnames occurring more than 100 times available from Census.gov to identify the race of each broker in our sample, and we searched Facebook and LinkedIn for online photographic evidence.
Unfortunately for our analysis, however, there are simply too few minority brokers making too few loans to minority borrowers in the year 2000 for justifiable statistical analysis.

References
Altonji, J. G., Elder, T. E. and Taber, C. R. 2005. “An evaluation of instrumental variable strategies for estimating the effects of Catholic schooling,” 40 Journal of Human Resources 791–821.
Anwar, S. and Fang, H. 2006. “An alternative test of racial prejudice in motor vehicle searches: Theory and evidence,” 96 American Economic Review 127–51.
Ayres, I. 2001. Pervasive Prejudice? Unconventional Evidence of Race and Gender Discrimination. Chicago: The University of Chicago Press.
Becker, G. S. 1993. “The Evidence against Banks Does Not Prove Bias,” Business Week. April 19, 1993.
Belsley, D. A., Kuh, E. and Welsch, R. E. 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Hoboken, NJ: John Wiley and Sons.
Black, H. A., Boehm, T. P. and DeGennaro, R. P. 2003. “Is There Discrimination in Mortgage Pricing? The case of overages,” 27 Journal of Banking & Finance 1139–65.
Board of Governors of the Federal Reserve System. 2007. Report to the congress on credit scoring and its effects on the availability and affordability of credit. Technical report, Federal Reserve System.
Consumer Financial Protection Bureau. 2015. Consumers’ mortgage shopping experience: A first look at results from the National Survey of Mortgage Borrowers. Technical report, Consumer Financial Protection Bureau.
Cornfield, J., Haenszel, W., Hammond, E. C., Lilienfeld, A. M., Shimkin, M. B. and Wynder, E. L. 1959. “Smoking and Lung Cancer: Recent Evidence and a Discussion of Some Questions,” 22 Journal of the National Cancer Institute 173–203.
Courchane, M. and Nickerson, D. 1997. “Discrimination Resulting from Overage Practices,” 11 Journal of Financial Services Research 133–51.
Garmaise, M. J. 2009. After the honeymoon: Relationship dynamics between mortgage brokers and banks. Anderson School of Business, UCLA. Available at http://personal.anderson.ucla.edu/mark.garmaise/relationship.pdf.
Garrison, T. 2014. “First-time Homebuyers Still Face Challenge with Mortgage Process,” HousingWire. November 13, 2014.
Glaser, J. 2014. Suspect Race: Causes and Consequences of Racial Profiling. New York: Oxford University Press.
Goenner, C. F. 2010. “Discrimination and Mortgage Lending in Boston: The Effects of Model Uncertainty,” 40 Journal of Real Estate Finance and Economics 260–85.
Guttentag, J. 2010. “Why Mortgage Lenders Charge Overages, and Why They May Stop,” The Washington Post. March 27, 2010.
Han, S. 2011. “Creditor Learning and Discrimination in Lending,” 40 Journal of Financial Services Research 1–27.
Hanson, A., Hawley, Z., Martin, H. and Liu, B. 2016. “Discrimination in mortgage lending: Evidence from a correspondence experiment,” 92 Journal of Urban Economics 48–65.
Harney, K. R. 1993. “Your Mortgage: Loan ‘Overage’ Charges Come under Fire,” Los Angeles Times. August 15, 1993.
Heckman, J. and Siegelman, P. 1993. “The Urban Institute Audit Studies: Their Methods and Findings,” in Fix, M. and Struyk, R. J. eds., Clear and Convincing Evidence: Measurement of Discrimination in America, pp. 187–258. Washington, DC: The Urban Institute Press.
Heckman, J. J. 1998. “Detecting discrimination,” 12 The Journal of Economic Perspectives 101–16.
JCHS 2008. The state of the nation’s housing 2008. Technical report, Joint Center for Housing Studies of Harvard University.
Kau, J. B., Keenan, D. C. and Munneke, H. J. 2012. “Racial Discrimination and Mortgage Lending,” 45 Journal of Real Estate Finance and Economics 289–304.
Knowles, J., Persico, N. and Todd, P. 2001. “Racial Bias in Motor Vehicle Searches: Theory and Evidence,” 109 Journal of Political Economy 203–29.
Koenker, R. 2005. Quantile Regression. New York, NY: Cambridge University Press.
Munnell, A. H., Tootell, G. M. B., Browne, L. E. and McEneaney, J. 1996. “Mortgage Lending in Boston: Interpreting HMDA Data,” 86 American Economic Review 25–53.
National Mortgage Database. 2017. A profile of 2013 mortgage borrowers: Statistics from the National Survey of Mortgage Originations. Technical report, National Mortgage Database.
Neumark, D. 2012. “Detecting Discrimination in Audit and Correspondence Studies,” 47 The Journal of Human Resources 1128–57.
Oster, E. 2017. “Unobservable Selection and Coefficient Stability: Theory and Evidence,” 0 Journal of Business and Economic Statistics 1–18.
Pope, D. G. and Sydnor, J. R. 2011. “What’s in a Picture? Evidence of Discrimination from Prosper.com,” 46 Journal of Human Resources 53–92.
Prevost, L. 2015. “Consumer Protection for Mortgage Seekers,” The New York Times. January 23, 2015.
Ravina, E. 2012. Love & loans: The effect of beauty and personal characteristics in credit markets. Columbia Business School. Available at SSRN: https://ssrn.com/abstract=1107307.
Rosenbaum, P. and Rubin, D. 1983a. “Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome,” 45 Journal of the Royal Statistical Society, Series B 212–18.
Rosenbaum, P. R. 1988. “Sensitivity Analysis for Matching with Multiple Controls,” 75 Biometrika 577–81.
Rosenbaum, P. R. and Rubin, D. B. 1983b. “The Central Role of the Propensity Score in Observational Studies for Causal Effects,” 70 Biometrika 41–55.
Ross, S. L. and Yinger, J. 2002. The Color of Credit: Mortgage Discrimination, Research Methodology, and Fair-Lending Enforcement. Cambridge, MA: The MIT Press.
Sekhon, J. S. 2011. “Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R,” 42 Journal of Statistical Software 1–52.
Steverman, B. and Bogoslaw, D. 2008. “The Financial Crisis Blame Game,” Bloomberg Business. October 18, 2008.
Taylor, W. 2011-2012. “Proving Racial Discrimination and Monitoring Fair Labor Compliance: The Missing Data Problem in Nonmortgage Credit,” 31 Review of Banking and Financial Law 199–264.
van den Brand, J. 2015. “Freedom at Last from Discriminatory Lending: A Fairer Future with Technology,” Huffington Post. June 8, 2015.
White, G. B. 2017. “J.P. Morgan Chase’s $\$$55 Million Discrimination Settlement,” The Atlantic. January 18, 2017.

Published by Oxford University Press on behalf of American Law and Economics Review 2017. This work is written by a US Government employee and is in the public domain in the US. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

American Law and Economics Review, Oxford University Press

Publisher
Oxford University Press
Copyright
Published by Oxford University Press on behalf of American Law and Economics Review 2017. This work is written by US Government employee and is in the public domain in the US.
ISSN
1465-7252
eISSN
1465-7260
DOI
10.1093/aler/ahx021
Publisher site
See Article on Publisher Site

Abstract

The putative existence of race-based discrimination in mortgage pricing is both a scholarly and societal concern. Efforts to assess discrimination empirically, however, are typically plagued by omitted variables, which leave any evidence of discrimination open to interpretation. We take a two-pronged approach to the problem. First, we analyze a dataset comprising discretionary mortgage fees collected by brokers working for a brokerage company. Mortgage brokers are intermediaries between lenders and borrowers; they neither approve loans nor share in the risk of default. Variables that measure risk should therefore have no effect on these discretionary fees, and indeed, we show that default risk as measured by credit scores has no effect on discretionary pricing. Second, we perform a formal sensitivity analysis that quantifies the impact of potentially omitted variables. Our results suggest that minority borrowers pay more on average for mortgages than non-minorities, and that this effect persists even in the presence of unmeasured confounders.

1. Introduction

In July of 2012, Wells Fargo Bank agreed to a $\$$175 million fine on the grounds that it had discriminated against African-American and Hispanic borrowers between 2004 and 2009 in violation of fair lending laws. The bank, while acquiescing to the penalty, nonetheless denied the government’s charges, claiming that it settled due to the prohibitive litigation costs involved. The bank later asserted that a corporate decision to cease providing mortgages through independent brokers was a completely independent choice. The Wells Fargo payment closely followed a $\$$335 million fine agreed to by Bank of America in late 2011. The Department of Justice complaint in this case alleged that more than 200,000 black and Hispanic borrowers paid more for loans than white borrowers because of race, not borrower risk.1

Despite these troubling charges, statistical evidence for racial discrimination in the mortgage industry is both scarce and of questionable quality.2 Determining the existence of discrimination in mortgage lending would be straightforward if the data comprised identically credit-worthy and sophisticated individuals applying for comparable loans to the same, or randomly assigned, brokers. Instead, scholars must rely on observational data and statistical controls, and the threat from omitted variable bias is ever present. Banks rarely make sensitive borrower information, such as credit scores, available to researchers, and scholars must make do with data that are far from ideal.3 Kau et al. (2012), for example, rely on neighborhood racial and ethnic compositions because they lack actual data on the borrowers themselves. Neither of the existing studies of overages, Black et al. (2003) and Courchane and Nickerson (1997), includes credit scores as a control. Even the well-known dataset collected by the Federal Reserve of Boston lacks clean measures of applicant creditworthiness (for the best-known analysis of these data, see Munnell et al. (1996); for recent reanalyses, see Goenner (2010) and Han (2011)).

We attack the omitted confounders problem by combining a new dataset with a formal sensitivity analysis. The dataset is unusual in two ways. First, the dataset concerns discretionary mortgage pricing by brokers who are intermediaries between lenders and borrowers. The brokers do not approve loans, a decision that is made by the lender, nor do they share in the risk, which is shouldered by the lender. Mortgage brokers are not compensated based on loan performance (whether a borrower repays a loan or defaults), and the fees they charge do not adjust for risk.
In theory, we therefore do not need to control for variables that are associated with default risk and routinely unobserved by researchers, such as the debt-to-loan ratio. That being said, risk adjustment remains a compelling alternative explanation whenever differential loan pricing is observed. We address the issue using the second unusual feature of the dataset: three credit scores for each borrower, one from each of the three major credit bureaus. As noted above, these data are quite rare.

Despite the dataset’s unique nature, there remain variables that we would like to have, but do not have. Prominent among these are loan size, the financial sophistication of the borrower, and the borrower’s ability to negotiate. Our response is two-fold. First, we present evidence that borrowers rarely shop around for mortgages, a fact which severely limits the ability of a borrower to negotiate with her broker. Second, we employ a sophisticated sensitivity analysis that allows us to assess whether any additional unobserved variables exist that could change our results significantly. The results of the sensitivity analysis suggest that our findings are not sensitive to omitted confounders.

Our empirical strategy unfolds in three stages. First, we use regression (overage is a continuous variable) to show that minority status is associated on average with larger overages, while credit scores (a measure of default risk) are unrelated to overages. Second, we report a number of robustness checks and demonstrate that our findings are consistent across many different specifications. These alternative specifications include quantile regression to address outliers, generalized additive models to check for nonlinearities, and matching estimators to assess model dependence. Third, we introduce and describe the results from the sensitivity analysis. Our conclusion is that a nontrivial amount of race-based discrimination exists in these data.

The article proceeds as follows. In Section 2, we describe the problem of unobserved heterogeneity in discrimination research and discuss some recent attempts at solutions. Section 3 defines the roles of mortgage brokers, introduces the concept of an overage, and reviews past research. Section 4 describes our dataset. The empirical analysis is discussed in Section 5, and the sensitivity analysis in Section 6. Section 7 concludes with a brief discussion of discrimination law, and how the law relates to our results.

2. Discrimination and Unobserved Heterogeneity

Omitted variables bedevil the study of discrimination in mortgage lending. As Ross and Yinger (2002, p. 108) note, the Boston Federal Reserve Study has more control variables than any previous study (an additional thirty-eight), but was still widely attacked for leaving out important explanatory variables. Researchers studying discrimination in other areas experience similar problems and have developed two ways of dealing with missing variables.

The first is known as audit or matched pair testing, where two volunteers (or confederates) of different races attempt to get help at a retail outlet, obtain a loan, or be hired for a job. The key to this design is that the two volunteers must be alike in every way except race (or gender) to the store clerk, mortgage broker, or employer. Any difference in service or outcome may then be attributed to discrimination. Heckman (1998) criticizes audit tests for fragility in the face of assumptions regarding unobservable variables. No two people are alike in every discernible way except for race (or gender). Researchers have attempted to overcome this flaw by using correspondence tests, where the employer, for example, does not observe the matched applicants (Neumark, 2012). Heckman and Siegelman (1993), however, argue that discrimination is unidentified even in correspondence tests when the variance of unobserved productivity differs across groups.
The experimental approach is necessarily limited in scope even if this problem could be addressed (Neumark (2012) proposes a test).

A second method widely used by researchers is the outcome test first suggested by Becker (1993). Outcome tests consider the difference in average outcomes between two groups that have had a common experience (Glaser, 2014). If the difference is significant, we can infer that the two groups have been treated unequally. In Becker’s original formulation, if minorities default on loans at lower rates than non-minorities, there is evidence that minorities faced more stringent requirements than non-minorities to secure a loan (Glaser, 2014). Outcome tests are not susceptible to omitted variables bias because the variables that are unobserved by the researcher are observed by the loan officer or police officer, and under the hypothesis of no discrimination, are taken into account.

Ayres (2001) criticizes outcome tests for what he terms the inframarginality problem. Discrimination is most likely to occur in marginal cases; well-qualified borrowers receive loans, and nonqualified borrowers do not (Glaser, 2014). In the mortgage context, marginal default rates are unobservable, and comparing averages confuses these marginal cases with the nonmarginal (inframarginal) cases (Ayres, 2001, p. 408). An exception exists when we expect an equilibrium to hold under the null hypothesis of no discrimination. In that case, no difference exists between marginal and nonmarginal cases.4 An outcome test applied to mortgage decisions would suffer from the inframarginality problem as there is no reason to assume equilibrium behavior. We do not, for example, expect to observe mortgage lenders targeting minorities to the exclusion of nonminorities with the expectation of higher profits.

Neither approach discussed above has been widely used to study mortgage lending discrimination.
Correspondence studies require online transactions, and nearly 50% of first-time homebuyers in 2014 met with a loan representative in person (Garrison, 2014).5 To our knowledge, the only correspondence study relating to mortgages is Hanson et al. (2016), who look at mortgage loan originators (MLOs).6 They find an effect of being African-American on MLO response roughly equivalent to a credit score that is 50 points lower. MLOs also offer more details to whites and use friendlier language in email correspondence. These results are roughly consistent with our findings.

Our data dictate our approach, which concerns fees charged by mortgage brokers who are not involved in the decision to approve or deny loans. In addition, they do not share in the risk involved in making loans. We therefore do not need to control for many of the variables associated with loan approval and default risk. On the other hand, we do need to control for variables related to loan size and bargaining. As we argue in the following sections, we have proxies for some of these variables, and we employ a sensitivity analysis to address the others.

3. The Nature of Overages

Mortgage brokers are intermediaries between lenders, usually banks, and borrowers. They neither underwrite loans nor approve loans nor share in any risk associated with the loans they originate. Brokers act as liaisons between borrowers and lenders. Brokers compile documentation (employment, income, assets, credit reports), help the borrower to settle on a loan amount and loan type, and then submit the loan for approval to a lender with whom the broker works. The broker knows which loans to apply for because he or she has the lender’s, or lenders’, rate sheet for the day that lists the various loan programs offered by the lender, the interest rates for those programs, the corresponding rebates (a.k.a. yield spread premium; see below), as well as price adjustments for loan amount, loan-to-value percentage tiers, credit score, geographic location, and lock extensions (Day 7, 10, 15, etc.). Mortgage brokers in 2000 had no fiduciary responsibility to act in the best interest of their clients.7

Compensation for brokers originating loans generally comes in two ways. First, brokers typically charge an origination fee, which is generally paid by the borrower and usually amounts to 1–2% of the loan amount.8 The second major source of compensation for a broker is the overage. An overage is the extra amount that a borrower may pay beyond the posted price of a mortgage (the combination of interest rate and points). Thus, we can think of an overage as the total loan points minus any fees (such as origination fees) minus the posted rate. Mortgage brokers generally share in the monies from overages, suggesting that in an efficient market we would see no discrimination based on race as brokers have an incentive to extract the maximum from all concerned. However, brokers typically have a great deal of flexibility with respect to setting overages. Furthermore, most borrowers are unaware of the existence of overages (Black et al., 2003, p. 1141).9 Discretionary mortgage pricing is therefore ripe for potential discrimination.

To understand how overages work, consider a situation where the posted price (the price on the rate sheet given by the lender to the broker) on a particular loan is 6% and 0 points. Now imagine that the broker induces the borrower (who is not shown the rate sheet provided by the lender) to pay 6% and 1 point. That point is an overage. On a loan of $\$$300,000, a 1 point overage is worth $\$$3,000. Typically, brokers pocketed half of the money generated by the overage.10 Thus, the broker could increase his or her commission by $\$$1,500.

Brokers can also collect overages through the use of rebates, which are also known as the yield spread premium.
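As a rough illustration of the point arithmetic in these examples, the following Python sketch computes an overage as the difference between the points a borrower is charged and the points posted on the rate sheet at the same interest rate, then converts it to dollars. The names `Quote`, `overage_points`, `overage_dollars`, and `broker_share` are hypothetical and introduced only for this illustration; the 50% broker share is an assumption taken from the "typically half" statement above, not a fixed rule. The same subtraction covers rebate pricing, where the posted points are negative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """A price as it appears on a rate sheet: interest rate plus points."""
    rate_pct: float  # note rate, in percent
    points: float    # points charged; negative values are lender rebates


def overage_points(posted: Quote, charged: Quote) -> float:
    """Overage in points: charged points minus posted points at the same rate.

    A negative result corresponds to an underage.
    """
    if posted.rate_pct != charged.rate_pct:
        raise ValueError("compare quotes at the same interest rate")
    return charged.points - posted.points


def overage_dollars(loan_amount: float, points: float) -> float:
    """Convert points to dollars; one point is 1% of the loan amount."""
    return loan_amount * points / 100.0


def broker_share(dollars: float, share: float = 0.5) -> float:
    """Broker's cut of the overage (illustrative 50% default)."""
    return dollars * share


if __name__ == "__main__":
    # Up-front case from the text: posted 6% and 0 points, charged 6% and 1 point.
    upfront = overage_points(Quote(6.0, 0.0), Quote(6.0, 1.0))
    dollars = overage_dollars(300_000, upfront)
    print(upfront, dollars, broker_share(dollars))  # 1.0 3000.0 1500.0

    # Rebate case: posted 6.75% with -1.75 points, borrower receives -0.75 points.
    rebate = overage_points(Quote(6.75, -1.75), Quote(6.75, -0.75))
    print(rebate)  # 1.0
```

In both cases the overage is one point; the difference lies only in whether the borrower pays it up front or through a higher interest rate.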
The rate sheet mentioned in the last paragraph might have on it the 6% loan with 0 points and a 6.75% loan with $$-$$1.75 points. That means that the lender will pay 1.75 points on a 6.75% loan. Now imagine that the broker gets the borrower to take the 6.75% loan with $$-$$0.75 points; the remaining point becomes the overage. Brokers like this approach because borrowers do not have to pay the fee up front, but they pay for it in higher interest rates.11

Not all borrowers pay positive overages. Some pay no overage, and others pay negative overages (called underages). Underages are usually the result of competition between lending institutions over a particular type of loan or a particularly desirable customer. Financially sophisticated borrowers can rate shop among competing lenders and drive down the price of their mortgage (Harney, 1993). Underages are rare because shopping and negotiating for mortgages are rare (we present evidence in Section 5), and our dependent variable is correspondingly skewed to the right.

Only two major studies of overages exist.12 Courchane and Nickerson (1997) analyze data from three banks following an investigation by the Office of the Comptroller of the Currency. The loans were made between 1992 and 1994. They provide regression results for two of the banks, Bank A and Bank B, and although small differences are found, they conclude that the banks were profit-maximizing, but not discriminatory.13 Black et al. (2003) study 1996 data from a major bank. They find no evidence of discrimination once they control for other differences in the borrower pool, such as bargaining ability.14

Our data and analysis differ significantly from these previous studies. Courchane and Nickerson (1997) and Black et al. (2003) consider data obtained from banks in the early to mid-1990s, whereas our data come from a mortgage brokerage company in 2000. The difference is meaningful in two ways.
First, bank loan officers receive a set salary, whereas mortgage brokers earn most of their compensation through commissions. Loan officers and brokers therefore have different incentives. Second, banks making mortgage loans in the mid-1990s were under heightened scrutiny following the 1992 Federal Reserve mortgage lending study. Mortgage brokerages in 2000 were under no such scrutiny, as the subprime mortgage crisis would soon attest. For instance, the bank studied by Black et al. (2003) limited overages to 2%. This restriction explains why only 17.9% of their sample of 2,002 loans have positive overages. In our data (introduced in the next section), 54.5% of the borrowers have positive overages, and nearly 8% of the overages in the sample are larger than 2 (the largest overage is 6; see Table 1).

Table 1. Variable Names, Definitions, and Descriptive Statistics

Variable name   Definition                        Min.     Mean     Max.      SD
BRANCH          Branch that originated the loan   NA       NA       NA        NA
OFFICER         Officer that sold the loan        NA       NA       NA        NA
DATE            Date of the transaction           8/1/00   NA       10/31/00  NA
GOVERNMENT      1=FHA loans                       0.000    0.1327   1.000     0.339
OVERAGE         Overage in points                 −5.000   0.4776   6.000     0.856
AGE             Borrower age, in years            19       39.81    80        11.03
BLACK           1=BLACK; 0=otherwise              0.000    0.050    1.000     0.218
HISPANIC        1=HISPANIC; 0=otherwise           0.000    0.058    1.000     0.234
WHITE           1=WHITE; 0=otherwise              0.000    0.850    1.000     0.357
EQUIFAX         Equifax credit score              495      706.9    829       60.30
TRANSUNION      TransUnion credit score           492      710.8    829       60.77
EXPERIAN        Experian credit score             476      708.9    831       60.12

Note: This table lists the major variables used in the analysis, the definitions of those variables, and descriptive statistics where appropriate. NA stands for not applicable.

Courchane and Nickerson’s (1997) sample from Bank A contains over 33,000 loans with 38.7% having positive overages. In their regression, however, the authors appear to have selected on the dependent variable and included only those borrowers with positive overages. The sample from Bank B has a larger percentage of borrowers with overages, but the regression results are questionable as the authors include interest rate as a control variable. The resulting endogeneity bias likely explains their strikingly counterintuitive finding that “minorities are likely to be charged smaller amounts of overage points” (Courchane and Nickerson, 1997, p. 147).

Finally, Black et al. (2003) persuasively argue that credit risk should not explain much variation in mortgage pricing because risk is included in the base interest rate of the loan. Neither of the two previous studies, however, has direct access to credit scores with which to assess the claim. We can show that overage size is orthogonal to credit score.

4. Data and Descriptive Analysis

The data comprise 2,129 observations on mortgage overages (measured in points) assessed by brokers for a large national mortgage company during the second quarter of 2000 (the loans were made between August 1, 2000 and October 31, 2000). The overages in the dataset are from loans actually taken, not just negotiated, and the observations comprise the universe of cases for the quarter. Although the firm had branch offices and brokered loans in all quadrants of the country, the majority of loans in the data were made in and around the Northeast.

Providing adequate information to assess the representativeness of this mortgage brokerage is difficult, as the data were provided to the authors by a principal of the firm on the condition of anonymity for the firm and the brokers. The fact that our data are from a single firm (as are the data in Black et al. (2003)) is concomitant with their uniqueness. In addition, the firm, which no longer exists as an independent entity, bore an instantly recognizable household name. The differences between our data and the data from Black et al. (2003) and Courchane and Nickerson (1997) are due to institution type (banks vs. mortgage broker) and temporal proximity to federal scrutiny (1992).15

Any mortgage data from the 2000s raise questions concerning subprime lending. The major increases in subprime mortgages, however, occurred between 2003 and 2006 (JCHS, 2008, p. 4). Furthermore, through 2001, the ratio of home prices to income had remained steady between 2.9 and 3.1 for two decades. The ratio rose to 4.0 only in 2004, and to 4.6 in 2006 (Steverman and Bogoslaw, 2008). There is little reason to assume that our data are either wildly unrepresentative or the product of the subprime crisis.

We list the major variables used in our analysis, their definitions, and descriptive statistics in Table 1.
The Home Mortgage Disclosure Act (HMDA) does not require mortgage companies to report credit scores for individual borrowers, the branch of the company that sold the loan, or the name of the lending officer. Researchers analyzing lending discrimination commonly make do without such measures. Our dataset, on the other hand, includes not only three credit scores for each borrower, but also the branch office that originated the loan and the name of the broker. These variables allow us to use fixed effects and clustering in the following analyses.

Additionally, we have a suite of variables that mortgage companies must report to the federal government under the HMDA. These variables include the date of application, the race of the borrower (Hispanic, black, and white), and whether the loan is part of a program administered by the Federal Housing Administration. Data on other ethnicities and product types are available, but as we describe below, they collectively comprise an insignificant portion of the observations. Other demographic variables such as age and sex are available, but are not used because of missing data concerns. Age is missing for over 10% of the observations, and sex is missing for nearly 30%. Saturated models run using these variables do not change our conclusions (see the Appendix for results from an example regression including age and interactions of age and credit score).

We describe the data in greater detail below to accomplish three goals. The first is to justify the use of the covariates for which we control. The second is to demonstrate that the positive relationship between overages and minority status is not the result of boosting an effect through selective covariate choice. The third is to give the reader a better feel for the data and the relationships that we highlight.
In our sample, 54.2% of the borrowers have positive overages, which means that over half of the sample paid more for their loan than the price quoted to the broker by the bank.16 Of those with positive overages, the mean overage is 0.962 with an SD of 0.85. White borrowers comprise 85% of the sample, while black and Hispanic borrowers collectively comprise 10.8% of the sample (each about 5%). Asian and South Asian Indians make up less than 4% of the sample.

A cross-tab of overages by race (trichotomized into “under,” “none,” and “over”) is in Table 2. Slightly over half of white borrowers have positive overages (53%), whereas over three-quarters of black (79%) and Hispanic (78%) borrowers have positive overages. A $$\chi^{2}$$-test returns a $$P$$-value of 0, indicating that overage and race may not be independent of one another.

Table 2. Overages by Race

Overage   White   Black   Hispanic   Asian
Under     0.135   0.100   0.000      0.075
None      0.334   0.114   0.225      0.453
Over      0.531   0.786   0.775      0.472

Note: This table comprises a cross-tab of overages (trichotomized into “under,” “none,” and “over”) by race. A $$\chi^{2}$$-test of independence returns a $$P$$-value of 0.

Loan type is related to overages.
The relationship, however, is driven completely by one particular kind of mortgage: those administered under various programs run by the Federal Housing Administration (FHA), which we label as government-type loans. Such loans are easier to qualify for, have lower down-payment requirements, and comprise 13.3% of the sample. Agency or conformable loans meet guidelines set forth by Fannie Mae and Freddie Mac and comprise 65% of the sample. (The dataset includes nine categories of loans such as adjustable rate mortgages, government mortgages, second mortgages, construction mortgages, etc. The categories other than agency and government contain fewer than 100 observations each.) The government category is of interest because 85% of these loans carry positive overages, as opposed to 53% of agency loans.17

Moreover, a strong relationship exists between government-type loans and race. Overall, 47% of blacks and 50% of Hispanics have government-type loans, while only 16% of whites do. We can see these patterns clearly in a three-dimensional table of overage by race by loan type (see Table 3). Overall, 83% of whites with government loans have positive overages compared with 91% of blacks and 100% of Hispanics ($$n=40$$). Overall, 51% of whites with agency loans have positive overages compared with 79% of blacks and 75% of Hispanics.18

Table 3. Overages by Race and Loan Type

Race       Overage   Agency   Government
White      Under     0.160    0.095
           None      0.333    0.079
           Over      0.507    0.826
Black      Under     0.083    0.000
           None      0.125    0.091
           Over      0.792    0.909
Hispanic   Under     0.000    0.000
           None      0.250    0.000
           Over      0.750    1.000

Note: This table comprises a three-way cross-tab of overages by loan type (agency or government) by race. Minorities, on average, pay higher overages than whites for both agency and government-type loans.

The relationship between government-type mortgages and overages and race is not the result of the differences between mortgage types but, rather, the characteristics of borrowers who take out such mortgages. FHA mortgages, for example, are insured by the FHA, generally require lower down payments (which can also take the form of gifts), and a portion of the closing costs can be included in the loan amount.
Furthermore, borrowers pay an insurance premium, as conventional loan borrowers do, but this premium is not canceled when equity grows.19 In the analyses to follow, we condition on government-type loans but, in reality, we use this binary variable as a crude proxy for loan size, a variable to which we do not have access. It is unclear which way loan size cuts: borrowers with more wealth, who take out larger loans, are less likely to use brokers, but are more attractive customers and have more money that can be extracted by brokers in the form of overages. The strong correlation between government-type mortgages and overages, however, suggests that brokers extract greater overages from borrowers seeking smaller loans.

5. Empirical Analysis

Here, we report results from our empirical analysis. We show estimated regression coefficients, evidence of robustness across a variety of specifications, and a discussion of inferential threats including omitted variables and selection issues. Two major findings emerge. The first is that credit scores are unrelated to overage size. This finding is less than surprising as we do not expect risk of default to be related to overages. We expand on this point in the discussion. The second is that race is related to overage size across all specifications.

The results from five different multiple regressions are in Table 4.20 The consistency of the effects across the regressions is remarkable. In all five regressions, the effects of being black, being Hispanic, or having a government loan are positive, substantive, and highly statistically significant. The effects of the three credit scores, on the other hand, are essentially zero and do not even approach statistical significance.21

Table 4. Regression Results

             OLS          OLS          OLS               Fixed
             Robust SEs   Robust SEs   Cluster: branch   by branch   Quantile
Intercept    0.4538       0.4546       0.4546                        1.6430
             (0.050)      (0.050)      (0.097)                       (0.338)
Black        0.4337       0.4285       0.4285            0.4065      0.3512
             (0.143)      (0.143)      (0.116)           (0.109)     (0.148)
Hispanic     0.5143       0.5110       0.5110            0.5085      0.3485
             (0.162)      (0.161)      (0.258)           (0.107)     (0.224)
Government   0.6425       0.6392       0.6392            0.6635      1.0542
             (0.070)      (0.071)      (0.162)           (0.653)     (0.108)
Equifax      −0.0142      −0.0145      −0.0142           −0.0110     0.0099
             (0.018)      (0.022)      (0.021)           (0.018)     (0.075)
TRW                       −0.0093      −0.0083           −0.0079     −0.0923
                          (0.024)      (0.013)           (0.017)     (0.063)
Experian                  0.0103       0.0086            0.0154      0.1291
                          (0.018)      (0.019)           (0.017)     (0.073)
$$N$$        1,905        1,905        1,905             1,905       1,905
Missing      235          235          235               235         235
SEE          0.8268       0.8271       0.8271            0.7985      $$\tau=0.6$$
$$R^{2}$$    0.095        0.096        0.096             0.368

Note: Overage is the dependent variable in each case. The specifications vary by controls and standard errors (SEs) (robust vs. clustered by branch office). The fourth regression includes fixed effects (by branch), and the fifth is quantile regression. Credit scores are divided by 100. SEE is the standard error of the estimate.

Table 5. Matching Estimates

Treatment          Estimate   AI SE    # Treated Obs   Matched # Obs
One-to-one
  Minority         0.3463     0.1273   114             1,655
  Hispanic         0.5596     0.2017   54              1,655
  Black            0.3172     0.1695   60              1,655
Genetic
  Minority         0.3388     0.1304   114             1,655
  Hispanic         0.5444     0.1943   54              1,655
  Black            0.3349     0.1815   60              1,655
Propensity score
  Minority         0.3443     0.1313   114             1,655
  Hispanic         0.4267     0.2076   54              1,655
  Black            0.2648     0.1846   60              1,655

Note: This table comprises treatment effect estimates from matching done three ways (one-to-one, genetic, propensity score) with Abadie–Imbens SEs. The variable minority is black plus Hispanic.

The regressions in Table 4 are all versions of the following basic specification that we use throughout the article:

\begin{eqnarray*} \mbox{Overage}_{i}&=&\beta_{0}+\beta_{1}\mbox{Black}_{i}+\beta_{2}\mbox{Hispanic}_{i}+\beta_{3}\mbox{Government}_{i}\\ && +\beta_{4}\mbox{Equifax}_{i}+\beta_{5}\mbox{TRW}_{i}+\beta_{6}\mbox{Experian}_{i}+\epsilon_{i} \end{eqnarray*}

Columns 1 and 2 of Table 4 contain estimates from ordinary least squares (OLS) linear regressions with heteroskedastic-consistent standard errors (SEs).
The regression in column 1 contains only a single measure of credit ($$\beta_{5}=\beta_{6}=0$$), while column 2 contains three such measures to demonstrate that no multicollinearity problem exists between the three measures.22 Column 3 contains regression estimates with the SEs clustered by branch. The loans in the sample originated from forty-six different branches, with some producing almost 200 loans during the third quarter (Branch 230: 196) and other branches producing far fewer loans. Many of the observations are therefore not independent, and clustered SEs are appropriate.23 While some SEs have increased in size, no estimate has changed in significance. Similarly, we might imagine that some branches, by virtue of being in different areas, might routinely assess overages differently. Column 4 contains the results of a regression with branch fixed effects. The results are substantively indistinguishable from the previous results.

Across the first four regressions, blacks on average pay 40% of a point more than white borrowers. Hispanics pay 50% of a point more, and FHA borrowers pay 60% of a point more.24 In terms of the example given in Section 3, blacks would pay on average $\$$1,200 more, Hispanics $\$$1,500 more, and FHA borrowers $\$$1,800 more. Black borrowers with FHA loans could well have an overage that is a full point larger than that of white borrowers with conventional loans.

Contrast these effects with those of the various credit scores. A 100-point increase in an Equifax credit score is associated with a decrease in overage size of 1–2% of a point. In terms of our example, a borrower with a credit score of 700 would pay on average $\$$30–60 less than a borrower with a credit score of 600. The estimated coefficient on Experian credit scores is positive but, as noted earlier, the SEs are relatively large, indicating that these effects are just bouncing around zero. The risk of default is unrelated to overage size.
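The dollar figures above follow from one piece of arithmetic: a point is 1% of the loan amount. The sketch below is a minimal illustration, assuming the $\$$300,000 loan and the half-share broker split from the Section 3 example and the rounded effects quoted in the text. The function and variable names are illustrative and are not taken from the authors' analysis.

```python
# Convert overages measured in points into dollars.
# Assumption: one point equals 1% of the loan amount, as in the Section 3 example.

LOAN_AMOUNT = 300_000   # example loan from Section 3
BROKER_SHARE = 0.5      # brokers typically kept half of the overage (per the text)


def points_to_dollars(points, loan_amount=LOAN_AMOUNT):
    """Dollar value of an overage (or a regression effect) expressed in points."""
    return points / 100 * loan_amount


# Section 3 example: a 1-point overage and the broker's share of it.
overage_dollars = points_to_dollars(1.0)
broker_cut = BROKER_SHARE * overage_dollars

# Rounded effects quoted in the text (in points), converted to dollars.
rounded_effects = {"Black": 0.40, "Hispanic": 0.50, "FHA": 0.60}
extra_cost = {group: points_to_dollars(p) for group, p in rounded_effects.items()}

# A 100-point rise in a credit score is associated with a decrease
# of roughly 0.01 to 0.02 points in overage size.
credit_score_range = (points_to_dollars(0.01), points_to_dollars(0.02))

print(f"1-point overage: ${overage_dollars:,.0f}; broker share: ${broker_cut:,.0f}")
for group, dollars in extra_cost.items():
    print(f"{group}: about ${dollars:,.0f} more")
print(f"Credit score effect: ${credit_score_range[0]:,.0f} to ${credit_score_range[1]:,.0f}")
```

The same conversion scales linearly with loan size, so the dollar gaps would be proportionally smaller or larger on other loans.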
Regression diagnostics, such as Cook’s D and residual analysis, indicate that influential observations and nonlinearity may pose a threat to our estimates and inferences. We therefore turn to robust estimators and nonlinear estimators. Column 5 of Table 4 contains results from a quantile regression (Koenker, 2005). We ran nine separate quantile regressions with quantiles set from 0.1 to 0.9. Setting the quantile below 0.6 resulted in hundreds of nonpositive fits; setting the quantile at 0.9 reproduces our results nearly identically. We report results with the quantile set at 0.6, which produces only 1 nonpositive fit and is close to the median. The results vary in minor ways from the other regressions, but no estimate changes direction or significance. The estimated coefficients on being Hispanic and being black shrink, and the estimated coefficient on government increases.

We might imagine that credit scores are nonlinearly related to overages. Figure 1 contains the results of a generalized additive model run on the same set of covariates as the previous regressions.25 The three lower plots are graphs of the effects of the various credit scores. None of them appears to be related substantively to overages. In each case, there is very little movement as credit scores increase (note the small range of the $$y$$-axis), and even where the confidence bands are tightest, zero is included. On the other hand, the confidence intervals for black, Hispanic, and government, which can only take the values 0 and 1, do not include zero (the three upper plots). The effect of being black appears to be somewhat smaller than what we saw in the linear regressions, and this result is echoed in one of the matching analyses to follow. The effects of being Hispanic and having a government loan remain unchanged.26

Figure 1. Generalized Additive Model Results. Note: The dependent variable is overages. The effects of the six right-hand side variables are displayed in graphical form with confidence bands (2 SEs) and rug plots. Note that the variables in the first row can only take the values 0 and 1.

As a final robustness check, we assess the effect of being a minority in a less model-dependent manner by performing a matching analysis using three different methods: one-to-one, genetic (Sekhon, 2011), and propensity score (Rosenbaum and Rubin, 1983b).27 We also use three different treatments: being Hispanic, being black, and being a minority (the sum of Hispanic and black).28 For each treatment and method, we match on age and the three credit scores.29 The results from all three estimators are quite similar to each other and remarkably similar to the previous analyses. The effects of being a minority and being Hispanic are both of the expected size and significance. The effect of being black just misses significance at conventional levels.30

5.1. Potentially Omitted Variables

We demonstrated above that credit scores are unrelated to overages. We also demonstrated that being black, being Hispanic, or having a government-backed loan does seem to be related positively to overages. We still need to address two serious threats to our inferences: potentially omitted variables and selection bias.

We would like to have a measure of financial sophistication, as savvy consumers can reduce their chances of paying an overage. Some of the effect of financial sophistication is no doubt captured by credit score.
Financially unsophisticated consumers are unlikely to have sound credit, given that credit scores reflect not only the availability of credit, but also the judicious use of past credit.31 That being said, another aspect of financial sophistication is the willingness and ability to shop around and negotiate. Our results are consistent with a world where whites are more likely to shop around and negotiate than minorities. It appears, however, that few borrowers do either. Information on the Internet has exploded regarding how to navigate the mortgage market in the wake of the subprime crisis. A search of “how to get a mortgage loan” brings up over 10 million hits. A scan of the top results demonstrates that every site includes advice regarding visiting multiple lenders. This information, however, has failed to change the behavior of consumers. Results from the 2014 National Survey of Mortgage Borrowers from the Consumer Financial Protection Bureau indicate that nearly 50% of consumers in 2014 who took out a home purchase mortgage considered only one lender or broker (Consumer Financial Protection Bureau, 2015). Nearly 90% of consumers considered only two lenders or brokers. Furthermore, 77% of consumers applied to only one lender or broker (nearly 100% applied only to two). Perhaps more importantly for our purposes, there are no significant differences between non-Hispanic whites and Hispanics and non-whites when it comes to the number of lenders considered or applied to. In fact, a slightly higher percentage of Hispanics and non-whites applied to two lenders than non-Hispanic whites (National Mortgage Database, 2017). The reasons for this lack of interest in shopping around are not that surprising. Borrowers tend to believe that mortgage lenders, unlike, say, car dealers, treat them well (and these results are from after the mortgage crisis). In a separate survey, Fannie Mae asked borrowers to select reasons for only getting one quote. 
The two main reasons were satisfaction with the first quote, and that they were most comfortable with the lender (Prevost, 2015).32 It strikes us as extremely unlikely that consumers, white or not, behaved differently in 2000 before the mortgage crisis. Nor is there reason to suspect that a gap in mortgage shopping between whites and minorities existed in 2000 that closed by 2013.

Another inferential threat is that selection bias may mask the effect of credit scores, which would in turn diminish the effects of race. Prospective borrowers with relatively low credit scores are likely to have higher interest rates on average. An overage assessed atop an already high interest rate may make the loan unaffordable. If such borrowers select out of the sample (i.e., decide against taking the loan), we may underestimate the effect of credit scores on overages. To put it another way, the distribution of overages for relatively low-credit borrowers would be truncated at the point where the loans become prohibitively expensive.

Fortunately, such selection bias is unlikely to be affecting our estimates. Overages are pure profit for the mortgage broker: it is not in his or her interest to deter potential borrowers with restrictive pricing. The broker should maximize the overage without driving the customer away. After all, it is the mortgage broker’s job to try to match each customer with a lender if at all possible. In fact, evidence exists that brokers push lenders to make riskier loans and that brokers send lenders mortgages of decreasing quality over time (Garmaise, 2009). Such borrowers, then, are likely to appear in the dataset.

The arguments just outlined may not convince all readers, some of whom may prefer that we include the omitted variables or estimate a selection model. We would if we could. We do not have to rely exclusively on arguments, however. The sensitivity analysis to follow addresses both threats.
For example, it is now commonplace to understand selection bias as an omitted variables problem, where the inverse Mills ratio is the variable missing from the second-stage equation. If either the missing inverse Mills ratio or any other omitted variable would change our results, evidence will show up in the sensitivity analysis.

6. Sensitivity Analysis

The collective results from our analysis suggest that being a minority borrower means paying on average somewhere between 40% and 60% of a point more in overages than white borrowers. The mortgage brokers in our sample, however, are privy to information not contained in our dataset, and omitted variable bias remains a possibility. The goal of our sensitivity analysis is to understand how much omitted variable bias is required to nullify the effect of being Hispanic or black on overages.

Formal sensitivity analysis goes back to Cornfield et al. (1959) and was developed by Rosenbaum and Rubin (1983a) and Rosenbaum (1988). The method we use comes from Oster (2017), who builds on Altonji et al. (2005). Consider a regression model

\[Y=\beta X+W_{1}+W_{2}+\epsilon,\]

where $$X$$ is the scalar treatment, $$W_{1}=\Gamma\omega$$ with $$\omega$$ a vector of observed controls, and $$W_{2}$$ is unobserved. Let $$R_{\max}$$ be the $$R^{2}$$ from a hypothetical regression that includes the omitted variable(s), $$W_{2}$$. The proportional selection relationship is33

\[\delta\frac{\mbox{cov}(W_{1},X)}{\mbox{var}(W_{1})}=\frac{\mbox{cov}(W_{2},X)}{\mbox{var}(W_{2})}.\]

Oster (2017, pp. 9–10) notes two types of robustness claims that we can make using her method. The first is to assume a value for $$R_{\max}$$ and then calculate the value of $$\delta$$ for which $$\beta=0$$.
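The first kind of claim can be illustrated with a short calculation. The sketch below is not the estimator behind the results reported in this article. It assumes the simplified approximation that Oster gives for the bias-adjusted coefficient, $$\beta^{*}\approx\tilde{\beta}-\delta(\dot{\beta}-\tilde{\beta})\frac{R_{\max}-\tilde{R}}{\tilde{R}-\dot{R}}$$, where dotted quantities come from the regression without controls and tilded quantities from the regression with controls. The function names and the input values are hypothetical and chosen only for illustration.

```python
def bias_adjusted_beta(beta_dot, r_dot, beta_tilde, r_tilde, r_max, delta=1.0):
    """Approximate bias-adjusted effect under proportional selection.

    beta_dot, r_dot     : coefficient and R-squared without controls
    beta_tilde, r_tilde : coefficient and R-squared with controls
    r_max               : assumed R-squared if the unobservables were included
    delta               : assumed relative importance of the unobservables
    """
    if r_tilde <= r_dot:
        raise ValueError("controls must raise R-squared (r_tilde > r_dot)")
    scale = (r_max - r_tilde) / (r_tilde - r_dot)
    return beta_tilde - delta * (beta_dot - beta_tilde) * scale


def delta_for_zero_effect(beta_dot, r_dot, beta_tilde, r_tilde, r_max):
    """Value of delta at which the approximate adjusted effect equals zero."""
    movement = (beta_dot - beta_tilde) * (r_max - r_tilde)
    if movement == 0:
        raise ValueError("no coefficient movement, or r_max equals r_tilde")
    return beta_tilde * (r_tilde - r_dot) / movement


# Hypothetical inputs (not taken from the article's tables).
example = dict(beta_dot=0.60, r_dot=0.02, beta_tilde=0.40, r_tilde=0.10, r_max=0.13)

adjusted = bias_adjusted_beta(**example)        # lower bound when delta = 1
delta_star = delta_for_zero_effect(**example)   # delta that would nullify the effect

print(f"identified set: [{adjusted:.3f}, {example['beta_tilde']:.3f}]")
print(f"delta for zero effect: {delta_star:.2f}")
```

Under this approximation, a large gap between the controlled effect and zero, combined with little coefficient movement when controls are added, yields a large $$\delta$$. Values produced this way need not match those from the full estimator.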
Oster argues that a value of “$$\delta=2$$, for example, would suggest that the unobservables would need to be twice as important as the observables to produce a treatment effect of zero.” The second approach is to use bounds on $$R_{\max}$$ and $$\delta$$ to develop a set of bounds for $$\beta$$, and then consider whether zero or some other value of interest falls within the bounds. We report results from both approaches in Table 6.

Table 6. Selection on Unobservables ($$R_{\max}=0.125$$)

| Treatment variable | Baseline effect [$$R^{2}$$] | Controlled effect [$$R^{2}$$] | Identified set ($$R_{\max}=0.125$$) | $$\delta$$ for $$\beta=0$$ given $$R_{\max}$$ |
| --- | --- | --- | --- | --- |
| Black | 0.656 [0.019] | 0.432 [0.095] | [0.346, 0.432] | 4.053 |
| Hispanic | 0.692 [0.021] | 0.514 [0.095] | [0.444, 0.514] | 5.370 |

Note: This table shows the sensitivity results for the analysis of the impact of minority status on overages.
Controlled effects include government, loan type, and credit scores.

Column 1 of Table 6 lists our two treatment variables, black and Hispanic. Column 2 lists what Oster refers to as the uncontrolled estimates, along with the $$R^{2}$$s associated with the regressions. Column 3 lists the controlled estimates along with their associated $$R^{2}$$s. Column 4 lists the identified sets, which are bounded by the controlled effect and the bias-adjusted effect based on the $$R_{\max}$$ given in the table and $$\delta=1$$. Column 5 lists the values of $$\delta$$ that would drive the respective effects to 0 given $$R_{\max}$$. Following Oster (2017, p. 3), we set $$R_{\max}=0.125$$, which is 1.3 times the observed $$R^{2}$$ from a regression of overage on the full set of covariates as in column 2 of Table 4.34

Neither identified set includes 0, and both sets are remarkably consistent with our previous results. The analysis indicates that being Hispanic means paying on average somewhere between 44% and 51% of a point more in overages than white borrowers. For blacks, this value is between 35% and 43% of a point on average. Turning to $$\delta$$ under the assumption that $$R_{\max}=0.125$$, the analysis indicates that for the Hispanic estimate, the unobservables would have to be 5 times more important than the observed variables for the effect to go to 0. For the coefficient on black, the unobservables would have to be 4 times more important than the observed variables. Altonji et al. (2005) suggest that $$\delta=1$$, which means that the observables are at least as important as the unobservables, may be an appropriate cutoff.

7. Discussion

One means of interpreting these findings is through the lens of the law. There are two ways that a lending institution can run afoul of antidiscrimination laws: disparate treatment and disparate impact.
The Federal Deposit Insurance Corporation’s Side by Side: A Guide to Fair Lending defines disparate treatment as a lender treating an applicant differently based on a prohibited category such as age, sex, race, or religion. Disparate impact occurs when a policy or practice applied equally to all applicants has a disproportionate adverse impact on applicants in a protected group. The difference lies in motive: a disparate treatment finding means that lenders discriminated intentionally, whereas disparate impact can be the result of purely neutral policies.

Whether our results show disparate treatment or disparate impact depends on how convincing the reader finds the sensitivity analysis. If the reader is reasonably sure that no omitted variable exists that would significantly alter our results, then we have shown disparate treatment. We recognize that even readers who find our results troubling might hesitate before arriving at that conclusion.35 We note, however, that in January 2017, J.P. Morgan Chase agreed to pay $55 million to settle a federal lawsuit just hours after it was filed (White, 2017). Government prosecutors claimed that the bank was responsible for mortgage brokers who charged Hispanic borrowers an additional $968 and black borrowers an additional $1,126 over the first 5 years of the loan. J.P. Morgan Chase argued that the brokers were independent subcontractors and admitted no wrongdoing.

If the reader does not find the sensitivity analysis sufficiently convincing, then we cannot claim to have shown disparate treatment because we will not have demonstrated that minority borrowers paid larger overages as a result of intentional racial discrimination. We can, however, make the case that the brokerage was guilty of disparate impact, which refers to the impact, intended or not, of a facially neutral policy that has the effect of discriminating.
Often used in cases of suspected unconscious bias, a disparate impact case would hinge on whether a legitimate business practice necessitates the use of the discriminatory policy. It is difficult to imagine what the legitimate business practice would be in the case of brokers charging overages that do not adjust for risk. The fact that some lenders have done away with overages suggests that the practice is not integral to the mortgage lending business.

8. Conclusion

Race-based discrimination in mortgage pricing is of concern to both scholars and policymakers. Finding clear evidence, however, has proven problematic due to omitted variable bias. Empirical results indicating discrimination are routinely met with claims that the findings could be compromised by unmeasured confounders. The resulting ambiguity has hampered, though not completely derailed, attempts at effecting change. Certainly, the fear of omitted variable bias has haunted the study of mortgage discrimination.

We mitigate the issues raised by potential confounders by considering mortgage pricing by brokers (overages) and employing a formal sensitivity analysis. In doing so, we show that credit scores and, by extension, default risk are unrelated to overages charged by brokers. Conversely, we show that being black or Hispanic is strongly associated with positive overages. In short, we find the kind of evidence of discrimination that many scholars believe exists but have had difficulty demonstrating unequivocally. Our analysis covers one lender during one quarter of 2000, but the results are nonetheless troubling. Furthermore, the J.P. Morgan Chase settlement suggests that the brokerage our data came from is not unique. Although Bank of America banned the use of overages, there remain calls, out of concern for continued racial discrimination, for mortgage brokering that is not incentivized by selling overages (van den Brand, 2015).
We hope that our results contribute meaningfully to the discussion.

We thank Michael Peress, Jake Bowers, Anderson Frey, Catherine J. Carroll, the editor, and two anonymous reviewers for comments and suggestions. Bradley Smith provided research assistance. Errors are our own.

Appendix

Table A1. A more saturated model including age and interactions with age and credit score. The inclusion of age has no effect on our results.

| Term | Estimate | Std. Error | $$t$$-value | $$P(>\lvert t\rvert)$$ |
| --- | --- | --- | --- | --- |
| (Intercept) | 0.0094 | 0.4542 | 0.02 | 0.9835 |
| Black | 0.3767 | 0.1142 | 3.30 | 0.0010 |
| Hispanic | 0.5403 | 0.1194 | 4.52 | 0.0000 |
| Government | 0.7399 | 0.0788 | 9.39 | 0.0000 |
| Agency | 0.1096 | 0.0514 | 2.13 | 0.0330 |
| Age | 0.0135 | 0.0110 | 1.23 | 0.2203 |
| Equifax | −0.0475 | 0.0735 | −0.65 | 0.5182 |
| Experian | −0.0287 | 0.0669 | −0.43 | 0.6673 |
| TRW | 0.0934 | 0.0764 | 1.22 | 0.2216 |
| Age × Exp | 0.0008 | 0.0017 | 0.46 | 0.6458 |
| Age × Equi | 0.0007 | 0.0019 | 0.39 | 0.6979 |
| Age × TRW | −0.0029 | 0.0019 | −1.49 | 0.1368 |
| $$N$$ | 1623 | | | |
| SEE | 0.836 | | | |
| $$R^{2}$$ | 0.109 | | | |

Table A2. A semiparametric model using the same specification as the linear models in the main part of the article.

| Linear | Estimate | SE | $$t$$-value | $$P(>\lvert t\rvert)$$ |
| --- | --- | --- | --- | --- |
| (Intercept) | 1.7430 | 0.7297 | 2.39 | 0.0170 |
| Black | 0.2695 | 0.1233 | 2.19 | 0.0289 |
| Hispanic | 0.4497 | 0.1194 | 3.77 | 0.0002 |
| Government | 0.5732 | 0.0747 | 7.67 | 0.0000 |

| Nonlinear (also see plots) | DF | Knots |
| --- | --- | --- |
| f(Equifax) | 1.0 | 34 |
| f(TRW) | 1.0 | 35 |
| f(Experian) | 2.5 | 36 |

Table A3. Balance Statistics from Genetic Matching: Treatment=Hispanic

| Variable | Statistic | Before | After |
| --- | --- | --- | --- |
| Age | Mean treatment | 35.148 | 35.148 |
| Age | Mean control | 39.905 | 35.204 |
| Age | Std mean diff | −47.053 | −0.5495 |
| Age | t-test $$P$$-value | 0.002 | 0.920 |
| Government | Mean treatment | 0.4259 | 0.4259 |
| Government | Mean control | 0.1106 | 0.4259 |
| Government | Std mean diff | 63.194 | 0.000 |
| Government | t-test $$P$$-value | 0.0003 | 1.000 |
| Equifax | Mean treatment | 665.04 | 665.04 |
| Equifax | Mean control | 685.59 | 666.19 |
| Equifax | Std mean diff | −26.289 | −1.469 |
| Equifax | t-test $$P$$-value | 0.0706 | 0.6451 |
| TRW | Mean treatment | 617.41 | 617.41 |
| TRW | Mean control | 689.23 | 618.13 |
| TRW | Std mean diff | −38.24 | −0.3855 |
| TRW | t-test $$P$$-value | 0.0072 | 0.5351 |
| Experian | Mean treatment | 667.19 | 667.19 |
| Experian | Mean control | 683.12 | 666.19 |
| Experian | Std mean diff | −24.427 | 1.5334 |
| Experian | t-test $$P$$-value | 0.1016 | 0.7758 |

Notes: The matching improves the balance on all five variables. In each case, the standard mean difference decreases significantly, and the t-tests indicate that we fail to reject the null hypotheses of no difference after matching.

Footnotes

1. There have also been a number of far smaller recent settlements, such as a $700,000 fine agreed to by the Texas Champion Bank of Alice, TX, USA for discriminating against Latinos in February of 2013, a $33 million settlement by New Jersey’s Hudson City Savings Bank for racial discrimination in September of 2015, and a $10.6 million settlement for Mississippi-based Bancorp South for discrimination against African-Americans and other minorities in June of the following year.

2. Fines do not provide definitive proof of bias as lenders have an incentive to settle even if they are non-discriminators.

3. The U.S. government requires the collection of race data under law; for a discussion, see Taylor (2012).
The data nearly always fall short of the information available to lenders. Researchers studying peer-to-peer lending where some individuals request and others provide loans (notably from the website Prosper.com) have access to far better data. Both Pope and Sydnor (2011) and Ravina (2012) find similar evidence of black borrowers facing some additional costs, but also defaulting more. Regardless, this type of situation is qualitatively different from the choices made by for-profit companies and arguably less important from a public policy perspective.

4. Knowles et al. (2001) argue, for example, that motorists must in equilibrium carry illegal substances with equal probability regardless of race (Anwar and Fang, 2006). The reasoning is that if one subgroup carries contraband more often than another, the police would focus their searches on those motorists. The subgroup would respond by carrying contraband less often.

5. This number would have been much higher in 2000.

6. Licensed mortgage sales workers who assist customers with loan applications. They have discretion over how they respond to customer inquiries (Hanson et al., 2016, p. 2).

7. California was the only exception in 2000, and none of our data comes from California.

8. Other types of fees that might be charged include processing fees or administration fees.

9. Guttentag (2010) draws an analogy with the marketing of carpets in Middle Eastern bazaars. He writes that everyone knows that bargaining is the rule in the bazaar, but few borrowers realize that they are in a mortgage bazaar.

10. Bank of America banned the practice in 2010 (Guttentag, 2010).

11. Brokers were required by law to disclose the fee in the Good Faith Estimate, but it was difficult for the borrower to discover it. The yield spread premium was only one name. It also went by the service release fee or par-plus pricing or rate participation fee. The yield spread premium was banned by the Federal Reserve in 2011.

12. We were unable to obtain the data from either study. The Black et al. (2003) data are proprietary, and Courchane and Nickerson (1997) were unable to locate their data.

13. For Bank A, overage is regressed on minority, type of loan, loan purpose, year, loan amount, loan to value, and an intercept. For Bank B, the authors provide both logit (overage or not) and OLS results. The covariates are loan amount, income amount, interest rate, loan type, loan purpose, occupancy, gender, Hispanic, black, and a regional dummy.

14. Black et al. (2003) present results from a Heckman-style selection model. Stage 1 is whether an overage is charged, and Stage 2 is the overage amount, if charged. The model is weakly identified by omitting whether the loan type is conventional from Stage 2 on the basis of insignificance. They control for thirty-three covariates including their measure of bargaining skill, which is the mean overage the loan officer extracts from white applicants.

15. We have 2,129 loans in a 3-month period. Black et al. (2003) consider loans “made during 1996 for a major mortgage lending institution at its loan offices nationwide” (p. 1143). Their dataset comprises 2,002 loans for the entire year. Courchane and Nickerson (1997) have data from three institutions. The first, Huntington Mortgage Company, had 788 loans in a 6-month period. The second, Bank A, averaged 3,700 loans a quarter, and the third, Bank B, averaged 2,750 loans a quarter. Our sample size therefore is not particularly low. We also note that the mortgage market is seasonal (Zillow.com data indicate that more homes are sold during the Spring months), and that our data come from the Fall quarter.

16. In our sample, 36.4% have an overage of zero, and 9.4% have an underage.

17. A $$\chi^{2}$$-test returns a $$P$$-value of 0, indicating that loan type and overage may not be independent of one another. The other loan type categories behave as expected. Only 28.4% of second loans have overages, as second mortgages are taken out by those who have been through the process at least once before.

18. While it is clear that loan type plays a significant role, it is unclear whether product type is a pre- or post-treatment variable. The “treatment” here is the broker observing the minority status (yes or no) of the borrower. Whereas variables such as age and credit scores are fixed prior to treatment, product type is most likely not fixed before the “treatment” occurs. In the analyses to follow, we deal with loan type both ways. We include product type in regressions, but we do not match on it. All analyses, however, have been done both ways, and the presence or absence of product type among the control variables does not change any of our conclusions.

19. These are the rules that were in place as of 2002. Some rules may have changed in the ensuing years.

20. We also estimated specifications that included interaction terms (see the Appendix). No estimates changed direction or significance.

21. The credit scores have been divided by 100 to increase interpretability.

22. The correlations between the three credit scores are roughly 0.87, which is generally not high enough to produce serious multicollinearity with this many observations. The condition number of the full design matrix is 1, well below problematic levels (Belsley et al., 1980).

23. There are forty-six branches, so clustering by branch does not generate problems associated with a small number of clusters.

24. Conditioning on conventional loans made no difference to the analysis, and the variable itself was nonsignificant in every specification.

25. $$E[\mbox{Overage}|\boldsymbol{x}]=g_{1}(\mbox{Black})+g_{2}(\mbox{Hispanic})+g_{3}(\mbox{Government})+g_{4}(\mbox{Equifax})+g_{5}(\mbox{TRW})+g_{6}(\mbox{Experian})$$.

26. We also ran a suite of semiparametric models. See the Appendix for an example. Our results remain unchanged.

27. There is no reason to prefer one method to another.

28. Minority is used as a treatment to increase the number of treated observations.

29. Including the possibly post-treatment variable government-type loan makes no substantive difference.

30. Table A3 in the Appendix contains balance statistics from the genetic matching using Hispanic as the treatment. (Results from using black or minority as treatments are similar.) The matching improves the balance on all five variables. In each case, the standard mean difference decreases significantly, and the $$t$$-tests indicate that we fail to reject the null hypotheses of no difference after matching. These results suggest that the matching is credible.

31. A 2003 report by the Federal Reserve based on a large, nationally representative sample found that credit scores are predictive of credit risk for the population as a whole and for all major demographic groups. That is, the higher the credit score, the lower the observed incidence of default (Board of Governors of the Federal Reserve System, 2007).

32. One of the claims made by mortgage brokers is that they do the “shopping around” for the borrower.

33. “Omitted variable bias is proportional to coefficient movements, but only if such movements are scaled by movements in $$R$$-squared” (Oster, 2017, p. 3).

34. Oster suggests 1.3 because it is the value at which 90% of results from randomized trials are robust.

35. We attempted to compare black and Hispanic brokers to white brokers directly. Following Anwar and Fang (2006), who compare the behavior of police officers of different races, if white and minority brokers behave in similar ways, then they are most likely profit-maximizing. We used the list of surnames occurring more than 100 times, available from Census.gov, to identify the race of each broker in our sample, and we used Facebook and LinkedIn to find online photographic evidence.
Unfortunately for our analysis, however, there are simply too few minority brokers making too few loans to minority borrowers in the year 2000 for justifiable statistical analysis.

References

Altonji, J. G., Elder, T. E. and Taber, C. R. 2005. “An evaluation of instrumental variable strategies for estimating the effects of Catholic schooling,” 40 Journal of Human Resources 791–821.

Anwar, S. and Fang, H. 2006. “An alternative test of racial prejudice in motor vehicle searches: Theory and evidence,” 96 American Economic Review 127–51.

Ayres, I. 2001. Pervasive Prejudice? Unconventional Evidence of Race and Gender Discrimination. Chicago: The University of Chicago Press.

Becker, G. S. 1993. “The Evidence against Banks Does Not Prove Bias,” Business Week. April 19, 1993.

Belsley, D. A., Kuh, E. and Welsch, R. E. 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Hoboken, NJ: John Wiley and Sons.

Black, H. A., Boehm, T. P. and DeGennaro, R. P. 2003. “Is There Discrimination in Mortgage Pricing? The case of overages,” 27 Journal of Banking & Finance 1139–65.

Board of Governors of the Federal Reserve System. 2007. Report to the congress on credit scoring and its effects on the availability and affordability of credit. Technical report, Federal Reserve System.

Consumer Financial Protection Bureau. 2015. Consumers’ mortgage shopping experience: A first look at results from the National Survey of Mortgage Borrowers. Technical report, Consumer Financial Protection Bureau.

Cornfield, J., Haenszel, W., Hammond, E. C., Lilienfeld, A. M., Shimkin, M. B. and Wynder, E. L. 1959. “Smoking and Lung Cancer: Recent Evidence and a Discussion of Some Questions,” 22 Journal of the National Cancer Institute 173–203.

Courchane, M. and Nickerson, D. 1997. “Discrimination Resulting from Overage Practices,” 11 Journal of Financial Services Research 133–51.

Garmaise, M. J. 2009. After the honeymoon: Relationship dynamics between mortgage brokers and banks. Anderson School of Business, UCLA. Available at http://personal.anderson.ucla.edu/mark.garmaise/relationship.pdf.

Garrison, T. 2014. “First-time Homebuyers Still Face Challenge with Mortgage Process,” HousingWire. November 13, 2014.

Glaser, J. 2014. Suspect Race: Causes and Consequences of Racial Profiling. New York: Oxford University Press.

Goenner, C. F. 2010. “Discrimination and Mortgage Lending in Boston: The Effects of Model Uncertainty,” 40 Journal of Real Estate Finance and Economics 260–85.

Guttentag, J. 2010. “Why Mortgage Lenders Charge Overages, and Why They May Stop,” The Washington Post. March 27, 2010.

Han, S. 2011. “Creditor Learning and Discrimination in Lending,” 40 Journal of Financial Services Research 1–27.

Hanson, A., Hawley, Z., Martin, H. and Liu, B. 2016. “Discrimination in mortgage lending: Evidence from a correspondence experiment,” 92 Journal of Urban Economics 48–65.

Harney, K. R. 1993. “Your Mortgage: Loan ‘Overage’ Charges Come under Fire,” Los Angeles Times. August 15, 1993.

Heckman, J. and Siegelman, P. 1993. “The Urban Institute Audit Studies: Their Methods and Findings,” in Fix, M. and Struyk, R. J., eds., Clear and Convincing Evidence: Measurement of Discrimination in America, pp. 187–258. Washington, DC: The Urban Institute Press.

Heckman, J. J. 1998. “Detecting discrimination,” 12 The Journal of Economic Perspectives 101–16.

JCHS. 2008. The state of the nation’s housing 2008. Technical report, Joint Center for Housing Studies of Harvard University.

Kau, J. B., Keenan, D. C. and Munneke, H. J. 2012. “Racial Discrimination and Mortgage Lending,” 45 Journal of Real Estate Finance and Economics 289–304.

Knowles, J., Persico, N. and Todd, P. 2001. “Racial Bias in Motor Vehicle Searches: Theory and Evidence,” 109 Journal of Political Economy 203–29.

Koenker, R. 2005. Quantile Regression. New York, NY: Cambridge University Press.

Munnell, A. H., Tootell, G. M. B., Browne, L. E. and McEneaney, J. 1996. “Mortgage Lending in Boston: Interpreting HMDA Data,” 86 American Economic Review 25–53.

National Mortgage Database. 2017. A profile of 2013 mortgage borrowers: Statistics from the National Survey of Mortgage Originations. Technical report, National Mortgage Database.

Neumark, D. 2012. “Detecting Discrimination in Audit and Correspondence Studies,” 47 The Journal of Human Resources 1128–57.

Oster, E. 2017. “Unobservable Selection and Coefficient Stability: Theory and Evidence,” 0 Journal of Business and Economic Statistics 1–18.

Pope, D. G. and Sydnor, J. R. 2011. “What’s in a Picture? Evidence of Discrimination from Prosper.com,” 46 Journal of Human Resources 53–92.

Prevost, L. 2015. “Consumer Protection for Mortgage Seekers,” The New York Times. January 23, 2015.

Ravina, E. 2012. Love & loans: The effect of beauty and personal characteristics in credit markets. Columbia Business School. Available at SSRN: https://ssrn.com/abstract=1107307.

Rosenbaum, P. and Rubin, D. 1983a. “Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome,” 45 Journal of the Royal Statistical Society, Series B 212–18.

Rosenbaum, P. R. 1988. “Sensitivity Analysis for Matching with Multiple Controls,” 75 Biometrika 577–81.

Rosenbaum, P. R. and Rubin, D. B. 1983b. “The Central Role of the Propensity Score in Observational Studies for Causal Effects,” 70 Biometrika 41–55.

Ross, S. L. and Yinger, J. 2002. The Color of Credit: Mortgage Discrimination, Research Methodology, and Fair-Lending Enforcement. Cambridge, MA: The MIT Press.

Sekhon, J. S. 2011. “Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R,” 42 Journal of Statistical Software 1–52.

Steverman, B. and Bogoslaw, D. 2008. “The Financial Crisis Blame Game,” Bloomberg Business. October 18, 2008.

Taylor, W. 2011-2012. “Proving Racial Discrimination and Monitoring Fair Lending Compliance: The Missing Data Problem in Nonmortgage Credit,” 31 Review of Banking and Financial Law 199–264.

van den Brand, J. 2015. “Freedom at Last from Discriminatory Lending: A Fairer Future with Technology,” Huffington Post. June 8, 2015.

White, G. B. 2017. “J.P. Morgan Chase’s $55 Million Discrimination Settlement,” The Atlantic. January 18, 2017.

Published by Oxford University Press on behalf of American Law and Economics Review 2017. This work is written by US Government employee and is in the public domain in the US. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal

American Law and Economics Review, Oxford University Press

Published: Apr 1, 2018
