Regulatory Arbitrage or Random Errors? Implications of Race Prediction Algorithms in Fair Lending Analysis
Greenwald, Howell, Li, and Yimfor
Journal of Financial Economics 157: 103857, 2024.
Regulators often need to assess whether lenders treat borrowers fairly by race, but in many credit contexts, such as small business lending, self-reported race is unavailable. To fill this gap, regulators commonly rely on proxies such as the Bayesian Improved Surname Geocoding (BISG) algorithm, which infers race from last name and geographic data. The authors ask whether such proxies merely produce random misclassification with unintended consequences, or instead introduce structured errors that invite regulatory arbitrage, in which lenders alter behavior to appear compliant. The study's key motivation is to evaluate how these errors might distort measured racial disparities in lending outcomes.
To assess this, the authors analyze two datasets:
- A Lendio small business lending sample, where they observe real approval decisions.
- A Paycheck Protection Program (PPP) dataset, which includes self-identified race—a rare source of actual race data in non-mortgage contexts.
They apply the BISG algorithm to these borrowers to generate predicted race probabilities, and they also construct an image-based race proxy using LinkedIn profile images and machine learning models. The image-based measure serves as a benchmark because it correlates more strongly with self-identified race than BISG does.
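As a rough illustration of the mechanics (not the authors' implementation), BISG combines a surname-based prior with a geography-based likelihood via Bayes' rule, assuming surname and geography are independent conditional on race. The probability tables below are hypothetical placeholders, not Census values.

```python
# Minimal BISG-style posterior sketch; all probabilities below are illustrative.

SURNAME_PRIORS = {            # P(race | surname), e.g. from Census surname lists
    "WASHINGTON": {"black": 0.87, "white": 0.05, "other": 0.08},
    "MILLER":     {"black": 0.10, "white": 0.84, "other": 0.06},
}

GEO_GIVEN_RACE = {            # P(tract | race), e.g. from Census tract counts
    "tract_A": {"black": 0.0020, "white": 0.0005, "other": 0.0010},
    "tract_B": {"black": 0.0003, "white": 0.0030, "other": 0.0010},
}

def bisg_posterior(surname, tract):
    """Combine the surname prior with the geography likelihood via Bayes' rule."""
    prior = SURNAME_PRIORS[surname]
    likelihood = GEO_GIVEN_RACE[tract]
    unnormalized = {r: prior[r] * likelihood[r] for r in prior}
    total = sum(unnormalized.values())
    return {r: p / total for r, p in unnormalized.items()}

# A Black borrower with a predominantly Black surname who lives in a mostly
# non-Black tract can end up with a diluted posterior, i.e. a false negative.
print(bisg_posterior("WASHINGTON", "tract_B"))
print(bisg_posterior("MILLER", "tract_A"))
```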
The results show that BISG performs poorly in identifying Black borrowers: there are twice as many misclassifications as correct classifications. The errors are not random. False negatives (Black borrowers misclassified as non-Black) tend to be more educated or wealthier, while false positives show the opposite profile. As a result, the gap in approval rates between Black and non-Black borrowers is underestimated by 43% when measured with BISG rather than actual race: the true approval gap is about 1.8 percentage points, whereas BISG-based measures imply only around 1.1 percentage points. The misclassification effects also differ across lenders. Fintechs tend to serve higher-income Black borrowers, whom BISG often misses, while small and medium banks serve fewer Black borrowers than BISG suggests, which can make them appear more compliant than they are.
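A stylized mixture calculation (hypothetical rates, not the paper's estimates) shows why this kind of non-random contamination shrinks the measured gap: false positives are non-Black borrowers who face no racial penalty, so they pull the proxy-Black approval rate up toward the non-Black rate.

```python
# Stylized arithmetic: how BISG misclassification attenuates a measured approval gap.
# All rates are hypothetical, chosen only to illustrate the direction of the bias.

true_black_rate, true_nonblack_rate = 0.72, 0.75      # true gap = 3.0 pp

# Composition of the BISG-"Black" group: many false positives, who are non-Black
# (no race penalty) but slightly poorer than the average non-Black borrower.
share_true_black_in_proxy = 0.60
false_positive_approval = 0.74

proxy_black_rate = (share_true_black_in_proxy * true_black_rate
                    + (1 - share_true_black_in_proxy) * false_positive_approval)

# The proxy non-Black group absorbs the missed (wealthier, higher-approval)
# Black borrowers, but they are a small share, so its rate barely moves.
proxy_nonblack_rate = 0.749

print(f"true gap:     {true_nonblack_rate - true_black_rate:.3f}")   # 0.030
print(f"measured gap: {proxy_nonblack_rate - proxy_black_rate:.3f}") # ~0.021
```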
Their formal model shows that regulation based on predicted race (like BISG) leads lenders to respond not to actual race but to prediction probabilities, which means regulatory incentives are misaligned and approval behavior can be distorted. In a counterfactual where regulation uses self-identified or image-based race instead, between-group inequality falls because enforcement targets actual Black borrowers more accurately. However, within-group inequality could rise, as lending skews toward more affluent borrowers within the group.
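A minimal sketch of the arbitrage incentive (a simplification, not the paper's formal model): if the regulator scores fairness using BISG probabilities, the compliance credit a lender earns from an approval depends on the predicted probability rather than on the applicant's actual race, so approvals tilt toward borrowers who look Black to the algorithm. The function and parameters below are hypothetical.

```python
# Hypothetical lender objective under proxy-based oversight (not the paper's model).

def approval_score(credit_quality, p_bisg, compliance_weight=0.5):
    """Lender's private value of approving an applicant.

    credit_quality    -- expected profit from the loan (hypothetical units)
    p_bisg            -- BISG-predicted probability the applicant is Black
    compliance_weight -- how much the lender values measured compliance
    """
    return credit_quality + compliance_weight * p_bisg

# Two applicants who are both actually Black and equally creditworthy: the one
# BISG misses (p_bisg = 0.2) earns far less compliance credit than the one BISG
# flags (p_bisg = 0.9), so the lender responds to predicted, not actual, race.
print(approval_score(credit_quality=1.0, p_bisg=0.2))
print(approval_score(credit_quality=1.0, p_bisg=0.9))
```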
The paper highlights a critical issue in fair lending enforcement: relying on proxies like BISG can systematically understate racial disparities and inadvertently incentivize regulatory arbitrage. Errors in predicted race are large, structured along socioeconomic lines, and vary by lender type. Switching to actual race measures would better align with fairness goals—but may carry trade-offs like reinforcing inequality within racial groups. Overall, the study underscores the limitations of race proxies in policymaking and research and calls for greater scrutiny and improved data collection.