ZIP Code and ZIP Code Tabulation Area Linkage: Implications for Bias in Epidemiologic Research.
Background: To our knowledge, no agreed-upon best practices exist for joining U.S. Census ZIP Code Tabulation Areas (ZCTAs) and U.S. Postal Service ZIP Codes (ZIPs). One-to-one linkage using 5-digit ZCTA identifiers excludes ZIPs without direct matches. "Crosswalk" linkage may match a ZCTA to multiple ZIPs, avoiding losses.
Methods: We compared noncrosswalk and crosswalk linkages nationally and, in California, for mortality and health insurance data. To elucidate implications for selection, we fit generalized additive models relating sociodemographic characteristics to whether ZCTAs contained nonmatching ZIPs.
Results: Nationwide, 15% of ZCTAs had nonmatching ZIPs, i.e., ZIPs dropped under noncrosswalk linkage. ZCTAs with nonmatching ZIPs were positively associated with metropolitan core location, lower socioeconomic status, and non-White population. In California, 34% of ZIPs in the mortality data and 25% in the health insurance data were linked to ZCTAs with nonmatching ZIPs; however, the nonmatching ZIPs accounted for only 0.03% of total mortality and 0.44% of total insurance enrollees.
Conclusions: Our study findings support the use of crosswalk linkages and ZCTAs as a unit of analysis. One-to-one linkage may cause bias by differentially excluding ZIPs with more disadvantaged populations, although affected population sizes seem small.
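Illustration: The difference between the two linkage strategies can be sketched with a small pandas example. This is a minimal sketch under stated assumptions, not the study's implementation: the ZIP and ZCTA values, the column names, and the three toy tables (`zctas`, `health`, `crosswalk`) are all hypothetical, and a real crosswalk file would come from an external source.

```python
import pandas as pd

# Hypothetical toy tables; identifiers and counts are illustrative only.
zctas = pd.DataFrame({"zcta": ["90001", "90002"],
                      "population": [1000, 2000]})
health = pd.DataFrame({"zip": ["90001", "90002", "90099"],
                       "deaths": [10, 20, 1]})
# Assumed crosswalk: ZIP 90099 has no same-numbered ZCTA and maps to ZCTA 90002.
crosswalk = pd.DataFrame({"zip": ["90001", "90002", "90099"],
                          "zcta": ["90001", "90002", "90002"]})

# One-to-one (noncrosswalk) linkage: join on identical 5-digit identifiers.
one_to_one = health.merge(zctas, left_on="zip", right_on="zcta",
                          how="left", indicator=True)
# ZIPs with no same-numbered ZCTA are the ones this linkage would drop.
dropped = one_to_one.loc[one_to_one["_merge"] == "left_only", "zip"].tolist()

# Crosswalk linkage: map every ZIP to its ZCTA, then aggregate by ZCTA.
linked = health.merge(crosswalk, on="zip", how="left")
by_zcta = (linked.groupby("zcta", as_index=False)["deaths"].sum()
                 .merge(zctas, on="zcta"))

print(dropped)
print(by_zcta)
```

In this toy case, the one-to-one join leaves ZIP 90099 unmatched, whereas the crosswalk join assigns its record to ZCTA 90002 before aggregation, so no records are lost.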