BACKGROUND

COPD is the third-leading cause of death around the world, affecting more than 300 million people,1 and its diagnosis currently relies solely on spirometry criteria, specifically the ratio of forced expiratory volume in one second (FEV1) to forced vital capacity (FVC): FEV1/FVC < 0.7, often known as the “fixed ratio”.2,3,4 However, the use of only this fixed ratio, which depends on a predictable relationship of FEV1 to FVC, may fail to diagnose COPD because of differential impacts of deprivation on the FVC. The long history of spirometry in race-based evaluations of non-white populations and specifically of African-Americans (AA) has been detailed comprehensively by Braun.5,6 In addition, previous work in a variety of settings and populations has found that nutritional and other types of deprivation affect spirometric measures and outcomes.7,8,9,10,11,12,13,14 Systematic failure to diagnose true disease may confuse the understanding of disease manifestations and progression in populations more at risk of deprivation, as well as perpetuate health disparities.15,16,17

The use of the spirometric fixed-ratio for diagnosis is not dependent on race-specific prediction equations but is a critical race-based issue for medicine because of the potential to mischaracterize lung health for individuals who may self-describe as AA and/or have experienced deprivation.18 In the USA and other societies, race and deprivation are often interwoven, and questions of ancestry, genetics, and disease risk may be incorrectly associated with race instead of recognizing the impact of disadvantage.

Although the diagnosis of COPD is currently determined solely by the spirometric fixed ratio, the disease manifestations of COPD are actually more complex and informative. They include airway inflammation, airflow obstruction, emphysema, air trapping and hyperinflation of the lungs with cough, mucus hypersecretion and sputum production, shortness of breath, and exercise intolerance. The scientific foundation for the fixed ratio as a definitive diagnosis for COPD beyond simple consensus is limited,19 and it has not been assessed across non-white groups. Previous reports have suggested that COPD is less prevalent in AA smokers than in non-Hispanic white (NHW) smokers, noting higher spirometry values but worse function.20,21,22,23 We hypothesize that this difference in prevalence may be due to misclassification rather than resistance to disease.

The COPDGene study has a large population of AA and NHW participants who have been extensively characterized. Since there is not an accepted alternative diagnostic criterion to define COPD, we assessed participants in the COPDGene study for clinical evidence of disease misclassification related to self-reported race.24 Stepwise, we assessed the distribution of AA participants in the whole cohort based on their initial spirometric category. We evaluated symptoms, structure, function, socioeconomic status (SES) characteristics, and outcomes by GOLD stage and 12-year mortality in the whole cohort. We then focused on the “GOLD 0” category (FEV1/FVC > 0.7 and FEV1 > 80% predicted), seeking individuals who had not been diagnosed with COPD under the current standard and using additional criteria for diagnosis within this group based on previous findings of COPD-like symptoms in both our cohort and others.25,26 We evaluated the differences in raw FEV1 and FVC by race category in these “GOLD 0” participants. To address the differences in age and smoking status, we developed a subset of the GOLD 0 group matched on age, sex, and smoking status. In the matched group, we again compared symptoms, structure, function, and outcomes between NHW and AA participants.

METHODS

Ever-Smoker Cohort

The COPDGene study (ClinicalTrials.gov identifier: NCT00608764) has been previously described.27 Briefly, 10,198 ever-smokers were enrolled between 2007 and 2012 at 21 clinical centers across the United States. Enrollment included general community-based recruitment, enhanced recruitment of subjects with known COPD, and planned recruitment of one-third African-American participants. Subjects were enrolled based on a history of ≥ 10 pack-years of smoking and absence of chronic lung disease other than COPD or a history of asthma; race was self-identified. To facilitate planned genetic analyses, only non-Hispanic white and AA participants were enrolled. COPD and disease severity were characterized after enrollment based on post-bronchodilator spirometry using NHANES race-specific equations. The research protocol was approved by institutional review boards at each clinical center, and participants provided signed informed consent.

This analysis involved all current and former smokers who had usable spirometry data (n = 10,132) from Phase 1, analysis of the non-COPD group, GOLD 0 (by current criteria), and a matched analysis of the GOLD 0 participants (n = 2246) (Consort Diagram, Appendix eFigure 1). For all-cause mortality, we used ongoing longitudinal data collection, censored on August 31, 2020.

COPD Disease Score

To characterize participants in COPDGene without relying only on spirometry, we utilized a COPD Disease Score, derived from our published work, which included structural changes from CT, symptoms, and functional limitations (Fig. 1).24,25 The potential value of a composite score is that it can capture a broader picture of the illness, especially in a heterogeneous disease like COPD. This score is presented as a quantitative metric rather than a validated diagnostic score. See detailed methods in the supplement for the scoring and Appendix eFigure 2.

Figure 1

Characterization of COPDGene participants: COPD diagnosis and severity, self-reported race and socioeconomic measures (A–D). Ever-smoking participants were classified at enrollment using Hankinson race-specific prediction equations and the “fixed ratio” criterion FEV1/FVC < 0.7 for COPD. A “Modified GOLD stage” distribution of phase 1 COPDGene subjects by race. The distribution of the cohort is shown by race using a combination of the GOLD classification system for COPD (GOLD 1–4) with two additional groups: PRISm (FEV1/FVC > 0.7 and FEV1 < 80% predicted) and GOLD 0 (FEV1/FVC > 0.7 and FEV1 > 80% predicted). Seventy percent (70%) of the AA participants enrolled in COPDGene were categorized as PRISm or GOLD 0, while 50% of the NHW participants were categorized as PRISm or GOLD 0. B Self-reported annual income by race. Self-reported income data were collected at the Phase 2 visit approximately five years after enrollment (n = 4407 NHW and 1836 AA participants). C Educational level reported by race. The percent distribution of educational levels by self-reported race in the enrollment data collection. D Distribution of Area Deprivation Index (ADI) in non-Hispanic White and African-American participants. Bars represent the number of subjects with specific deciles of index values derived from the 2018 ADI national dataset and geo-coded address information for the participants. The range for ADI is 0–100, and higher values are associated with worse deprivation characteristics (n = 4368 NHW, 1776 AA for geo-coded data).

COPD Classification

We classified participants based on post-bronchodilator spirometry values, using NHANES race-specific equations, and the most recent Global Initiative for Obstructive Lung Disease (GOLD) criteria.3,4 Because these criteria only address participants with an FEV1/FVC < 0.7, we extended the definitions to those with FEV1/FVC ≥ 0.7 and FEV1% predicted ≥ 80% as “GOLD 0”,25 and those with FEV1/FVC ≥ 0.7, FEV1% predicted < 80% as “PRISm”.28,29 These two groups represent the pool of individuals potentially misclassified.
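The classification rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's analysis code: the function name is hypothetical, and the GOLD 1–4 severity cut points (80%, 50%, and 30% of predicted FEV1) are the standard GOLD grades, which this text does not spell out. Inputs are assumed to be the post-bronchodilator FEV1/FVC ratio and FEV1 percent predicted.

```python
def classify_spirometry(fev1_fvc: float, fev1_pct_pred: float) -> str:
    """Assign a modified GOLD category from post-bronchodilator spirometry.

    fev1_fvc      -- FEV1/FVC as a fraction (e.g. 0.68)
    fev1_pct_pred -- FEV1 as percent of predicted (e.g. 82.0)
    """
    if fev1_fvc < 0.7:
        # Fixed-ratio COPD; severity graded by FEV1 % predicted
        # (standard GOLD cut points, assumed here).
        if fev1_pct_pred >= 80:
            return "GOLD 1"
        if fev1_pct_pred >= 50:
            return "GOLD 2"
        if fev1_pct_pred >= 30:
            return "GOLD 3"
        return "GOLD 4"
    # Ratio >= 0.7: no spirometric COPD under the fixed ratio.
    if fev1_pct_pred >= 80:
        return "GOLD 0"
    return "PRISm"


if __name__ == "__main__":
    # A preserved ratio with reduced FEV1 falls into PRISm, not COPD.
    print(classify_spirometry(0.74, 72.0))
```

Note that a participant whose FVC is disproportionately reduced can have a ratio at or above 0.7 and so be placed in GOLD 0 or PRISm regardless of symptoms or imaging, which is the misclassification pathway examined here.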

Socioeconomic Assessment and Area Deprivation Index

We collected self-reported education levels during the enrollment visit, and annual income and participant addresses in Phase 2 of the study. Addresses were geocoded using ArcGIS (Esri, Redlands, CA). Geocoded addresses were assigned a census tract identifier and linked to the 2018 US Census National data files, and n = 6300 participants were assigned an Area Deprivation Index (ADI) score.29 The distribution of these scores was assessed by self-described race. The ADI provides a percentile score (range 0–100; higher scores indicate worse deprivation); we selected the median as a cut point (ADI > 50) to compare two groups with potential differential deprivation experiences.30 We also graphed the results by self-reported race and deciles.

GOLD 0 Subset

Spirometry Distributions

Density distribution plots were created for FEV1, FVC, and FEV1/FVC from all GOLD 0 participants.

Matched Case–Control Analysis

We matched subjects from our GOLD 0 group on age (± 3 years), self-reported sex, and smoking status (current vs. former) at the baseline visit. Two groups (AA and NHW) were identified for analysis, n = 1123 each. Comparisons were made for demographics; comorbid diseases; key respiratory symptoms; BODE score; St George’s Respiratory Questionnaire (SGRQ) Total score; socioeconomic factors; CT imaging findings; spirometry (percent-predicted by NHANES race-specific equations and raw post-bronchodilator values for FEV1, FVC, and FEV1/FVC); and COPD Disease Score. DLCO was measured in Phase 2 study visits, and percent predicted values were compared after adjusting for hemoglobin and altitude. Disease progression was compared between the two matched groups by calculating the difference between COPD Disease Scores for Phase 2 and Phase 1. The change in BODE score between Phase 2 and Phase 1 was also evaluated.
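The matching criteria can be illustrated with a simple greedy 1:1 procedure. The text does not describe the matching algorithm actually used, so the sketch below is an assumption throughout: the `Participant` fields, the function name, and the nearest-age greedy strategy without replacement are all hypothetical choices that merely satisfy the stated criteria (age within ± 3 years, same sex, same smoking status).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Participant:
    pid: str              # hypothetical participant identifier
    age: float            # age at baseline visit, years
    sex: str              # self-reported sex
    current_smoker: bool  # True = current, False = former


def match_pairs(
    aa: List[Participant],
    nhw: List[Participant],
    age_caliper: float = 3.0,
) -> List[Tuple[Participant, Participant]]:
    """Greedy 1:1 matching without replacement.

    Each AA participant is paired with the closest-in-age NHW participant
    who has the same sex and smoking status and is within the age caliper.
    """
    available = list(nhw)
    pairs = []
    for a in aa:
        candidates = [
            c for c in available
            if c.sex == a.sex
            and c.current_smoker == a.current_smoker
            and abs(c.age - a.age) <= age_caliper
        ]
        if not candidates:
            continue  # unmatched participants are dropped from the matched set
        best = min(candidates, key=lambda c: abs(c.age - a.age))
        available.remove(best)  # each control is used at most once
        pairs.append((a, best))
    return pairs


if __name__ == "__main__":
    aa = [Participant("a1", 60, "F", True), Participant("a2", 50, "M", False)]
    nhw = [
        Participant("n1", 62, "F", True),
        Participant("n2", 55, "M", False),
        Participant("n3", 61, "F", True),
    ]
    for a, n in match_pairs(aa, nhw):
        print(a.pid, "<->", n.pid)
```

A greedy pass depends on input order and is not guaranteed to maximize the number of pairs; an optimal assignment method would be a reasonable alternative when the candidate pool is small.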

RESULTS

Ever-Smoker Cohort

At enrollment, among all ever-smoking participants (n = 10,198), NHW and AA differed significantly (Table 1). Overall, AA were younger, were more likely to be current smokers, and had fewer pack-years. Importantly, using the fixed ratio (FEV1/FVC < 0.7) definition, fewer AA than NHW had COPD (30% vs. 51%, p = 6 × 10⁻⁸⁹). When classified, far more AA were GOLD 0 (54% vs. 38%) or PRISm (16% vs. 11%), whereas NHW predominated in all COPD categories (Fig. 1A). A total of 70% of AA were in those two non-COPD diagnostic categories, traditionally excluded from COPD.

Table 1 Baseline Characteristics of COPDGene Smoker Cohort by Self-Reported Race
Figure 2

Race and spirometry issues in the COPDGene participants (A–C). A, B, C Density plots of FEV1, FVC, and the FEV1/FVC ratio by race in the GOLD 0 group only. Distributions of FEV1, FVC, and FEV1/FVC in COPDGene GOLD 0 subjects are depicted for non-Hispanic white (blue line) and African-American (red line) participants. Extensive overlap in values is seen between the two groups, with small differences in mean values for both FEV1 and FVC (A and B). FEV1 mean (SD): NHW 2.96 (0.69) liters, AA 2.79 (0.65) liters; difference between group means = 170 mL. FVC mean (SD): NHW 3.82 (0.90) liters, AA 3.51 (0.84) liters; difference between group means = 310 mL.

Socioeconomic Factors

AA had multiple indicators of overall lower socioeconomic status, including lower annual income and education. AA participants (n = 1836) reported poverty-level annual income (< $15,000) more frequently (52% vs 18%) than NHW (n = 4407) (Fig. 1B). Only 43% of AA reported education beyond high school compared to 70% of NHW (Fig. 1C). By 2018 census tract data, AA resided in less privileged neighborhoods (ADI 55.8 ± 31.4 vs. 40.2 ± 24.5, mean ± SD, where higher scores are more deprived, p < 0.0001). Moreover, the bimodal distribution of ADI among AA identified a portion with the greatest deprivation (Fig. 1D). By contrast, the ADI distribution of NHW was unimodal.

Mortality

Both by Kaplan-Meier plot (eFigure 3) and by a Cox proportional hazards model (adjusted for age, age², gender, current smoking, pack-years, and FEV1 in liters at enrollment/height²), race was non-significant as a predictor of mortality over a 12-year period (eTable 1). That result is noteworthy given the significantly younger age and lower cumulative smoking exposure of AA.
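For readers less familiar with the model, the adjusted Cox model described above can be written out as follows. The coefficient symbols and the coding of the covariates are assumptions made for illustration; the fitted estimates are those reported in eTable 1, not reproduced here.

```latex
\[
h(t \mid \mathbf{x}) = h_0(t)\,
\exp\!\Big(
  \beta_1\,\mathrm{age}
+ \beta_2\,\mathrm{age}^2
+ \beta_3\,\mathrm{gender}
+ \beta_4\,\mathrm{current\ smoking}
+ \beta_5\,\mathrm{pack\text{-}years}
+ \beta_6\,\frac{\mathrm{FEV_1}}{\mathrm{height}^2}
+ \beta_7\,\mathrm{race}
\Big)
\]
```

Here $h_0(t)$ is the unspecified baseline hazard, and the finding that race was non-significant corresponds to failing to reject $\beta_7 = 0$ after adjustment for the other terms.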

GOLD 0 Participants

Spirometry Distributions by Race and Deprivation

Density distribution plots of FEV1, FVC, and their ratio FEV1/FVC, by self-described race for the GOLD 0 participants are shown in Fig. 2A, B, and C. There is extensive overlap in both FEV1 and FVC values for AA and NHW. Means are lower for both FEV1 and FVC in AA, but the difference between NHW and AA is greater for the FVC than for the FEV1 [ΔFEV1 = 170 mL, ΔFVC = 310 mL]. Due to disproportionately lower FVC, the mean FEV1/FVC ratio for AA is higher. Raw spirometry measurements (FEV1 and FVC) compared by ADI > 50 (50th percentile index value) showed that, regardless of race, worse deprivation was associated with lower mean FEV1 (2.79 ± 0.68 vs. 2.88 ± 0.67 L, mean ± SD, p = 0.0007) and FVC (3.54 ± 0.87 vs. 3.70 ± 0.89 L, p < 0.0001), and there was a greater impact on the FVC [FEV1 difference = 90 mL, FVC difference = 160 mL].
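A short worked calculation shows why a disproportionately lower FVC pushes the ratio up. It uses the GOLD 0 group means given in the Figure 2 legend and the ADI group means above. Note the assumption: these are ratios of group means, shown only for illustration, and are not the same quantity as the mean of individual FEV1/FVC ratios reported in the study.

```latex
\[
\text{By race:}\qquad
\frac{\Delta\mathrm{FEV_1}}{\overline{\mathrm{FEV_1}}_{\mathrm{NHW}}}
= \frac{0.17}{2.96} \approx 5.7\%,
\qquad
\frac{\Delta\mathrm{FVC}}{\overline{\mathrm{FVC}}_{\mathrm{NHW}}}
= \frac{0.31}{3.82} \approx 8.1\%
\]
\[
\frac{\overline{\mathrm{FEV_1}}_{\mathrm{NHW}}}{\overline{\mathrm{FVC}}_{\mathrm{NHW}}}
= \frac{2.96}{3.82} \approx 0.775,
\qquad
\frac{\overline{\mathrm{FEV_1}}_{\mathrm{AA}}}{\overline{\mathrm{FVC}}_{\mathrm{AA}}}
= \frac{2.79}{3.51} \approx 0.795
\]
\[
\text{By deprivation:}\qquad
\frac{2.88}{3.70} \approx 0.778 \ \ (\text{lower-deprivation group}),
\qquad
\frac{2.79}{3.54} \approx 0.788 \ \ (\mathrm{ADI} > 50)
\]
```

In both comparisons the denominator falls proportionally more than the numerator, so the ratio moves away from the 0.7 diagnostic threshold even though both raw volumes are lower.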

Matched Case–Control Study

In the matched analysis of GOLD 0 participants (Table 2), there were no differences in height, comorbidity score, reports of CAD, or, as expected, current smoking, sex, or mean age. AA were slightly heavier by 2.4 kg, had fewer pack-years of smoking by 1.9 pack-years, and were more likely to report the use of respiratory medications (18% vs. 14%). They also reported more dyspnea (MMRC ≥ 2, 35% vs. 21%, p < 0.0001), had worse 6-min walk distances (1356 vs. 1551 feet, p < 0.0001), and had more frequently experienced severe exacerbations (7% vs 5%). SGRQ was higher (worse) in AA for total score, impact, and activity scores (each p < 0.0001) and similar for the symptom scale (25.2 vs 26.6, p = 0.16). Chronic bronchitis was more frequent in NHW than in AA (20% vs 12%).

Table 2 Comparison of Matched Groups of NHW and AA Ever-Smokers from GOLD 0 (No Spirometric Diagnosis of COPD) Group Based on Race-Specific NHANES Prediction Equations at Enrollment

The raw spirometric values of post-bronchodilator FEV1 and FVC in AA were lower than NHW within the matched GOLD 0 group by 0.42 and 0.62 L respectively, but higher as race-adjusted % predicted values (NHANES race-specific equations), and DLCO percent-predicted was worse in AA (78.1 vs 89.6, p < 0.0001). Imaging data using visual scoring of the Phase 1 CT scans shows that AA participants in the GOLD 0 group had significantly more centrilobular (30% vs 25%, p = 0.02) and paraseptal (19% vs 12%, p < 0.0001) emphysema. The BODE score at enrollment (1.03 vs 0.54, p < 0.0001) and also after five years (1.2 vs 0.59, p < 0.0001) was almost double for AA, in this matched cohort. The COPD Disease Score was significantly higher in the AA participants (1.5 vs 1.0, p < 0.0001). They had significantly greater disease progression as measured by the change in disease score (0.39 vs 0.13) and BODE change (0.17 vs 0.05) in Phase 2.

DISCUSSION

We found evidence of misclassification and apparent under-diagnosis of COPD for AA participants in the COPDGene cohort. COPDGene attempted to enroll an appropriate cohort for genetic analysis and subtyping of COPD but found substantial differences in the cohort composition by race and disease classification using the fixed ratio definition. Our AA participants were concentrated (70%) in the “non-COPD” groups, were younger, and yet had equivalent 12-year mortality to the NHW participants. They demonstrated significantly worse socioeconomic metrics. Although the disease classification differences by race may have occurred by chance, this appeared to be a low-probability occurrence that deserved further scrutiny. Key evidence for misclassification that we found was greater disease severity by multiple metrics within a cohort of undiagnosed individuals matched for age, sex, and current smoking. The AA participants manifested extensive respiratory symptoms, worse function, worse DLCO, lower raw FEV1 and FVC but higher ratios. They also evidenced more symptom progression over five years.

The occult effects of deprivation and structural racism on exposures and lung development, differentially impacting FVC more than FEV1, are the postulated mechanism for misclassification. Our finding that within the undiagnosed, presumably not diseased, COPDGene participants in the GOLD 0 group, AA participants demonstrate lower FVC relative to FEV1 (with a consequent higher FEV1/FVC ratio) is consistent with an impact of deprivation on spirometry as shown in other populations. This impact on the ratio leads to a failure to diagnose COPD, differentially by group, and is a serious problem for the validity of the ratio. While we and others have shown that GOLD 0 includes symptomatic individuals as in COPD25,26 and that PRISm patients have significant morbidity, mortality, and risk of progression to worse disease,28,31 individuals are not diagnosed with COPD when they are assigned to those categories. This has consequences for access to care, the legitimacy of needing care, and the provision of healthcare services and raises ethical considerations.32

We show that our NHW and a portion of the AA participants have very different socioeconomic situations that were likely stable under residential segregation.33 Others have shown that there are associated increases in harmful air pollution and mortality in these settings.34 These current, and likely earlier, differences also include reduced access to quality food sources, increased life stress, and a range of other impacts.16 We believe that the differences in spirometry are not due to race categories but instead reflect the impact of various social factors. The misclassification has occurred along racial lines because, in our society, chronic deprivation more commonly affects AA individuals, likely due to the impacts of structural racism,33 resulting in disproportionate reductions in two key spirometry measures. The assumption that there would be a stable relationship between FEV1 and FVC across all populations, such that a ratio of them could be used for disease diagnosis, appears to be flawed.

The spirometric criterion for COPD diagnosis is FEV1/FVC < 0.7.4 It is a simple dichotomous diagnostic metric in a heterogeneous disease. Early, historical use of the spirometer focused on demonstrating racial differences, citing lower mean values for vital capacity in African-Americans (and other non-white groups).5,35 Not well appreciated were the different relative relationships between FVC and FEV1 that have been associated with nutritional and socioeconomic deprivation.7,8,9,10,11,12,13 There are differences of opinion regarding whether developmental effects predispose to later COPD,36 but systematic failure to diagnose the disease, especially in the face of underlying factors such as social and economic deprivation, is of great concern. Further study is needed in the various at-risk populations within the US to understand lung maturation and development following deprivation, and the possible transition to a disease state. However, our data suggest that significant lung disease can be established in smokers well before the ratio drops below 0.7.

Early COPD diagnostic efforts starting in the 1950s emphasized the complexity of pathogenesis and varied manifestations, including a detailed description of emphysema and symptomatology of cough, and dyspnea (Fig. 3).37 Fletcher and Peto’s recognition that they could define a trajectory of FEV1 change in smokers that diverged from non-smokers provided an important utility for spirometry in tracking the disease and diagnosis.38 Over the past 40 years, trends narrowed COPD diagnosis to spirometry-only, possibly enhanced by the simplicity of a single metric and the systematic embrace of number-focused, evidence-based medicine.39 We advocate for broader diagnostic criteria that incorporate symptoms, function, and when possible, imaging.

Figure 3

Persistent symptomatology in African-American participants within the matched analysis of the non-COPD GOLD 0 group. Key disease-related findings that remained significantly more frequent in African-American participants in the GOLD 0 group after matching for age, gender, and smoking status include: worse (greater) SGRQ values for activity, impact, and total scales, higher MMRC dyspnea score, and reduced DLCO percent predicted values. (p values for all comparisons are < 0.0001 except for SGRQ Symptom which is non-significant.) In addition, the raw spirometry values are significantly lower (p < 0.0001) in the AA participants compared to the NHW, while the percent predicted values for AA using NHANES race-specific equations are significantly higher (p < 0.0001).

The choice of FEV1/FVC < 0.7 as a disease metric was based on expert opinion and has been questioned in the past.19,40 Limited assessments of its utility have compared it to a lower limit of normal (LLN) metric for the FEV1/FVC ratio and identified that a significant proportion of individuals may be over or underdiagnosed.41 Since the definition of LLN is linked to race-specific prediction equations, this potential alternative metric for the disease is problematic and future work is necessary to more appropriately link spirometry to disease. A recent publication by Bhatt et al. assessed the FEV1/FVC for its ability to predict COPD-related deaths and hospitalizations.42 They found that the 0.7 threshold of the ratio was better than the lower limit of normal, but had only a 66% sensitivity for COPD-related events. They did not report a stratified analysis by race.

Strengths of this study include the large AA cohort with comprehensive participant characterization and longitudinal follow-up. There are also limitations. As a large cohort study, we do not have a randomized sample and have used a matched analysis instead. There may be hidden confounders that impact the results. Some differences that we identify may be statistically significant but not clinically significant, so we have presented variables that appear to have clinical significance as well. Our identification of deprivation as a factor in the low FVC does not exclude the additional possibility of inherited factors within populations or other causes. Limitations also include the lack of early life data that can demonstrate a trajectory of change in the ratio and underlying FEV1 and FVC.

We postulate that over time, the effects of smoking lead to progressive obstructive airway disease, but this progression can be obscured by assigning apparent lung health to a spuriously higher ratio. This results in further confusion as individuals may progress to the PRISm category rather than to a COPD diagnosis under the current classification structure, as shown in Young et al.31.

In summary, the evidence of misclassification in our population of AA participants, in whom a disproportionately low FVC associated with social deprivation is accompanied by an elevated FEV1/FVC ratio, suggests that the use of the fixed ratio for diagnosis of COPD is not appropriate across human populations and should be reconsidered as the sole determinant of COPD diagnosis. There is a need for broader diagnostic criteria for COPD to avoid the disease misclassification associated with deprivation-related differences in spirometry, and we have suggested a broader set of measures.24