Allison W Welsh, Malini Harigopal, David Rimm. Yale University School of Medicine, New Haven, CT

Background: In an effort to reduce the estrogen receptor (ER) false-negative rate as measured by immunohistochemistry (IHC), new ASCO/CAP guidelines lowered the threshold for ER positivity from 10% of nuclei “positive” to 1%. However, these guidelines did not define a threshold of staining intensity, using the term “any immunoreactivity”. Here, we assess the variability in staining and interpretation between labs, and compare the misclassification that results from this variability with the misclassification that results from the percent-positive threshold (10% versus 1% cutoff).
Design: A retrospective breast cancer tissue microarray (TMA) cohort from Yale, consisting of 672 patients, was stained for ER in three different CLIA-certified labs in New England, all using automated staining machines and FDA-cleared antibodies. Each TMA was scored by three observers (two certified pathologists and one student) according to the new ASCO/CAP guidelines, recording both a percentage score (%-positive) and an intensity score (0-3). Scores were binarized to determine ER status (positive/negative) using both 10%- and 1%-positive cutoffs.
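The binarization step can be sketched as follows. This is a minimal illustration rather than the study's own code: the function name `er_status`, the example scores, and the convention that a score equal to the cutoff counts as positive are all assumptions.

```python
def er_status(percent_positive, cutoff):
    """Return True (ER-positive) when the percent of positive nuclei meets the cutoff.

    Assumption: a score equal to the cutoff counts as positive.
    """
    return percent_positive >= cutoff


# Hypothetical percent-positive scores for five TMA spots (not study data).
scores = [0, 0.5, 5, 10, 80]

status_1pct = [er_status(s, 1) for s in scores]
status_10pct = [er_status(s, 10) for s in scores]

# Spots whose ER status changes when the cutoff moves from 10% to 1%.
reclassified = [s for s, a, b in zip(scores, status_1pct, status_10pct) if a != b]
print(reclassified)
```

Only scores that fall between the two cutoffs can change status, which is why the choice of cutoff affects few cases when such scores are rare.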
Results: Comparing the 10% and 1% cutoffs in nine comparisons (3 TMAs × 3 observers), the maximum difference was 3.3% and the minimum was 0%. The average difference was 1.1%, and none of the differences were statistically significant. We then compared labs using the current 1% cutoff. Misclassification of ER status between Lab2 and Lab3 averaged 15.7% (± 2.8); between Lab2 and Lab4, 29.1% (± 1.2); and between Lab3 and Lab4, 17.2% (± 0.7). Among the misclassified cases, the percentage-positive scores were roughly evenly distributed from 5% to 100% positive cells.
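A between-lab misclassification rate of this kind can be computed as the share of cases whose binarized ER status differs between two labs. The sketch below is illustrative only: the abstract does not state how its averages or ± values were derived, so the function `discordance_rate`, the handling of unscored cases (`None`), and the example calls are assumptions.

```python
def discordance_rate(status_a, status_b):
    """Percent of cases scored by both labs whose ER status differs.

    Assumption: cases missing from either lab (None) are excluded.
    """
    pairs = [(a, b) for a, b in zip(status_a, status_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no cases scored by both labs")
    discordant = sum(1 for a, b in pairs if a != b)
    return 100.0 * discordant / len(pairs)


# Hypothetical ER calls for eight cases from two labs (not study data).
lab_x = [True, True, False, True, None, False, True, False]
lab_y = [True, False, False, True, True, False, False, False]

rate = discordance_rate(lab_x, lab_y)
print(f"{rate:.1f}% of jointly scored cases are discordant")
```

Repeating the calculation for each observer and averaging would give one summary figure per lab pair, under the same assumptions.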
Conclusions: Using current standard IHC methods and the ASCO/CAP guidelines, we found a highly significant level of misclassification between labs, ranging from 15.7% to 29.1%. While a limitation of this study is that it was done on TMAs, the misclassification is observed independent of the scoring guideline (10% vs 1% cutoff), suggesting the observation would be similar on whole tissue sections. If these results are generalizable, between 15% and 30% of patients may be under- or over-treated depending on which lab processes their specimen. We believe these results indicate a need for guidelines that standardize the “any immunoreactivity” ER-positivity threshold.
