Interlaboratory variation in results from immunohistochemical assessment of estrogen receptor status

Document Type

Letter to the Editor

Publication Title

The Breast Journal

US and Canadian Academy of Pathology 2002 Annual Meeting


To the Editor:

Analysis of steroid hormone receptors (estrogen receptor [ER] and progesterone receptor [PR]) has become a standard component of the pathologic examination of breast carcinoma specimens. Laboratories vary significantly in antibody clone and titer, quantification technique, and the cut point used to characterize a breast carcinoma as positive, borderline, or negative for ER (1). European studies have shown variability in results of ER analysis among laboratories (2-4). Less is known about the degree of agreement among laboratories in the United States that analyze material for ER by immunohistochemistry.

To study variability among laboratories due to differences in immunohistochemical quantitation methods and cut points, we obtained 35 paraffin blocks containing breast adenocarcinoma. Three pairs of sequential sections were cut from each block, and one pair was sent by overnight mail to each of three institutions. The institutions were instructed to perform ER analysis using their usual methodology, quantification techniques, cut points, and reporting protocols. Antigen retrieval steps and immunohistochemical staining were performed on the same day by each of the three participating institutions. Results were tabulated and interinstitutional correlation was made. Two of the three institutions utilized computer‐assisted microdensitometry (Cell Analysis System (CAS), Becton Dickinson, Franklin Lakes, NJ), while the remaining institution used manual estimation.

In nine cases (26%), significant disagreement occurred: either one institution assayed ER as positive while the remaining institutions considered the tumor to be ER negative, or a single institution reported a negative result while the other institutions considered the tumor to be ER positive. In five of the nine cases, the two institutions using the CAS did not agree, and in three cases the disagreement was between the site using manual estimation and the CAS sites. Minor disagreements, defined as at least one site assaying the carcinoma as borderline for ER and another site designating it as either positive or negative for ER, occurred in an additional nine cases. In four of these the sites using computer‐assisted microdensitometry agreed, whereas in five instances the two CAS sites disagreed.
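The major and minor disagreement categories defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the laboratories used: the function name `classify_case` and the example calls are hypothetical, and a case in which one site reports borderline while the other two split positive and negative is counted here as minor, a situation the definitions above do not address explicitly.

```python
from collections import Counter

def classify_case(calls):
    """Label one tumor from the three laboratories' ER calls.

    calls: three strings, each "positive", "borderline", or "negative".
    """
    kinds = set(calls)
    if len(kinds) == 1:
        return "agreement"          # all three sites reported the same call
    if "borderline" in kinds:
        return "minor"              # assumption: any borderline call makes it minor
    return "major"                  # only positive and negative remain: a 2-to-1 split

# Hypothetical calls for four tumors (not data from this study).
cases = [
    ("positive", "positive", "positive"),
    ("positive", "negative", "negative"),
    ("borderline", "positive", "positive"),
    ("negative", "negative", "positive"),
]
tally = Counter(classify_case(c) for c in cases)

# The rate reported above: nine major disagreements among 35 blocks.
major_rate = round(100 * 9 / 35)    # 26
```

With three sites and no borderline call, any disagreement is necessarily a two-to-one split, which is why the last branch needs no further test.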

In 32 cases, the percentage of nuclei staining for ER was available from all three laboratories. When a single cut point separating positive from negative was used uniformly, nine cases (28%) showed disagreement among the three laboratories as to whether a case was positive or negative.
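The uniform cut point comparison can be illustrated with a short Python sketch. The cut point of 10% and the percentages below are hypothetical values chosen for illustration, since the value applied in this analysis is not stated above; `discordant_cases` is likewise an assumed helper name.

```python
def discordant_cases(percent_by_case, cut_point):
    """Return the indices of cases whose calls differ under one shared cut point.

    percent_by_case: one tuple per tumor holding the percentage of nuclei
    staining for ER reported by each laboratory.
    """
    discordant = []
    for index, percents in enumerate(percent_by_case):
        # A case is called positive by a laboratory when its percentage
        # meets or exceeds the shared cut point.
        calls = {p >= cut_point for p in percents}
        if len(calls) > 1:
            discordant.append(index)
    return discordant

# Hypothetical percentages for four tumors (not data from this study).
percents = [(80, 75, 90), (5, 20, 2), (0, 0, 3), (12, 8, 15)]
flagged = discordant_cases(percents, cut_point=10)   # assumed cut point of 10%

# The rate reported above: nine discordant cases among 32.
uniform_rate = round(100 * 9 / 32)                   # 28
```

Because every laboratory is held to the same threshold, any discordance that remains reflects differences in the measured percentages rather than in cut point criteria.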

The immunohistochemical method for ER quantification and the CAS quantification method have been shown to compare favorably (5, 6) with the dextran-coated charcoal (DCC) method. However, laboratories using immunohistochemistry for the analysis of ER show little consistency in antibody clone, titer, quantitation technique, or cut point (1). Despite such variability, clinicians treating breast disease often base important therapeutic decisions on results reported as “positive” or “negative” (1). Wide variability in technique does not necessarily produce clinically significant interlaboratory variability in results. However, our study indicates that interlaboratory variation is significant, in that a major disagreement in defining the tumor as positive or negative for ER occurred in 26% of cases. This variability occurred when each laboratory used its standard technique and cut points. Depending on which laboratory the clinician used, he or she might give or withhold tamoxifen therapy based on the ER results. While a greater degree of interlaboratory agreement would most likely have been achieved if identical antibody clones, titers, methods of quantification, and cut points had been used, such standardization does not reflect current medical practice. Hence this study mirrors the degree of variability that is to be expected between laboratories performing routine ER analyses. Multiple factors can explain interlaboratory variation in ER analysis results, including differences in fixation, antigen retrieval, antibody clone, antibody titer, quantification technique, and cut points employed. Rhodes et al. (4) showed that variations in fixation and tissue preparatory methods are not significant contributory factors in preventing optimal demonstration of ER in routine material.

Important variables in the present study include antigen retrieval methodology, antibody clone and titer, quantitation methods, and cut points utilized. Rhodes et al. (2) showed antigen retrieval methodology to be an important variable in assessment of ER levels. In our study, the antigen retrieval methods differed among the participating laboratories. Two laboratories used microwave retrieval; the third used an electric pressure cooker. Agreement between the two laboratories using microwave processing was no greater than agreement between either of those laboratories and the laboratory using the alternative heating method.

While antibody clone and titer could be an important cause of laboratory variation, our study tends to discount this as a predominant cause of the observed variability. Two laboratories used the same antibody clone (1D5). They agreed with each other no more often than either did with the laboratory using a different antibody clone and titer. In a prior study (5), antibody clone produced statistically significant variation in measured ER level when three commercially available clones were compared.

Quantification method might be an important source of interlaboratory variability. In our study only two methods of quantification were used. Two groups utilized a microdensitometry methodology employing the CAS instrument and one institution manually estimated the percent of nuclei staining. Of interest is that less agreement was seen between the results obtained by the two institutions utilizing the CAS method than was found between the manual method and one of the institutions utilizing CAS. Computer‐assisted microdensitometry did not improve interlaboratory agreement over manual estimates of the percent of nuclei staining.

Cut points defining negative, borderline, and positive varied substantially between the three laboratories. Despite these differences, laboratories A and C agreed on the designation of ER as positive or negative in 60% of cases, while laboratories C and A agreed with laboratory B in 59% and 62% of cases, respectively. There did not appear to be any trend toward one institution more frequently having the dissenting assay result.
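Pairwise agreement of this kind can be computed as the fraction of cases in which two laboratories report the same call. The Python sketch below is a minimal illustration: the labels A, B, and C follow the text, but the calls are hypothetical and the helper name `pairwise_agreement` is an assumption.

```python
from itertools import combinations

def pairwise_agreement(calls_by_lab):
    """Return the fraction of cases on which each pair of laboratories agrees.

    calls_by_lab: dict mapping a laboratory label to its list of calls,
    with every list in the same case order.
    """
    agreement = {}
    for (lab_x, calls_x), (lab_y, calls_y) in combinations(sorted(calls_by_lab.items()), 2):
        matches = sum(x == y for x, y in zip(calls_x, calls_y))
        agreement[(lab_x, lab_y)] = matches / len(calls_x)
    return agreement

# Hypothetical positive/negative calls for five tumors (not data from this study).
calls = {
    "A": ["positive", "positive", "negative", "negative", "positive"],
    "B": ["positive", "negative", "negative", "positive", "positive"],
    "C": ["positive", "positive", "positive", "negative", "positive"],
}
rates = pairwise_agreement(calls)
```

Raw percent agreement is the quantity reported above; it makes no correction for agreement expected by chance.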

Even when uniform cut points and quantification techniques were used, no significant improvement in overall agreement was achieved (28% disagreement versus an original value of 26% disagreement). Hence differences in cut‐point criteria were not exclusively responsible for the variability of results observed.

Our study shows that considerable variability exists between ER results obtained by different laboratories on the same tissue blocks. The differences we observed are of a sufficient magnitude (60% concordance) to result in clinically significant differences in therapy. While some clinicians may treat patients with tamoxifen regardless of ER level, a prior survey (1) showed that many clinicians use ER/PR results to determine whether to treat with tamoxifen. Until standard techniques and cut points are established, it would appear prudent for laboratories to state their antigen retrieval technique, antibody clone and titer, and cut point utilized when reporting their ER results.