[445] Interobserver Agreement in Thyroid FNA (Fine Needle Aspiration) Diagnosis Using the New Thyroid Bethesda System Terminology (TBST) Classification: A Tertiary Care Community Hospital Experience.

Anshu Trivedi, Michael O'Donnell, Mary Fiel-Gan, Saverio Ligato, Theresa Voytek, Srinivas Mandavilli. Hartford Hospital, CT

Background: There is limited literature evaluating the reproducibility and accuracy of the recently proposed TBST, in particular the category of follicular lesion/atypical cells of undetermined significance (FLUS/AUS). We recently adopted TBST in our practice, and the aim of this study was to evaluate the interobserver variability among multiple observers in the use of TBST.
Design: A study set of 60 cases of thyroid FNA with surgical pathology (SP) follow-up was created using 47 cases signed out as "cellular follicular lesion with atypia" or any cases flagged as "atypical" and 13 cases of hyperplastic nodule. Six observers (5 pathologists and 1 cytotechnologist, blinded to the FNA and final tissue diagnoses) classified these cases according to TBST after a joint session of viewing text and images from the TBST Atlas. SP follow-up included: 26 cases of hyperplastic/colloid nodule (HAN/CN), 24 cases of follicular adenoma (FA) and 10 cases of papillary thyroid carcinoma (PTC). Data were analyzed for paired interobserver agreement using Cohen's Kappa and for multiobserver agreement using Fleiss' Kappa statistic. Accuracy of each rater was computed against the final SP diagnosis.
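The two agreement statistics named in the Design can be illustrated with a minimal sketch. The abstract does not state which software was used, so the function names `cohen_kappa` and `fleiss_kappa` and the category labels below are hypothetical and serve only to show how each statistic is computed. None of the values are study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of cases")
    # Observed agreement: fraction of cases given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    # Undefined (division by zero) if both raters use a single identical label.
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds the labels all raters gave to case i.

    Every case must be rated by the same number of raters (at least 2).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    totals = Counter()
    p_bar = 0.0
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        p_bar += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar /= n_cases
    # Chance agreement from the pooled category proportions.
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical labels for illustration only (not data from this study).
    path_1 = ["benign", "benign", "FLUS/AUS", "malignant"]
    path_2 = ["benign", "FLUS/AUS", "FLUS/AUS", "malignant"]
    print(round(cohen_kappa(path_1, path_2), 2))

    # One row per case, one entry per observer (three observers here).
    cases = [
        ["benign", "benign", "FLUS/AUS"],
        ["FLUS/AUS", "FLUS/AUS", "benign"],
    ]
    print(round(fleiss_kappa(cases), 2))
```

In this sketch, pairwise agreement would be obtained by calling `cohen_kappa` once per pair of observers, while `fleiss_kappa` takes all observers' labels for every case at once.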
Results: Overall interobserver agreement (Kappa) across all TBST diagnostic categories was 0.26 among the 6 observers. Interobserver agreement (Cohen's Kappa) among pairs of the 6 observers ranged from 0.16 to 0.69. Table 1 shows the percentage of cases each observer classified as FLUS/AUS and their surgical follow-up, including the multiobserver Kappa agreement for each of the three diagnostic categories:

Table 1. % of cases classified as FLUS/AUS by the 6 observers and surgical follow-up
          Path 1    Path 2    Path 3    Path 4    Path 5    Path 6    Kappa

Cohen's Kappa values for each of the 6 observers, comparing 3-tier cytology classification against SP diagnoses, were: 0.40, 0.14, 0.33, 0.30, 0.28, and 0.41.
Conclusions: 1. Overall interobserver agreement in using TBST to classify 60 cases among the 6 observers was fair. 2. Agreement among observers in the FLUS/AUS diagnosis against the final SP diagnosis was poor to fair. 3. The correlation of individual TBST diagnosis and final SP follow-up was fair at best. 4. This suggests the need for additional refinement, including joint sessions at the multi-headed microscope, to improve agreement, particularly in the category of FLUS/AUS. 5. Replication using a more extensive test sample and a random set of raters would be useful to quantify random variability in agreement for TBST classification.
Category: Cytopathology

Wednesday, March 2, 2011 1:00 PM

Poster Session VI # 53, Wednesday Afternoon

