[829] Digital Slides for Standardization of Gleason Grading by International Experts

L Egevad, F Algaba, DM Berney, L Boccon-Gibod, E Comperat, AJ Evans, R Grobholz, G Kristiansen, C Langner, G Lockwood, A Lopez-Beltran, R Montironi, P Oliveira, M Schwenkglenks, B Vainer, M Varma, V Verger, P Camparo. Karolinska Inst, Stockholm, Sweden; Fundacio Puigvert-Univ, Barcelona, Spain; St Barths Hospital, London, United Kingdom; Armand Trousseau, Paris, France; Pitié-Salpetrière, Paris, France; Univ Toronto, Toronto, Canada; Saarland Univ Hospital, Homburg, Germany; Univ Hospital, Zurich, Switzerland; Medical Univ, Graz, Austria; CPAC, Toronto, Canada; Cordoba Univ, Cordoba, Spain; Polytechnic Univ Marche Region, Ancona, Italy; Hospital da Luz, Lisboa, Portugal; Univ Hospital, Basel, Switzerland; Rigshospitalet, Copenhagen, Denmark; Univ Hospital, Cardiff, United Kingdom; CCITI, Dijon, France; Hopital Foch, Paris, France

Background: Our aims were to analyze the reporting of Gleason patterns (GP) 3 and 4 under the International Society of Urological Pathology (ISUP) 2005 revision of Gleason grading, to identify difficulties in interpretation, and to collect consensus cases for standardization.
Design: A set of 25 needle biopsy (NBX) cores diagnosed as Gleason score (GS) 6-7 cancer was scanned. Only GS 6 cases that were borderline to GS 7 were included. Fifteen uropathology experts graded the digital slides and encircled any GP 4 and 5 in the slide reader. Grading difficulty was scored from 1 to 3. GP 4 components were classified as Type 1 (cribriform), Type 2 (fused) or Type 3 (poorly formed glands). After individual review, the experts met to analyze diagnostic difficulties and agree on a set of consensus cases.
Results: GS 5-6, 7 (3+4), 7 (4+3) and 8-9 were assigned in 29%, 41%, 19% and 10%, respectively (mean GS 6.84, range 6.44-7.36). In 15 cases, at least 67% of observers agreed on GS groups. Mean weighted kappa of interobserver reproducibility for GS and GS groups was 0.346 and 0.429, respectively. Difficulty scores of 1, 2 and 3 were given in 58%, 32% and 10%, respectively. Mean difficulty scores in consensus and non-consensus cases were 1.44 and 1.66 (p = 0.003). When a GP 4 was reported, Types 1, 2 and 3 were seen in 28%, 86% and 67%, respectively. Types 2 and 3 were found together in 41%, and all three types co-existed in 16% (11% and 23% in consensus and non-consensus cases, p = 0.03). Average estimated and calculated %GG4/5 were 29% and 16%. Areas of GP 4 and 5 were displayed as heat maps with overlaying regions. The maps were helpful for identifying contentious areas. A key problem was agreeing on minimal criteria for small foci of GP 4.
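As an illustration of the agreement statistics reported above, the following is a minimal sketch of a mean pairwise weighted kappa and of a consensus rule of the "at least 67% of observers" kind. It is not the authors' analysis code. The abstract does not state the weighting scheme, so linear weights are an assumption here, and the 67% threshold is read as two thirds. The function names and the encoding of GS groups as integers 0..k-1 are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def weighted_kappa(a, b, n_categories):
    """Cohen's kappa for two raters with linear disagreement weights (assumed scheme).

    a, b: equal-length lists of ordinal category indices in 0..n_categories-1.
    Undefined (division by zero) if the expected weighted disagreement is zero,
    e.g. when both raters use a single identical category throughout.
    """
    n = len(a)
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1.0 - observed_disagreement / expected_disagreement


def mean_pairwise_kappa(ratings, n_categories):
    """Average weighted kappa over all pairs of observers.

    ratings: one list of category indices per observer, cases in the same order.
    """
    pairs = list(combinations(ratings, 2))
    return sum(weighted_kappa(a, b, n_categories) for a, b in pairs) / len(pairs)


def consensus_cases(ratings, threshold=Fraction(2, 3)):
    """Indices of cases where the most common grade reaches the threshold share."""
    n_observers = len(ratings)
    selected = []
    for case in range(len(ratings[0])):
        votes = [observer[case] for observer in ratings]
        top = max(votes.count(v) for v in set(votes))
        if Fraction(top, n_observers) >= threshold:
            selected.append(case)
    return selected
```

With 15 observers, the two-thirds rule in this sketch corresponds to at least 10 observers assigning the same GS group to a case.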
Conclusions: There is still considerable disagreement among experts on how to report borderline GS 6-7 cases. The detection threshold for minimal foci of GP 4 in NBX needs to be better defined.
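The heat maps of GP 4 and 5 areas described in the Results can be pictured with a minimal sketch such as the one below. It assumes, hypothetically, that each observer's encircled regions have already been rasterized onto a common grid as a binary mask (1 = marked as GP 4 or 5). The abstract does not describe how the maps were built, and the definition of a "contentious" cell used here (marked by some but not all observers) is an illustrative assumption, as are the function names.

```python
def annotation_heat_map(masks):
    """Count, per grid cell, how many observers marked that cell.

    masks: one binary grid (list of rows of 0/1) per observer, all the same shape.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    heat = [[0] * cols for _ in range(rows)]
    for mask in masks:
        for r in range(rows):
            for c in range(cols):
                heat[r][c] += mask[r][c]
    return heat


def contentious_cells(heat, n_observers):
    """Cells marked by at least one observer but not by all (assumed definition)."""
    return [
        (r, c)
        for r, row in enumerate(heat)
        for c, count in enumerate(row)
        if 0 < count < n_observers
    ]


# Toy example with three observers on a 2 x 3 grid.
masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 1, 0]],
]
heat = annotation_heat_map(masks)
disputed = contentious_cells(heat, n_observers=len(masks))
```

In this toy example only the top-left cell is marked by all three observers, so the other marked cells are flagged as disputed.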
