Identification of Protein Expression Signatures in Gastric Carcinomas Using Clustering Analysis
MD Begnami, JH Fregnani, S Nonogaki, C Torres, H Brentani, FA Soares. Hospital AC Camargo, Sao Paulo, Brazil
Background: The identification of gastric carcinomas (GC) has traditionally been based on histomorphology. Recently, DNA microarrays have been used successfully to identify tumors by clustering their expression profiles, and many tumors can be clustered into clinically relevant groups based solely on gene expression. Expression profiling and molecular grouping of GC have been challenging because of the complexity and variation of these tumors. Random forest clustering is attractive for tissue microarray and other immunohistochemical data because it handles highly skewed tumor marker expression well and weights the contribution of each marker according to its relatedness to other tumor markers. In this study we identified biologically and clinically meaningful groups of GC by hierarchical clustering analysis of immunohistochemical protein expression.
Design: We selected 28 proteins (p16, p27, p21, cyclins D1, A, and B1, pRb, p53, c-met, c-erbB-2, VEGF, TGFI, TGFII, MSH2, bcl-2, bax, bak, bcl-x, APC, clathrin, E-cadherin, β-catenin, MUC1, MUC2, MUC5AC, MUC6, MMP2, and MMP9) for investigation by immunohistochemistry in 482 GC. Data were analyzed using a random forest clustering method (TMEV, http://www.tm4.org/mev.html). This unsupervised learning method aims to find molecular classifications with distinct global expression profiles while blinded to clinicopathological covariates. We used several statistical methods to describe the clusters in terms of clinicopathological variables and tumor marker expression.
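For illustration only, the idea of grouping tumors by their marker profiles without reference to clinical covariates can be sketched as follows. This is not the authors' pipeline: it swaps in plain average-linkage hierarchical clustering on Euclidean distance rather than a random forest dissimilarity or the TMEV software, and the function name `average_linkage`, the variable `toy_scores`, and the 0 to 3 scoring scale are hypothetical. The scores are synthetic, not study data.

```python
from itertools import combinations
from math import dist


def average_linkage(scores, n_clusters=2):
    """Agglomerative clustering with average linkage on Euclidean distance.

    scores: one marker-score vector per tumor (hypothetical 0-3 IHC scores).
    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    # Start with every tumor in its own cluster.
    clusters = [[i] for i in range(len(scores))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(dist(scores[i], scores[j]) for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        # Replace the two closest clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Synthetic scores for six tumors (rows) and four markers (columns).
toy_scores = [
    [3, 3, 0, 1],
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 1, 3, 3],
    [1, 0, 3, 2],
    [0, 0, 2, 3],
]

groups = average_linkage(toy_scores, n_clusters=2)
print(groups)  # [[0, 1, 2], [3, 4, 5]]
```

Clinicopathological variables play no part in forming the groups here; they would only be compared across clusters afterwards, which is the sense in which the analysis is unsupervised.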
Results: Proteins related to cell cycle, growth factors, cell motility, cell adhesion, apoptosis, and matrix remodeling were highly expressed in GC. We identified protein expression associated with poor survival in diffuse-type GC, including p53 and TGFII. Based on the protein expression data, a two-way clustering algorithm distinguished two groups (clusters) of GC. Clinicopathological covariates (metastasis status and TNM stage) differed across clusters. In addition, the clustering analysis identified a cluster of diffuse GC associated with better survival.
Conclusions: Our study identified: 1) two groups of GC that could not be explained by any clinicopathological variables, and 2) a subgroup of long-surviving diffuse GC patients with a distinct molecular profile. These results provide not only a new molecular basis for understanding the biological properties of GC, but also better prediction of survival than the classical pathological grouping.
Tuesday, March 10, 2009 1:00 PM
Platform Session: Section B, Tuesday Afternoon