In:
Cancer Research, American Association for Cancer Research (AACR), Vol. 81, No. 13_Supplement (2021-07-01), p. 2312
Abstract:
Background: Lung cancer is the leading cause of cancer death worldwide. Identifying tumor-associated genes is an effective approach to early cancer detection, prognosis prediction and targeted therapy development. With the rapid growth of high-throughput sequencing techniques, large volumes of lung cancer gene expression data have been generated and deposited in public data banks. In this study, we used bioinformatic and statistical techniques to analyze data from public data banks and identify novel genes associated with the prognosis of non-small cell lung cancer (NSCLC). Methods: Transcriptome data and clinical information for 1020 NSCLC cases were downloaded from The Cancer Genome Atlas (TCGA). After removing incomplete records, we obtained 1008 NSCLC cases. All gene expression data were normalized using the differential analysis software DESeq2 in R. The normalized data were further analyzed by weighted gene co-expression network analysis to screen for differentially expressed tumor-associated genes. By applying protein-protein interaction (PPI) network analysis in combination with clinical information, we identified hub genes. Using univariate and multivariate Cox regression analysis, we identified several genes associated with NSCLC prognosis. Results: DESeq2 analysis yielded 2873 differentially expressed genes at thresholds of padj < 0.01 and fold change > 4. Among them, 1955 were upregulated and the rest were downregulated. Weighted gene co-expression network analysis produced 50 blocks after data clustering (7 cases were outliers and were removed). After linking the blocks to the clinical information, we found that the yellow block contained the most NSCLC-related genes. This block contains a total of 2803 genes, 417 of which were differentially expressed tumor genes.
Further analysis using univariate and multivariate Cox regression identified five genes (SPP1, PKMYT1, PRB11, LINC0116 and PTPRM) that were associated with poor prognosis of NSCLC. In the combined analysis, the 5-year survival rate was 31.3% for early-stage NSCLC and 4.1% for late-stage NSCLC. Conclusion: Transcriptome analysis is a good approach for identifying disease-associated genes and can offer a quick and reliable route to developing clinical applications. Citation Format: Guoqiang Liang, Jinqiu Sun, Yiwei Liu, Dongxue Liu, Shengxian Liang, Rui Guo, Li Zhong. Identification of tumor associated genes using transcriptome analysis [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2312.
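Illustrative note: the differential-expression cutoff quoted in the abstract (padj < 0.01, fold change > 4) can be sketched as a simple filter over a results table. This is a minimal sketch in Python, not the authors' R code. It assumes that each row carries a log2 fold change and an adjusted p-value, as in the `log2FoldChange` and `padj` columns that DESeq2 reports, and that the fold-change cutoff is applied in both directions (|log2 fold change| > 2), which the abstract does not state explicitly. The function names, gene labels and values below are hypothetical.

```python
import math

# Thresholds quoted in the abstract. Applying the fold-change cutoff to both
# up- and downregulated genes is an assumption of this sketch.
PADJ_MAX = 0.01
MIN_FOLD_CHANGE = 4.0


def classify_gene(log2_fold_change, padj):
    """Return 'up', 'down', or None for one row of a results table."""
    # Adjusted p-values can be missing (NA) for some genes; skip those rows.
    if padj is None or math.isnan(padj):
        return None
    if padj >= PADJ_MAX:
        return None
    cutoff = math.log2(MIN_FOLD_CHANGE)  # fold change > 4 means |log2FC| > 2
    if log2_fold_change > cutoff:
        return "up"
    if log2_fold_change < -cutoff:
        return "down"
    return None


def split_differential(rows):
    """Split (gene, log2 fold change, padj) rows into up/down gene lists."""
    up, down = [], []
    for gene, log2_fold_change, padj in rows:
        label = classify_gene(log2_fold_change, padj)
        if label == "up":
            up.append(gene)
        elif label == "down":
            down.append(gene)
    return up, down


# Hypothetical rows with made-up values, for illustration only.
example_rows = [
    ("gene_a", 3.1, 1e-6),           # passes both cutoffs, upregulated
    ("gene_b", -2.5, 0.004),         # passes both cutoffs, downregulated
    ("gene_c", 2.4, 0.2),            # fails the padj cutoff
    ("gene_d", 1.2, 1e-8),           # fails the fold-change cutoff
    ("gene_e", -3.0, float("nan")),  # missing padj, skipped
]
up_genes, down_genes = split_differential(example_rows)
print("up:", up_genes, "down:", down_genes)
```

Both cutoffs are strict inequalities here, matching the "<" and ">" signs in the abstract, so a gene with exactly a fourfold change is not counted.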
Type of Medium:
Online Resource
ISSN:
0008-5472, 1538-7445
DOI:
10.1158/1538-7445.AM2021-2312
Language:
English
Publisher:
American Association for Cancer Research (AACR)
Publication Date:
2021
ZDB-ID:
2036785-5, 1432-1, 410466-3