Analyses you can run on the CCLE database

Reposted from: http://www.bio-info-trainee.com/1327.html

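With so much expression, copy number variation, and mutation data collected for cancer cell lines, it would be a waste to let it sit and gather dust!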

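This data can be put to use in many ways, yet a Google search turns up few papers that cite it. I picked a few of them and worked through how the authors used the data, mainly the mRNA expression data.
The first paper processed the CCLE data in eight steps, and any competent bioinformatics analyst should be able to reproduce the whole procedure: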
step1:Affymetrix U133 Plus2 DNA microarray gene expressions of 27 gastric cancer cell lines (Kato-III, IM95, SNU-620, SNU-16, OCUM-1, NUGC-4, 2313287, HUG1N, MKN45, NCIN87, KE39, AGS, SNU-5, SNU-216, NUGC-3, NUGC-2, MKN74, MKN7, RERFGC1B, GCIY, KE97, Fu97, SH10TC, MKN1, SNU-1, Hs746 T, HGC27) were downloaded from Cancer Cell Line Encyclopedia (CCLE) [16] in March 2013.
step2: Robust Multi-array Average (RMA) normalization was performed. Principal component analysis plot show no obvious batch effect.
step3: The normalized data is then collapsed by taking the probe sets with highest gene expression.
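The first three steps produce the mRNA expression matrix for the 27 gastric cancer cell lines: download the CEL files, normalize with RMA, and for genes covered by several probe sets keep the probe set with the highest expression.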
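The collapsing in step 3 can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes "highest gene expression" means the highest mean across samples, and the function name, probe IDs, and gene symbols are all hypothetical toy values.

```python
from statistics import mean


def collapse_probes(expr, probe_to_gene):
    """Keep, for each gene, the probe set with the highest mean expression.

    expr: dict mapping probe ID -> list of expression values (one per sample)
    probe_to_gene: dict mapping probe ID -> gene symbol
    Assumption: "highest expression" is taken as the highest mean across samples.
    """
    best = {}  # gene -> (mean expression, probe ID)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene is None:  # skip probes without a gene annotation
            continue
        score = mean(values)
        if gene not in best or score > best[gene][0]:
            best[gene] = (score, probe)
    # Build the gene-level matrix from the winning probe of each gene
    return {gene: expr[probe] for gene, (_, probe) in best.items()}


# Toy data, made up for illustration only
expr = {
    "p1": [5.0, 6.0, 7.0],
    "p2": [8.0, 9.0, 10.0],
    "p3": [4.0, 4.5, 5.0],
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_expr = collapse_probes(expr, probe_to_gene)
```

Other collapsing rules exist (for example the most variable probe set), so the rule should be matched to whatever the paper being reproduced actually did.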

step4:Unsupervised hierarchical clustering (1-Spearman distance, average linkage) was performed on the cell lines using the aCGH data.

Putative driver genes of which copy number aberrations correlated to mRNA gene expression were identified to determine subtypes or clusters that are driven by different mechanisms. This was done using Mann Whitney U-test with p<0.05, and Spearman Correlation Coefficient test with Rho >0.6.
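The correlation filter described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names and toy values are hypothetical, only the Rho > 0.6 criterion is implemented, and the Mann Whitney U-test is left out. The same `spearman_rho` function also gives the 1-Spearman distance of step 4 as `1 - spearman_rho(x, y)`.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def putative_drivers(copy_number, expression, rho_cutoff=0.6):
    """Genes whose copy number correlates with expression above the cutoff."""
    return sorted(
        gene
        for gene in copy_number
        if gene in expression
        and spearman_rho(copy_number[gene], expression[gene]) > rho_cutoff
    )


# Toy data, made up for illustration only (one value per cell line)
copy_number = {"GENE_A": [1, 2, 3, 4, 5], "GENE_B": [1, 2, 3, 4, 5]}
expression = {
    "GENE_A": [2.0, 2.5, 4.0, 3.9, 6.0],
    "GENE_B": [5.0, 1.0, 4.0, 2.0, 3.0],
}
drivers = putative_drivers(copy_number, expression)
```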

step5:We then performed consensus clustering [17] on the gene expression data of the 27 gastric cancer cell lines from CCLE using these putative driver genes. We selected k = 2 as it gives sufficiently stable similarity matrix.

step6: In order to assign new samples to this integrative cluster, significance analysis of microarray (SAM) [18] with threshold q<2.0 was used to generate subtype signature based on the mRNA expression data of the 1762 genes from the 27 gastric cancer cell lines in CCLE.

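In short: cluster the cell lines on the aCGH copy number data and identify putative driver genes, then cluster again on the expression of those genes to split the cell lines into two groups, and finally run SAM on the two groups to find differentially expressed genes.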

step7:ssGSEA (single sample GSEA) was used to estimate pathway activities of the gastric cancer cell line in the Molecular Signature Database v3.1 (Msigdb v3.1) [19][20]. The pathway activities are represented in enrichment scores which were rank normalized to [0.0, 1.0].
step8:SAM analysis was performed with threshold q<0.2, and fold change >2.0 (for up-regulated pathways), or <0.5 (for down-regulated pathways) to obtain subtype-specific pathways from the 27 gastric cell lines in CCLE.
Both gene set enrichment analysis and hypergeometric enrichment analysis are used here; see the paper for the results.
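Two pieces of steps 7 and 8 are easy to illustrate: rank normalization of enrichment scores to [0.0, 1.0], and the fold change thresholds (>2.0 up, <0.5 down). The sketch below is an assumption about how these could be computed, not the paper's implementation. The function names, pathway names, and scores are hypothetical, and the SAM q-value filter is not reproduced.

```python
from statistics import mean


def rank_normalize(scores):
    """Map scores to [0.0, 1.0] by rank: lowest -> 0.0, highest -> 1.0.

    Tied scores share their average rank. This is one plausible reading of
    "rank normalized"; the paper does not spell out the exact procedure.
    """
    n = len(scores)
    if n < 2:
        return [0.0] * n
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2  # zero-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return [r / (n - 1) for r in ranks]


def classify_pathway(scores_a, scores_b, up=2.0, down=0.5):
    """Label a pathway by the fold change of mean scores, subtype A over B."""
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    if mean_b == 0:  # avoid division by zero on rank-normalized scores
        return "up" if mean_a > 0 else "unchanged"
    fold_change = mean_a / mean_b
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "unchanged"


# Toy values, made up for illustration only
normalized = rank_normalize([-0.3, 0.8, 0.1])
pathways = {
    "PATHWAY_X": ([0.8, 0.9], [0.2, 0.3]),
    "PATHWAY_Y": ([0.1, 0.2], [0.6, 0.6]),
    "PATHWAY_Z": ([0.5, 0.5], [0.4, 0.6]),
}
labels = {name: classify_pathway(a, b) for name, (a, b) in pathways.items()}
```

In a real analysis the fold change filter would be applied only to pathways that also pass the q<0.2 significance threshold.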
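The second paper used CCLE for just one thing: a boxplot of a single gene's expression across cancer types.
Both the expression data and the sample classification for such a figure can be retrieved with GEOquery, so it is very easy to reproduce.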
Further, the Cancer Cell Line Encyclopedia (CCLE) database demonstrated that of 1062 cell lines representing 37 distinct cancer types, glioma cell lines express the highest levels of STK17A

The conclusion: STK17A is highly expressed in glioma cell lines compared to other cancer types. Data was obtained through the Cancer Cell Line Encyclopedia (CCLE).
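The comparison behind such a boxplot amounts to grouping one gene's expression by cancer type and ordering the groups. A minimal sketch, assuming the expression values and annotations are already loaded into dictionaries; the cell line names and numbers below are invented placeholders, not CCLE data.

```python
from collections import defaultdict
from statistics import median


def median_by_type(expression, cancer_type):
    """Group one gene's expression by cancer type, sorted by median (high first).

    expression: dict mapping cell line -> expression value of the gene
    cancer_type: dict mapping cell line -> cancer type label
    """
    groups = defaultdict(list)
    for cell_line, value in expression.items():
        groups[cancer_type[cell_line]].append(value)
    return sorted(
        ((label, median(values)) for label, values in groups.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Placeholder values, made up for illustration only (not real CCLE measurements)
expression = {
    "line1": 9.1, "line2": 8.7, "line3": 9.4,  # hypothetical glioma lines
    "line4": 6.2, "line5": 6.8,                # hypothetical breast lines
    "line6": 7.0, "line7": 7.5, "line8": 6.9,  # hypothetical lung lines
}
cancer_type = {
    "line1": "glioma", "line2": "glioma", "line3": "glioma",
    "line4": "breast", "line5": "breast",
    "line6": "lung", "line7": "lung", "line8": "lung",
}
ranking = median_by_type(expression, cancer_type)
```

The per-type lists in `groups` are also exactly what a plotting library needs to draw the boxplot itself.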

The third paper: http://www.nature.com/ncomms/2013/130709/ncomms3126/fig_tab/ncomms3126_F4.html

This paper is simpler still; it clusters the expression matrix directly:
The 5,000 most variable genes were used for unsupervised clustering of cell lines by mRNA expression data. Cell lines are colour-coded (vertical bars) according to the reported tissue of origin (a PDF version that can be enlarged at high resolution is in Supplementary Fig. S4); horizontal labels at bottom indicate the dominating tissue types within the respective branches of the dendrogram. Most ovarian cancer cell lines (magenta) cluster together, interspersed with endometrial cell lines. However, some ovarian cancer cell lines cluster with other tissue types (*). Top right panels: neighbourhoods (1) of the top cell lines in our analysis, (2) of cell line IGROV1, and (3) of cell line A2780. For the ovarian cancer cell lines in these enlarged areas, the histological subtype as assigned in the original publication is indicated by coloured letters.
Simply take the whole expression matrix, select the 5,000 most variable genes, cluster on them, and you get a similar figure.
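Selecting the most variable genes can be sketched like this. It assumes "most variable" means the highest sample variance (the paper may have used another measure, such as the median absolute deviation); the function name and the three-gene toy matrix are hypothetical.

```python
from statistics import variance


def most_variable_genes(expr, n_top):
    """Return the n_top gene names with the highest sample variance.

    expr: dict mapping gene -> list of expression values (one per cell line)
    Assumption: variability is measured by sample variance.
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n_top]


# Toy matrix, made up for illustration only
expr = {
    "G1": [1.0, 1.0, 1.0],  # constant, variance 0
    "G2": [1.0, 5.0, 9.0],  # variance 16
    "G3": [2.0, 3.0, 4.0],  # variance 1
}
top_genes = most_variable_genes(expr, n_top=2)
```

With the real matrix, `n_top=5000` would give the gene subset to pass to the hierarchical clustering step.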