Knowledge Management System of Northwest Institute of Plateau Biology, CAS
Interrogating noise in protein sequences from the perspective of protein-protein interactions prediction | |
Wang, Yongcui2; Ren, Xianwen3,4; Zhang, Chunhua5; Deng, Naiyang1; Zhang, Xiangsun6 | |
2012-12-21 | |
Journal | JOURNAL OF THEORETICAL BIOLOGY |
ISSN | 0022-5193 |
Volume | 315; Pages: 64-70 |
Article type | Article |
Abstract | The past decades witnessed extensive efforts to study the relationship among proteins. Particularly, sequence-based protein-protein interactions (PPIs) prediction is fundamentally important in speeding up the process of mapping interactomes of organisms. High-throughput experimental methodologies make many model organism's PPIs known, which allows us to apply machine learning methods to learn understandable rules from the available PPIs. Under the machine learning framework, the composition vectors are usually applied to encode proteins as real-value vectors. However, the composition vector value might be highly correlated to the distribution of amino acids, i.e., amino acids which are frequently observed in nature tend to have a large value of composition vectors. Thus formulation to estimate the noise induced by the background distribution of amino acids may be needed during representations. Here, we introduce two kinds of denoising composition vectors, which were successfully used in construction of phylogenetic trees, to eliminate the noise. When validating these two denoising composition vectors on Escherichia coli (E. coli), Saccharomyces cerevisiae (S. cerevisiae) and human PPIs datasets, surprisingly, the predictive performance is not improved, and even worse than non-denoised prediction. These results suggest that the noise in phylogenetic tree construction may be valuable information in PPIs prediction. (C) 2012 Elsevier Ltd. All rights reserved. |
Keywords | Bioinformatics ; Denoising ; Composition Vector ; Machine Learning |
WOS Headings | Science & Technology ; Life Sciences & Biomedicine |
Keywords [WOS] | AMINO-ACID-COMPOSITION ; SUBCELLULAR-LOCALIZATION ; INTERACTION NETWORKS ; INFORMATION ; COMPLEXES ; ALIGNMENT ; LOCATION |
Indexed by | SCI |
Language | English |
WOS Research Areas | Life Sciences & Biomedicine - Other Topics ; Mathematical & Computational Biology |
WOS Categories | Biology ; Mathematical & Computational Biology |
WOS Accession Number | WOS:000311194500007 |
Document type | Journal article |
Item identifier | http://210.75.249.4/handle/363003/3574 |
Collection | Northwest Institute of Plateau Biology, Chinese Academy of Sciences |
Author affiliations | 1.China Agr Univ, Coll Sci, Beijing 100083, Peoples R China 2.Chinese Acad Sci, NW Inst Plateau Biol, Key Lab Adaptat & Evolut Plateau Biota, Xining 810001, Peoples R China 3.Chinese Acad Med Sci, Inst Pathogen Biol, MOH Key Lab Syst Biol Pathogens, Beijing 100730, Peoples R China 4.Peking Union Med Coll, Beijing 100730, Peoples R China 5.Renmin Univ China, Informat Sch, Beijing 100872, Peoples R China 6.Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100190, Peoples R China |
Recommended citation, GB/T 7714 | Wang, Yongcui,Ren, Xianwen,Zhang, Chunhua,et al. Interrogating noise in protein sequences from the perspective of protein-protein interactions prediction[J]. JOURNAL OF THEORETICAL BIOLOGY,2012,315:64-70. |
APA | Wang, Yongcui,Ren, Xianwen,Zhang, Chunhua,Deng, Naiyang,&Zhang, Xiangsun.(2012).Interrogating noise in protein sequences from the perspective of protein-protein interactions prediction.JOURNAL OF THEORETICAL BIOLOGY,315,64-70. |
MLA | Wang, Yongcui,et al."Interrogating noise in protein sequences from the perspective of protein-protein interactions prediction".JOURNAL OF THEORETICAL BIOLOGY 315(2012):64-70. |
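
As a rough illustration of the encoding idea described in the abstract, the sketch below computes a plain composition vector (k-mer frequencies of a protein sequence) and a simple background-corrected variant. It is an assumption-laden sketch, not the authors' method: the function names, the uniform `background` table, and the relative-deviation formula `(observed - background) / background` are all illustrative choices, and the record does not specify the two denoising formulations the paper actually uses.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_vector(seq, k=1):
    """Return the k-mer frequencies of `seq`, ordered lexicographically over AMINO_ACIDS."""
    seq = seq.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Skip windows that contain non-standard symbols such as X or B.
    valid = [w for w in windows if all(c in AMINO_ACIDS for c in w)]
    counts = Counter(valid)
    total = len(valid)
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    return [counts[m] / total if total else 0.0 for m in kmers]


def denoised_composition_vector(seq, background):
    """Relative deviation of single-residue frequencies from a background.

    `background` maps each residue to a strictly positive expected frequency.
    This correction is a hypothetical stand-in for the paper's formulations.
    """
    observed = composition_vector(seq, k=1)
    return [(obs - background[aa]) / background[aa]
            for obs, aa in zip(observed, AMINO_ACIDS)]


# Hypothetical background: every residue equally likely.
uniform_background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

raw = composition_vector("AAC")
corrected = denoised_composition_vector("AAC", uniform_background)
```

In practice the background would more plausibly be estimated from a large sequence collection than taken as uniform; the uniform table here only keeps the example self-contained.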