__Protein Pocket Similarity Networks__: Predicting gene ontology functions from protein's regional surface structures


* Zhi-Ping Liu, Ling-Yun Wu, Yong Wang, Luonan Chen, and Xiang-Sun Zhang. __[Predicting gene ontology functions from protein's regional surface structures|http://www.biomedcentral.com/1471-2105/8/475]__. ''BMC Bioinformatics'', Vol. 8, 475, 2007. {{([PubMed: 18070366|http://www.ncbi.nlm.nih.gov/pubmed/18070366])}}


This is the supplementary materials page for the paper "Predicting Gene Ontology Functions from Protein's Regional Surface Structures". The paper presents a method for predicting protein functions (specifically GO terms) from patterns on the protein surface. This page provides the supporting materials and other resources referred to in the paper, including the source code, examples of the processed data, the statistics, and the prediction results. Note: You may use and redistribute the code and data under the terms of the [GNU|http://www.gnu.org/] [General Public License (GPL)|http://www.gnu.org/copyleft/gpl.html].


From a geometric perspective, pockets or clefts are concavities on the protein surface whose open mouths are accessible to the bulk solution. Sites of this kind are strongly related to protein function. We build Pocket Similarity Networks (PSN) to describe the similarity among these pockets at different similarity thresholds. To explore the relationship between pocket similarity and GO term similarity, we compute the percentage of proteins with similar pockets that share identical GO terms, as well as the semantic similarity of their GO functions. The topological structure of the network is then used to predict functions: the closest neighbors of a pocket and the connected components of the network are used to predict a target protein's functions (GO terms) through a scoring scheme. We test the performance of the method with cross-validation experiments on a large set of proteins from different families. We also predict the GO functions of some unannotated proteins.
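
The network-based prediction idea described above can be illustrated with a minimal Python sketch. It is not the code distributed in PSN_GPL and it does not reproduce the paper's scoring scheme. The function names, the pairwise similarity dictionary, the protein labels, and the scoring rule (the sum of edge similarities from direct neighbors plus a small fixed weight for other members of the same connected component) are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_network(similarities, threshold):
    """Keep an edge between two pockets when their similarity meets the threshold.

    `similarities` maps a pair of pocket/protein ids to a similarity score
    (hypothetical input format, not the format used by PSN_GPL).
    """
    graph = defaultdict(dict)
    for (a, b), score in similarities.items():
        if score >= threshold:
            graph[a][b] = score
            graph[b][a] = score
    return graph


def connected_component(graph, start):
    """Return all nodes reachable from `start` (depth-first traversal)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph.get(node, {}):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def score_go_terms(graph, annotations, target, component_weight=0.1):
    """Rank candidate GO terms for `target`.

    Assumed scoring rule (illustrative only): each direct neighbor votes for
    its GO terms with its edge similarity; every other member of the same
    connected component adds a fixed `component_weight`.
    """
    scores = defaultdict(float)
    neighbors = graph.get(target, {})
    for neighbor, similarity in neighbors.items():
        for term in annotations.get(neighbor, ()):
            scores[term] += similarity
    for node in connected_component(graph, target):
        if node == target or node in neighbors:
            continue
        for term in annotations.get(node, ()):
            scores[term] += component_weight
    # Highest score first; ties broken alphabetically by GO identifier.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data: labels and similarity values are placeholders, not results from the paper.
similarities = {("P1", "P2"): 0.9, ("P1", "P3"): 0.6,
                ("P2", "P4"): 0.8, ("P5", "P6"): 0.95}
annotations = {"P2": {"GO:0003824"},
               "P4": {"GO:0003824", "GO:0005515"},
               "P6": {"GO:0016787"}}

network = build_network(similarities, threshold=0.7)
for term, score in score_go_terms(network, annotations, "P1"):
    print(term, round(score, 3))
```

Raising the threshold removes weaker edges, so the connected components become smaller and fewer annotated proteins contribute candidate GO terms for the target. Lowering it has the opposite effect.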


* Source code: [PSN_GPL (Python and C++)|PSN/PSN_GPL.rar]
* Statistics: [STAG_GPL|PSN/STA_GPL.rar]
* Prediction: [PRE_GPL|PSN/PRE_GPL.rar]
* Predicted GO terms for unannotated proteins: [GO_GPL|PSN/GO_GPL.rar]

These files include introductions to the data formats, along with some notes.

Note: This version of the program is at a very preliminary stage and is provided for testing purposes only. The program is still under development. If you have any questions, please do not hesitate to contact us.

Category: [Supplementary] [Software]