TY - JOUR
T1 - Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data
AU - Li, H.
AU - Li, J.
AU - Tan, S. H.
AU - Ng, S. K.
PY - 2004
Y1 - 2004
N2 - Unravelling the underlying mechanisms of protein interactions requires knowledge about the interactions' binding sites. In this paper, we use a novel concept, binding motif pairs, to describe binding sites. A binding motif pair consists of two motifs each derived from one side of the binding protein sequences. The discovery is a directed approach that uses a combination of two data sources: 3-D structures of protein complexes and sequences of interacting proteins. We first extract maximal contact segment pairs from the protein complexes' structural data. We then use these segment pairs as templates to sub-group the interacting protein sequence dataset, and conduct an iterative refinement to derive significant binding motif pairs. This combination approach is efficient in handling large datasets of protein interactions. From a dataset of 78,390 protein interactions, we have discovered 896 significant binding motif pairs. The discovered motif pairs include many novel motif pairs as well as motifs that agree well with experimentally validated patterns in the literature.
UR - http://www.scopus.com/inward/record.url?scp=2442704465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=2442704465&partnerID=8YFLogxK
M3 - Article
C2 - 14992513
AN - SCOPUS:2442704465
SN - 2335-6936
SP - 312
EP - 323
JO - Pacific Symposium on Biocomputing
JF - Pacific Symposium on Biocomputing
ER -