Parametrized stochastic grammars for RNA secondary structure prediction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced-parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters.
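The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a discrete-time phase-type distribution (the run length is the number of steps until absorption), uses the common dot-bracket notation for secondary structure, and the names `phase_type_pmf`, `is_dyck_word`, `alpha`, and `T` are hypothetical. Here `alpha` is the initial distribution over transient states and `T` is the sub-stochastic transition matrix among them, so that P(N = n) = alpha · T^(n-1) · t with exit vector t = 1 - T·1.

```python
def phase_type_pmf(alpha, T, max_len):
    """Return [P(N=1), ..., P(N=max_len)] for a discrete phase-type law.

    alpha : initial probabilities over the transient states (assumed to sum to 1)
    T     : sub-stochastic matrix of transitions among transient states;
            the deficit 1 - sum(row) is the per-step absorption probability
    """
    m = len(alpha)
    exit_prob = [1.0 - sum(row) for row in T]
    state = list(alpha)  # mass still in each transient state before this step
    pmf = []
    for _ in range(max_len):
        # Probability of being absorbed on this step.
        pmf.append(sum(state[i] * exit_prob[i] for i in range(m)))
        # Propagate the surviving mass one step: state <- state * T.
        state = [sum(state[i] * T[i][j] for i in range(m)) for j in range(m)]
    return pmf


def is_dyck_word(structure):
    """Check that the parentheses in a dot-bracket string are balanced.

    Dots (unpaired bases) are ignored; only '(' and ')' are counted.
    """
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing base with no open partner
                return False
    return depth == 0


if __name__ == "__main__":
    # Hypothetical two-state chain for the length of an unpaired run.
    alpha = [1.0, 0.0]
    T = [[0.5, 0.3],
         [0.0, 0.6]]
    print(phase_type_pmf(alpha, T, 5))
    print(is_dyck_word("((..((...))..))"))
```

With a single transient state the distribution reduces to a geometric law, which is one way to see how a small chain keeps the number of loop-length parameters low.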

Original language: English (US)
Title of host publication: 2007 Information Theory and Applications Workshop, Conference Proceedings, ITA
Pages: 256-260
Number of pages: 5
State: Published - 2007
Event: 2007 Information Theory and Applications Workshop, ITA - San Diego, CA, United States
Duration: Jan 29 2007 to Feb 2 2007

Publication series

Name: 2007 Information Theory and Applications Workshop, Conference Proceedings, ITA

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
