LABEL SEQUENCE LEARNING BASED PROTEIN SECONDARY STRUCTURE PREDICTION USING HYDROPHOBICITY SCALES (Conference Paper)

Date

2012

Publisher

Springer Link

Abstract

Proteins are complex molecules, each composed of its own combination of twenty different amino acids. Protein secondary structure is the local arrangement that a polypeptide chain adopts among amino acids located next to one another in the linear sequence. Protein secondary structure prediction refers to predicting the conformational state of each amino acid residue of a protein sequence as one of three possible states, namely helix, strand, or coil, denoted H, E, and C, respectively. Because the amino acid sequence is the only information that survives the denaturing process, it is essential to determine the secondary structure from the protein sequence itself. Existing methodology uses only one hydrophobicity scale, the Kyte-Doolittle scale, whereas this paper uses three: the Kyte-Doolittle, Hopp-Woods, and Rose scales. The paper formulates secondary structure prediction as a sequence labeling task and introduces a new coding scheme with multiple windows to predict the secondary structure of proteins using hydrophobicity scales. Protein sequences, together with their physical and chemical properties, are used to train SVMhmm, and the learned model is then used to predict the secondary structure of an unknown primary sequence. An accuracy of 77.11% on the Q3 measure is reported when SVMhmm is used.
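The abstract does not spell out the coding scheme, so the following is only a minimal sketch of the general idea, not the authors' method. It encodes each residue by the Kyte-Doolittle hydropathy values in windows centred on it and computes Q3 as the percentage of residues whose predicted state (H, E, or C) matches the true state. The function names, the window sizes (3, 5, 7), and the zero padding at the sequence ends are assumptions made for illustration; the Hopp-Woods and Rose scales could be added in the same way as extra lookup tables.

```python
# Published Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, window=5, pad=0.0):
    """Return, for each residue, the hydropathy values in a centred window.

    Hypothetical helper: positions beyond the sequence ends are filled
    with `pad` (an assumption, not taken from the paper).
    Raises KeyError for non-standard residue codes.
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd")
    half = window // 2
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    padded = [pad] * half + values + [pad] * half
    return [padded[i:i + window] for i in range(len(sequence))]


def multi_window_features(sequence, windows=(3, 5, 7)):
    """Concatenate the features from several window sizes per residue.

    The window sizes are illustrative assumptions.
    """
    per_window = [window_features(sequence, w) for w in windows]
    return [
        sum((feats[i] for feats in per_window), [])
        for i in range(len(sequence))
    ]


def q3_accuracy(true_states, predicted_states):
    """Percentage of residues whose predicted state equals the true state."""
    if len(true_states) != len(predicted_states):
        raise ValueError("state sequences must have equal length")
    if not true_states:
        raise ValueError("state sequences must be non-empty")
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return 100.0 * correct / len(true_states)
```

In this sketch, `multi_window_features("ACDEF")` yields one feature vector of length 3 + 5 + 7 = 15 per residue, which is the kind of per-position input a sequence labeler such as SVMhmm consumes, and `q3_accuracy("HHEC", "HHCC")` evaluates to 75.0 because three of the four states agree.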

Keywords

Protein, Protein Secondary Structure, Sequence Labeling Problem, Hydrophobicity Scales
