Prediction of the O-Glycosylation with Secondary Structure Information by Support Vector Machines
Description
Mucin-type O-glycosylation is one of the main types of mammalian protein glycosylation. It is specific to serine (Ser) and threonine (Thr) residues, but no consensus sequence is known. In this report, support vector machines (SVMs) are used to predict O-glycosylation at each Ser or Thr site in a protein sequence. Twenty-nine mammalian protein sequences are selected from UniProt 8.0, and their structural information is obtained from the Protein Data Bank (PDB). A protein subsequence centered on the target Ser or Thr site is used as the input to the SVM: its amino acid sequence, together with the secondary structure or solvent accessibility calculated by DSSP from the PDB data, is encoded as the input vector. Preliminary experiments show that adding local structural information to the sequence information is effective.
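The window encoding described above can be illustrated with a minimal sketch. The report does not state the window width, the padding scheme, or the exact encoding, so the one-hot scheme, the `half_width` parameter, the `PAD` symbol, and the function names `one_hot`, `encode_window`, and `candidate_sites` are all assumptions made for illustration. The example sequence and structure string are made up.

```python
# Illustrative sketch only: the window size, padding symbol, and one-hot
# scheme are assumptions, not details taken from the report.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DSSP_STATES = "HBEGITS-"              # DSSP codes; '-' stands in for DSSP's blank
PAD = "X"                             # hypothetical symbol for off-sequence positions


def one_hot(symbol, alphabet):
    """Return a one-hot list over alphabet (all zeros if symbol is absent)."""
    return [1.0 if symbol == a else 0.0 for a in alphabet]


def candidate_sites(sequence):
    """Indices of every Ser (S) or Thr (T) residue, i.e. the prediction targets."""
    return [i for i, aa in enumerate(sequence) if aa in "ST"]


def encode_window(sequence, structure, center, half_width=3):
    """Encode the window centered on a Ser/Thr site as one feature vector.

    Each window position contributes a one-hot residue code followed by a
    one-hot secondary-structure code. Positions beyond the sequence ends
    are encoded as all zeros.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    if sequence[center] not in "ST":
        raise ValueError("window center must be a Ser or Thr residue")
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(sequence):
            aa, ss = sequence[pos], structure[pos]
        else:
            aa, ss = PAD, PAD  # PAD is in neither alphabet, so it encodes as zeros
        features.extend(one_hot(aa, AMINO_ACIDS))
        features.extend(one_hot(ss, DSSP_STATES))
    return features


# Made-up example data, not taken from UniProt or PDB.
sequence = "MKTAPSTLLG"
structure = "--HHHH-EE-"
sites = candidate_sites(sequence)
vectors = [encode_window(sequence, structure, i) for i in sites]
```

Each resulting vector has one entry per alphabet symbol per window position and could be passed, with a glycosylated or non-glycosylated label, to any SVM implementation. Solvent accessibility could be appended per position in the same way, for example as a single numeric feature.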