Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog

Document Type


Publication Date



Biological Sciences


The post-genomic era has seen a significant increase in the use of computational prediction methods to gain insight into the structure and function of proteins. Prediction tools guide experimental design by generating testable hypotheses about the structure and function of known proteins, but they are particularly useful for studying putative protein sequences with no known function. The genomic era produced a large number of sequences that are annotated either as hypothetical proteins or as proteins of unknown function. Current molecular biology techniques cannot efficiently study this vast reservoir of genetic information, whereas computer algorithms can process large amounts of sequence data to predict structure and function. These knowledge-based computational tools draw on available experimental data and are regularly updated to improve their predictive power. The simplest form of function prediction compares the query sequence against all available sequences using BLAST. If the query sequence is highly similar to previously characterized proteins, it is likely to have similar functions. If, however, the query sequence has no homolog of known function, more sophisticated computational tools are needed to gain insight into its structure and function. Various methods have been developed to search for known domains, motifs, patterns, or profiles. The quality of the predictions depends on the tools used and is limited by how closely the query sequence resembles known proteins.
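As a rough illustration of what a motif or pattern search involves, the sketch below translates a PROSITE-style pattern into a regular expression and scans a sequence for it. It is not the method used in the chapter. The function names are hypothetical, only a subset of the PROSITE syntax is handled (residue sets, exclusions, `x` wildcards, and repeat counts, with no anchors), and the example sequence is an invented toy string, not Msa. The two patterns shown are the widely documented P-loop and N-glycosylation motifs.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported elements (a subset of the full syntax):
      A        a single residue
      [AG]     any one of the listed residues
      {P}      any residue except those listed
      x        any residue
      e(n) or e(n,m)  repeat the element n (or n to m) times
    """
    tokens = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the element from an optional repeat count, e.g. x(4)
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            token = "."
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"
        else:
            token = core  # single residue or [..] set, already valid regex
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(token)
    return "".join(tokens)


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# P-loop (ATP/GTP-binding site motif A) pattern
P_LOOP = "[AG]-x(4)-G-K-[ST]"

# Invented toy sequence for demonstration only
toy_sequence = "MKTLGPPGSGKSTLA"

hits = find_motif(toy_sequence, P_LOOP)
print(hits)
```

A hit of this kind only suggests a candidate functional site. Short patterns match by chance in many unrelated sequences, so such results serve as hypotheses for experimental testing rather than as evidence of function.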

In this chapter, we describe and discuss the methods and tools we used to predict the structure and function of a putative protein sequence (Msa) of unknown function. We address the advantages and limitations of these approaches using the Msa protein from the human pathogen Staphylococcus aureus as a case study. Msa is a novel protein involved in the regulation of virulence. Because Msa has no functionally characterized homolog, computational tools are being used to predict its structure and mechanism of action. These predictions are used to design experiments to study Msa and to explore its use as a therapeutic target for combating antibiotic-resistant infections.

Publication Title

Computational Intelligence in Biomedicine and Bioinformatics

First Page


Last Page