Creating a Workflow For Expressed Sequence Tags Analysis

Document Type

Conference Proceeding

Publication Date



Computing Sciences and Computer Engineering


Expressed sequence tags (ESTs) are short sequence fragments of genes and may be used in genomic and genetic investigations. Despite the rapid expansion of EST generation, the resulting sequences are relatively low-quality fragments that need to be cleaned before they are assembled into larger sequences by identifying overlaps between sample sequences. Comparative analysis and functional assignment of the ESTs should then be performed to annotate and classify genes and to describe gene functions. In this study, we report the establishment of a workflow for the analysis and assembly of EST sequences into contigs and singlets, and the implementation of an EST database. High-quality assembled ESTs were annotated using BLASTX through our local BLAST server. We searched several databases, including the NCBI non-redundant protein databases. The BLAST results were automatically extracted and transferred into a relational database. We used well-annotated Gene Ontology (GO) information to characterize gene function annotation and to classify molecular function, biological processes, and cellular communication. Pathway analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) classification was used for pathway mapping. Enzyme Commission (EC) numbers were used to determine which sequences pertained to a specific pathway.
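As an illustration of the step in which BLAST results are extracted and transferred into a relational database, the following is a minimal sketch and not the authors' implementation. It assumes the hits are available in the 12-column tab-separated layout produced by BLAST+ `-outfmt 6`, and it uses SQLite as a stand-in database. The abstract specifies neither the output format nor the database system. The function name `load_blast_hits`, the table name `blast_hit`, and the column names are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per BLAST hit, following the default column
# order of BLAST+ tabular output (-outfmt 6). This format is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blast_hit (
    query_id   TEXT,    -- assembled EST (contig or singlet) identifier
    subject_id TEXT,    -- matched protein identifier
    pct_ident  REAL,
    aln_len    INTEGER,
    mismatches INTEGER,
    gap_opens  INTEGER,
    q_start    INTEGER,
    q_end      INTEGER,
    s_start    INTEGER,
    s_end      INTEGER,
    evalue     REAL,
    bit_score  REAL
)
"""

def load_blast_hits(conn, tabular_text):
    """Parse tab-separated BLAST hits and insert them; return the row count."""
    conn.execute(SCHEMA)
    rows = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        if len(f) != 12:
            raise ValueError(f"expected 12 columns, got {len(f)}: {line!r}")
        rows.append((
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        ))
    conn.executemany(
        "INSERT INTO blast_hit VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", rows
    )
    conn.commit()
    return len(rows)
```

Once the hits are in a table, later steps such as selecting the best hit per contig or joining against GO, KEGG, or EC annotations can be expressed as ordinary SQL queries.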

Publication Title

Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008

First Page


Last Page

