Gene annotation using the proteome

Description
Although the entire human genome has been sequenced, its functional elements remain poorly understood. The Encyclopedia of DNA Elements (ENCODE) project analyzed 1% of the human genome and found that the majority of the human genome is transcribed, including non-protein-coding regions. The hypothesis is that some of these "non-coding" sequences are translated into peptides and small proteins. Using mass spectrometry, numerous peptides derived from the ENCODE transcriptome were identified. Peptides and small proteins were also found from non-coding regions of that 1% of the human genome for which ENCODE did not find transcripts. A large portion of these peptides mapped to the intronic regions of known genes; it is therefore suspected that they may be undiscovered exons present in alternative spliceoforms of certain genes. Further studies confirmed the existence of polyadenylated RNAs coding for these peptides. Although their functional significance has not been determined, I anticipate the findings will lead to the discovery of new splice variants of known genes and possibly new transcriptional and translational mechanisms.
Date Created
2010
Agent