Articles Online (Volume 9, Issue 1)


Thirty Years of Multiple Sequence Codes

Edward N. Trifonov

An overview is presented of the status of studies on multiple codes in genetic sequences. Indirectly, the existence of multiple codes is recognized in the form of several rediscoveries of a Second Genetic Code that is different each time. Due credit is given to earlier seminal work related to the codes, which is often neglected in the literature. The latest developments in the field of the chromatin code are discussed, as well as the prospects for single-base resolution studies of nucleosome positioning, including the rotational setting of DNA on the surface of the histone octamers.

Page 1–6

Review Article

On the Observable Transition to Living Matter

Samanta Pino, Edward N. Trifonov, Ernesto Di Mauro

Amid recent developments in chemistry and genetic engineering, the humble researcher dealing with the origin of life finds herself or himself in a grey area, tackling something that does not yet even have a clear, agreed-upon definition. A series of chemical steps is described that may be considered the life-nonlife transition, if one adheres to the minimalistic definition: life is self-reproduction with variations. The fully artificial RNA system chosen for the exploration corresponds sequence-wise to the reconstructed initial triplet repeats, presumably corresponding to the earliest protein-coding molecules. The demonstrated occurrence of mismatches (variations) in otherwise complementary syntheses (“self-reproduction”) in this RNA system opens an experimental and conceptual perspective for exploring the origin of life (and its definition) on the apparent edge of the origin.

Page 7–14

Review Article

Role of Stable Isotopes in Life—Testing Isotopic Resonance Hypothesis

Roman A. Zubarev

Stable isotopes of the most important biological elements, such as C, H, N and O, affect living organisms. In rapidly growing species, deuterium and, to a lesser extent, other heavy isotopes reduce the growth rate. At least for deuterium, it is known that its depletion also slows biological processes. As a rule, living organisms “resist” changes in their isotopic environment, preferring natural isotopic abundances. This preference could be due to evolutionary optimization; an additional effect could be due to the presence of the “isotopic resonance”. The isotopic resonance phenomenon has been linked to the choice of the earliest amino acids, and thus to the evolution of the genetic code. To test the isotopic resonance hypothesis, literature data were analyzed against quantitative and qualitative predictions of the hypothesis. Four studies provided five independent datasets, each in very good quantitative agreement with the predictions. Thus, the isotopic resonance hypothesis is no longer simply plausible; it can now be deemed likely. Additional testing is needed, however, before full acceptance of this hypothesis.

Page 15–20

Review Article

On the Organizational Dynamics of the Genetic Code

Zhang Zhang, Jun Yu

The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devise an algebraic representation of the canonical genetic code. Within the framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with the code's intrinsic structure and classification schemes as well as with amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes are closely correlated with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is highly useful for deciphering hitherto unknown regularities of the code as well as the dynamics of nucleotide, codon, and amino acid compositions.

Page 21–29
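
The two quantities the abstract above relates, GC content and codon usage, can be illustrated with a minimal Python sketch. It is not the authors' algebraic representation of the code; the function names `gc_content` and `codon_usage` and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    return sum(b in "GC" for b in counted) / len(counted)


def codon_usage(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence.

    A trailing partial codon and codons containing ambiguous bases
    are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if set(c) <= set("ACGT"))


# Toy coding sequence, made up for this example (not from the 917 genomes).
toy_cds = "ATGGCCGCGTTAAGCTAA"
usage = codon_usage(toy_cds)
toy_gc = gc_content(toy_cds)
```

Computing both values per genome and plotting codon frequencies against GC content is one plausible way to reproduce the kind of correlation the abstract describes.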

Review Article

EST-Based Identification of Genes Expressed in Skeletal Muscle of the Mandarin Fish (Siniperca chuatsi)

Feng Ding, Wuying Chu, Peng Cui, Meng Tao, Ruixue Zhou, Falan Zhao, Songnian Hu, Jianshe Zhang

To enrich the genomic information of this commercially important fish species, we obtained 5,063 high-quality expressed sequence tags (ESTs) from the muscle cDNA database of the mandarin fish (Siniperca chuatsi). Clustering analysis yielded 1,625 unique sequences including 443 contigs (from 3,881 EST sequences) and 1,182 singletons. BLASTX searches showed that 959 unique sequences shared homology with proteins in the NCBI non-redundant database. A total of 740 unique sequences were functionally annotated using Gene Ontology. The 1,625 unique sequences were assigned to Kyoto Encyclopedia of Genes and Genomes reference pathways, and the results indicated that transcripts participating in nucleotide metabolism and amino acid metabolism are relatively abundant in S. chuatsi. In addition, we identified 15 genes that are abundantly expressed in the muscle of the mandarin fish. These genes are involved in muscle structural formation and in the regulation of muscle differentiation and development. The most remarkable gene in S. chuatsi is nucleoside diphosphate kinase B, which is represented by 449 EST sequences accounting for 8.86% of the total EST sequences. Our work provides a profile of the transcripts expressed in the white muscle of the mandarin fish, laying a foundation for a better understanding of fish genomics.

Page 30–36

Review Article

Computational Analysis of Drought Stress-Associated miRNAs and miRNA Co-Regulation Network in Physcomitrella patens

Ping Wan, Jun Wu, Yuan Zhou, Junshu Xiao, Jie Feng, Weizhong Zhao, Shen Xiang, Guanglong Jiang, Jake Y. Chen

miRNAs are small non-coding RNAs that are involved in diverse biological processes. Until now, little has been known about their roles in plant drought resistance. Physcomitrella patens is highly tolerant to drought; however, the basic biology of the traits that give P. patens this important characteristic is not clear. In this work, we discovered 16 drought stress-associated miRNA (DsAmR) families in P. patens through computational analysis. Because of possible discrepancies in expression periods and tissue distributions between potential DsAmRs and their target genes, and the existence of false positives in computational identification, the predictions should be examined by further experimental validation. We also constructed an miRNA co-regulation network and identified two network hubs, miR902a-5p and miR414, which may play important roles in regulating drought-resistance traits. We distributed our results through an online database named ppt-miRBase. Our methods for finding DsAmRs and constructing the miRNA co-regulation network point to a new direction for identifying miRNA functions.

Page 37–44
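
The idea of a co-regulation network with hubs, as mentioned in the abstract above, can be sketched in a few lines of Python. The abstract does not state how edges are defined, so this sketch assumes that two miRNAs are linked when they share at least one predicted target gene and that hubs are the nodes of highest degree. The miRNA and gene names are placeholders, not data from the study.

```python
from itertools import combinations


def co_regulation_edges(targets):
    """Link two miRNAs when their predicted target sets overlap (assumed rule)."""
    edges = set()
    for a, b in combinations(sorted(targets), 2):
        if targets[a] & targets[b]:
            edges.add((a, b))
    return edges


def degrees(targets, edges):
    """Number of co-regulation partners for every miRNA, including isolated ones."""
    deg = {mirna: 0 for mirna in targets}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Placeholder miRNA -> predicted target genes mapping (hypothetical toy data).
toy_targets = {
    "miR-A": {"gene1", "gene2", "gene3"},
    "miR-B": {"gene1"},
    "miR-C": {"gene2"},
    "miR-D": {"gene3", "gene4"},
    "miR-E": {"gene5"},
}

edges = co_regulation_edges(toy_targets)
deg = degrees(toy_targets, edges)
# Rank by degree (ties broken alphabetically); the top entry is the candidate hub.
ranked = sorted(deg, key=lambda m: (-deg[m], m))
```

In this toy network `miR-A` shares targets with three other miRNAs and would be the candidate hub; a real analysis would likely also weigh edge strength and prediction confidence.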


Identification of Protein-Coding Regions in DNA Sequences Using A Time-Frequency Filtering Approach

Sitanshu Sekhar Sahu, Ganapati Panda

Accurate identification of protein-coding regions (exons) in DNA sequences has been a challenging task in bioinformatics. In particular, coding regions exhibit a 3-base periodicity, which forms the basis of all exon identification methods. Many signal processing tools and techniques have been applied successfully to the identification task, but further improvement is still needed. In this paper, we introduce a promising new model-independent time-frequency filtering technique based on the S-transform for accurate identification of the coding regions. The S-transform is a powerful linear time-frequency representation that is useful for filtering in the time-frequency domain. The potential of the proposed technique has been assessed through a simulation study, and the results obtained have been compared with those of existing methods using standard datasets. The comparative study demonstrates that the proposed method outperforms its counterparts in identifying the coding regions.

Page 45–55
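
The 3-base periodicity mentioned in the abstract above can be illustrated with a short Python sketch. Note that this is a plain sliding-window Fourier measure at frequency 1/3 over the four base-indicator sequences, not the S-transform filter proposed in the paper. The function names, window size, step and normalization are assumptions chosen for illustration.

```python
import cmath


def period3_power(window: str) -> float:
    """Spectral power at frequency 1/3, summed over the A, C, G, T indicators.

    Each base is treated as a 0/1 indicator sequence and the Fourier
    coefficient at a period of three bases is evaluated directly.
    Dividing by the window length is a normalization choice.
    """
    window = window.upper()
    n = len(window)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        coeff = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, b in enumerate(window)
            if b == base
        )
        total += abs(coeff) ** 2
    return total / n


def period3_profile(seq: str, window: int = 351, step: int = 3):
    """(start, power) pairs for windows slid along seq."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Synthetic sequences: one with a perfect period of 3, one with a period of 4.
strong = period3_power("ATG" * 50)
weak = period3_power("ACGT" * 36)
profile = period3_profile("ATG" * 50, window=30, step=3)
```

Peaks in such a profile suggest candidate coding regions; the fixed window trades positional resolution against spectral resolution, a limitation that time-frequency approaches aim to address.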

Application Note

TAAPP: Tiling Array Analysis Pipeline for Prokaryotes

Ranjit Kumar, Shane C. Burgess, Mark L. Lawrence, Bindu Nanduri

High-density tiling arrays provide a closer view of transcription than regular microarrays and can also be used for annotating functional elements in genomes. The identified transcripts usually have a complex overlapping architecture when compared to the existing genome annotation. Therefore, there is a need for customized tiling array data analysis tools. Since most of the initial tiling array studies were conducted in eukaryotes, existing data analysis methods are well suited for eukaryotic genomes. To use whole-genome tiling arrays to identify previously unknown transcriptional elements such as small RNA and antisense RNA in prokaryotes, existing data analysis tools need to be tailored to prokaryotic genome architecture. Furthermore, automation of such a custom data analysis workflow is necessary for biologists to apply this powerful platform for knowledge discovery. Here we describe TAAPP, a web-based package that consists of two modules for prokaryotic tiling array data analysis. The transcript generation module works on normalized data to generate transcriptionally active regions (TARs). The feature extraction and annotation module then maps TARs to the existing genome annotation. This module further categorizes the transcription profile into potential novel non-coding RNA, antisense RNA, gene expression and operon structures. The implemented workflow is microarray platform independent and is presented as a web-based service. The web interface is freely available for academic use.

Page 56–62
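
The transcript generation step described in the TAAPP abstract above, which turns normalized probe signals into transcriptionally active regions (TARs), can be sketched as follows. The abstract does not specify the algorithm, so this is a generic threshold-and-merge sketch rather than TAAPP's actual method. The function `call_tars` and its parameters `threshold`, `max_gap` and `min_probes` are hypothetical.

```python
def call_tars(probes, threshold, max_gap=0, min_probes=2):
    """Merge above-threshold probes into candidate TARs on one strand.

    probes: (start, end, signal) tuples sorted by start coordinate.
    A probe at or above threshold joins the current region when its start
    lies within max_gap bases of the region's end; below-threshold probes
    are ignored, so merging across them depends only on genomic distance.
    Regions supported by fewer than min_probes probes are discarded.
    Returns (start, end, n_probes) tuples.
    """
    tars = []
    current = None  # [start, end, n_probes]
    for start, end, signal in probes:
        if signal < threshold:
            continue
        if current is not None and start - current[1] <= max_gap:
            current[1] = max(current[1], end)
            current[2] += 1
        else:
            if current is not None and current[2] >= min_probes:
                tars.append(tuple(current))
            current = [start, end, 1]
    if current is not None and current[2] >= min_probes:
        tars.append(tuple(current))
    return tars


# Hypothetical normalized signals for 25-base probes (toy data).
toy_probes = [
    (0, 25, 5.0), (25, 50, 6.0), (50, 75, 1.0), (75, 100, 7.0),
    (100, 125, 8.0), (125, 150, 9.0), (150, 175, 0.5), (200, 225, 6.5),
]

strict = call_tars(toy_probes, threshold=4.0)
relaxed = call_tars(toy_probes, threshold=4.0, max_gap=25)
```

With `max_gap=0` the single weak probe at 50 to 75 splits the signal into two regions, while `max_gap=25` bridges it into one. A downstream annotation step would then compare such regions with known genes to flag candidate antisense or non-coding transcripts.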