Articles Online (Volume 11, Issue 3)


Gene Regulatory Networks in the Genomics Era

Matthew Loose, Roger Patient, Xiangdong Fang, Hongxing Lei

Page 133–134


A Brief Review on the Human Encyclopedia of DNA Elements (ENCODE) Project

Hongzhu Qu, Xiangdong Fang

The ENCyclopedia Of DNA Elements (ENCODE) project is an international research consortium that aims to identify all functional elements in the human genome sequence. The second phase of the project comprised 1640 datasets from 147 different cell types, yielding a set of 30 publications across several journals. These data revealed that 80.4% of the human genome displays some functionality in at least one cell type. Many of these regulatory elements are physically associated with one another and further form a network or three-dimensional conformation to affect gene expression. These elements are also related to sequence variants associated with diseases or traits. All these findings provide us with new insights into the organization and regulation of genes and the genome, and serve as an expansive resource for understanding human health and disease.

Page 135–141


Computational Identification of Active Enhancers in Model Organisms

Chengqi Wang, Michael Q. Zhang, Zhihua Zhang

As a class of cis-regulatory elements, enhancers were first identified nearly 30 years ago as genomic regions that can markedly increase the transcription of genes. Enhancers can regulate gene expression in a cell-type-specific and developmental-stage-specific manner. Although experimental technologies have been developed to identify enhancers genome-wide, the design principles of these regulatory elements and the way they rewire the transcriptional regulatory network in space and time are far from clear. At present, developing predictive methods for enhancers, particularly for the cell-type-specific activity of enhancers, is central to computational biology. In this review, we survey the current computational approaches for active enhancer prediction and discuss future directions.

Page 142–150


Mediator Complex Dependent Regulation of Cardiac Development and Disease

Chad E. Grueter

Cardiovascular disease (CVD) is a leading cause of morbidity and mortality. The risk factors for CVD include environmental and genetic components. Human mutations in genes involved in most aspects of cardiovascular function have been identified, many of which are involved in transcriptional regulation. The Mediator complex serves as a pivotal transcriptional regulator that functions to integrate diverse cellular signals by multiple mechanisms including recruiting RNA polymerase II, chromatin modifying proteins and non-coding RNAs to promoters in a context dependent manner. This review discusses components of the Mediator complex and the contribution of the Mediator complex to normal and pathological cardiac development and function. Enhanced understanding of the role of this core transcriptional regulatory complex in the heart will help us gain further insights into CVD.

Page 151–157


Biomarker Profiling for Lupus Nephritis

Yajuan Li, Xiangdong Fang, Quan-Zhen Li

Lupus nephritis (LN) is one of the most severe manifestations of systemic lupus erythematosus (SLE) and is associated with significant morbidity and mortality in SLE patients. The pathogenesis of LN involves multiple factors, including genetic predisposition, epigenetic regulation and environmental interaction. Over the last decade, omics-based techniques have been extensively utilized for biomarker screening, and a wide variety of variations associated with SLE and LN have been identified at the genomic, transcriptomic and proteomic levels. These studies and discoveries have expanded our understanding of the molecular basis of the disease and are important for identification of potential therapeutic targets for disease prediction and early treatment. In this review, we summarize some of the recent studies targeted at the identification of LN-associated biomarkers using genomic and proteomic approaches.

Page 158–165

Original Research

iBIG: An Integrative Network Tool for Supporting Human Disease Mechanism Studies

Jiya Sun, Yuyun Pan, Xuemei Feng, Huijuan Zhang, Yong Duan, Hongxing Lei

Understanding the mechanism of complex human diseases is a major scientific challenge. Towards this end, we developed a web-based network tool named iBIG (stands for integrative BIoloGy), which incorporates a variety of information on gene interaction and regulation. The generated network can be annotated with various types of information and visualized directly online. In addition to the gene networks based on physical and pathway interactions, networks at a functional level can also be constructed. Furthermore, a supplementary R package is provided to process microarray data and generate a list of important genes to be used as input for iBIG. To demonstrate its usefulness, we collected 54 microarrays on common human diseases including cancer, neurological disorders, infectious diseases and other common diseases. We processed the microarray data with our R package and constructed a network of functional modules perturbed in common human diseases. Networks at the functional level in combination with gene networks may provide new insight into the mechanism of human diseases. iBIG is freely available at

Page 166–171

Original Research

Identification of ta-siRNAs and Cis-nat-siRNAs in Cassava and Their Roles in Response to Cassava Bacterial Blight

Andrés Quintero, Alvaro L. Pérez-Quintero, Camilo López

Trans-acting small interfering RNAs (ta-siRNAs) and natural cis-antisense siRNAs (cis-nat-siRNAs) are recently discovered small RNAs (sRNAs) involved in post-transcriptional gene silencing. ta-siRNAs are transcribed from genomic loci and require processing by microRNAs (miRNAs). cis-nat-siRNAs are derived from antisense RNAs produced by the simultaneous transcription of overlapping antisense genes. Their roles in many plant processes, including pathogen response, are mostly unknown. In this work, we employed a bioinformatic approach to identify ta-siRNAs and cis-nat-siRNAs in cassava from two sRNA libraries, one constructed from healthy cassava plants and one from plants inoculated with the bacterium Xanthomonas axonopodis pv. manihotis (Xam). A total of 54 possible ta-siRNA loci were identified in cassava, including a homolog of TAS3, the best studied plant ta-siRNA. Fifteen of these loci were induced, while 39 were repressed in response to Xam infection. In addition, 15 possible cis-natural antisense transcript (cis-NAT) loci producing siRNAs were identified from overlapping antisense regions in the genome, and were found to be differentially expressed upon Xam infection. Roles of sRNAs were predicted by sequence complementarity and our results showed that many sRNAs identified in this work might be directed against various transcription factors. This work represents a significant step toward understanding the roles of sRNAs in the immune response of cassava.

Page 172–181

Original Research

Candidate Biomarker Discovery for Angiogenesis by Automatic Integration of Orbitrap MS1 Spectral- and X!Tandem MS2 Sequencing Information

Mark K. Titulaer

Candidate protein biomarker discovery by fully automatic integration of Orbitrap full MS1 spectral peptide profiling and X!Tandem MS2 peptide sequencing is investigated by analyzing mass spectra from brain tumor samples using Peptrix. Potential protein candidate biomarkers found for angiogenesis are compared with those previously reported in the literature and those obtained from previous Fourier transform ion cyclotron resonance (FT-ICR) peptide profiling. The lower mass accuracy of peptide masses measured by Orbitrap compared to those measured by FT-ICR is compensated by the larger number of detected masses separated by liquid chromatography (LC), which can be directly linked to protein identifications. The number of peptide sequences divided by the number of unique sequences is 9248/6911 ≈ 1.3. Peptide sequences therefore appear 1.3 times redundant per up-regulated protein on average in the peptide profile matrix, and do not always appear up-regulated because of tailing in LC retention time (40%), modifications (40%) and mass determination errors (20%). Significantly up-regulated proteins found by integration of X!Tandem are described in the literature as tumor markers and some are linked to angiogenesis. New potential biomarkers are found, but need to be validated independently. Eventually more proteins could be found by actively involving MS2 sequence information in the creation of the MS1 peptide profile matrix.

Page 182–194
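
The redundancy figure quoted in the abstract above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two counts come from the abstract, while the function name `redundancy` is a hypothetical helper and not part of Peptrix or X!Tandem.

```python
# Counts quoted in the abstract.
TOTAL_PEPTIDE_SEQUENCES = 9248
UNIQUE_PEPTIDE_SEQUENCES = 6911


def redundancy(total: int, unique: int) -> float:
    """Average number of times each unique sequence occurs (hypothetical helper)."""
    if unique <= 0:
        raise ValueError("unique must be a positive count")
    return total / unique


ratio = redundancy(TOTAL_PEPTIDE_SEQUENCES, UNIQUE_PEPTIDE_SEQUENCES)
print(f"{ratio:.2f}")  # 1.34, which the abstract rounds to 1.3
```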

Application Note

Identification of Candidate Transcription Factor Binding Sites in the Cattle Genome

Derek M. Bickhart, George E. Liu

A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data is necessary, as predicted sites may serve as excellent starting locations for future omics studies to develop transcriptional regulation hypotheses. In order to generate this resource, we employed a phylogenetic footprinting approach—using sequence conservation across cattle, human and dog—and position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Due to our stringent filtering criteria, these results represent high quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at or

Page 195–198
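
The abstract above mentions scanning with position-specific scoring matrices. As a rough illustration of that general technique, the sketch below scores every window of a sequence against a log-odds matrix. The count matrix, pseudocount, uniform background, threshold and test sequence are all invented for illustration and are not taken from the paper, and the sequence-conservation (phylogenetic footprinting) filter is not implemented here.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
# These counts are made up for illustration only.
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base frequency
PSEUDOCOUNT = 1.0   # assumed; keeps log2 finite for unseen bases


def build_pssm(counts):
    """Convert per-position base counts into log2 odds scores."""
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        pssm.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, window, score) for each window scoring >= threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PSSM = build_pssm(COUNTS)
hits = scan("TTACGTGGACGTA", PSSM, threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

In a real pipeline the threshold would normally be calibrated per matrix, and hits would then be filtered by cross-species conservation as the abstract describes.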