Articles Online (Volume 1, Issue 1)

Editorial

"Three Kingdoms" to Romance

Jun Yu, Jian Wang, Fuchu He, Huan-Ming Yang

For historical reasons, our new journal is named "Genomics, Proteomics & Bioinformatics", or, as we have nicknamed it for short, the Journal of GPB. A growing number of "-omes" and "-omics" have appeared in many diverse fields of biology, especially in recent years under the profound influence of the Human Genome Project and many other genome projects completed or in progress. We almost renamed this journal "Ever-more-omics" to include all the newcomers. However, on second thought, we decided to entertain these "Three Kingdoms" first while keeping an eye on the others.

Page 1


Review Article

Advances in the Study of SR Protein Family

Xiaoyun Ma, Fuchu He

SR proteins take their name from their characteristic RS domain, which is rich in serine (Ser, S) and arginine (Arg, R). They are evolutionarily conserved. To date, 10 members of the SR protein family have been identified in humans. In addition to the RS domain, SR proteins contain one or two RNA-binding motifs and possess distinctive biochemical and immunological features. Functionally, SR proteins recruit spliceosome components through protein-protein interactions to promote assembly of the early spliceosome. In alternative splicing, the tissue-specific expression of SR proteins and the relative ratio of SR proteins to heterogeneous nuclear ribonucleoproteins (hnRNPs) constitute the two main regulatory mechanisms. Almost all of these biochemical functions are regulated by reversible phosphorylation.

Page 2-8


Review Article

Are Gene Expression Microarray Analyses Reliable? A Review of Studies of Retinoic Acid Responsive Genes

Peter J. van der Sperk, Andreas Kremer, Lynn Murry, Michael G. Walker

Microarray analyses of gene expression are widely used, but reports of the same analyses by different groups give widely divergent results, raising questions about reproducibility and reliability. We take as an example recently published reports on microarray experiments designed to identify retinoic acid responsive genes. These reports show substantial differences in their results. In this article, we review the methodology, results, and potential causes of differences in these applications of microarrays. Finally, we suggest practices to improve the reliability and reproducibility of microarray experiments.

Page 9-14


Review Article

Azolla - A Model Organism for Plant Genomic Studies

Yin-Long Qiu, Jun Yu

The aquatic ferns of the genus Azolla are nitrogen-fixing plants that have great potential in agricultural production and environmental conservation. In many respects, Azolla is qualified to serve as a model organism for genomic studies because of its importance in agriculture, its unique position in plant evolution, its symbiotic relationship with the N2-fixing cyanobacterium Anabaena azollae, and its moderate-sized genome. The goals of this genome project are not only to understand the biology of the Azolla genome and promote its applications in biological research and agricultural practice, but also to gain critical insights into the evolution of plant genomes. With strategic and technical improvements in DNA sequencing, as well as cost reductions, the deciphering of its genetic code is imminent.

Page 15-25


Research Article

Gene Identification and Expression Analysis of 86,136 Expressed Sequence Tags (EST) from the Rice Genome

Yan Zhou, Jiabin Tang, Michael G. Walker, Xiuqing Zhang, Songnian Hu, Huayong Xu, Yajun Deng, Jianhai Dong, Lin Ye, Jun Li, Xuegang Wang, Hao Xu, Yibin Pan, Wei Lin, Wei Tian, Jing Liu, Liping Wei, Siqi Liu, Huan-Ming Yang, Jun Yu, Jian Wang

Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 were novel. These sequences were classified by molecular function, biological process and pathway according to the Gene Ontology. We compared our sequenced ESTs with the 95,000 publicly available ESTs from japonica and found little sequence variation, despite the large difference between the genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression patterns in different tissues, developmental stages, and in a conditional sterile mutant, after checking by means of sequence coverage that the libraries were comparable. We also identified some possible library-specific genes and a number of enzymes and transcription factors that contribute to rice development.

Page 26-42
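
As a quick consistency check on the counts quoted in this abstract, the short Python sketch below adds them up. The variable names are illustrative and do not come from the article; only the numbers are taken from the abstract.

```python
# Counts quoted in the abstract (variable names are illustrative).
contigs = 13_232
singletons = 8_976
similar_to_genbank = 7_497
novel = 14_711

# Contigs plus singletons give the number of unique assembled sequences.
unique_sequences = contigs + singletons

# The similar/novel split should partition the same set of sequences.
assert unique_sequences == similar_to_genbank + novel

novel_fraction = novel / unique_sequences
print(unique_sequences)          # 22208
print(f"{novel_fraction:.1%}")   # 66.2%
```

The two totals agree, so roughly two thirds of the assembled sequences had no similar entry in GenBank at the time.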


Research Article

A Statistical Approach Designed for Finding Mathematically Defined Repeats in Shotgun Data and Determining the Length Distribution of Clone-Inserts

Lan Zhong, Kunlin Zhang, Xianggang Huang, Peixiang Ni, Yujun Han, Kai Wang, Jun Wang, Songgang Li

The large number of repeats, especially high-copy repeats, in the genomes of higher animals and plants makes whole genome assembly (WGA) quite difficult. To address this problem, we tried to identify repeats and mask them prior to assembly, even at the genome survey stage. Repeats of different copy numbers have different probabilities of appearing in shotgun data. Based on this principle, we constructed a statistical model and inferred criteria for mathematically defined repeats (MDRs) at different shotgun coverages. According to these criteria, we developed the software MDRmasker to identify and mask MDRs in shotgun data. With repeats masked prior to assembly, assembly was faster and had a lower error probability. In addition, clone-insert size affects the accuracy of repeat assembly and scaffold construction, so we also used our model to design the length distribution of clone-inserts. In our simulated human and rice genomes, the length distributions of repeats differ, so the optimal length distributions of clone-inserts were not the same. Thus, with an optimal length distribution of clone-inserts, a given genome could be assembled better at lower coverage.

Page 43-51
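
The principle stated in this abstract, that repeats of different copy number appear in shotgun data with different probabilities, can be illustrated with a small Python sketch. The abstract does not specify the statistical model, so the Poisson assumption below (read depth of an n-copy sequence at coverage c is Poisson with mean n*c) is ours, as are the function names and the significance level `alpha`. It is not the MDRmasker algorithm.

```python
import math

def poisson_sf(threshold: int, mean: float) -> float:
    """Return P(X >= threshold) for X ~ Poisson(mean)."""
    below = sum(
        math.exp(-mean) * mean**i / math.factorial(i)
        for i in range(threshold)
    )
    return max(0.0, 1.0 - below)

def repeat_threshold(coverage: float, alpha: float = 0.01) -> int:
    """Smallest depth t that a single-copy sequence reaches with probability < alpha.

    Assumption (not from the article): the depth of a single-copy
    sequence is Poisson-distributed with mean equal to the coverage.
    """
    t = 1
    while poisson_sf(t, coverage) >= alpha:
        t += 1
    return t

# Depth criterion at 1x coverage under the assumed model.
t = repeat_threshold(1.0)

# Probability that a hypothetical 10-copy repeat reaches that depth at 1x.
detect_prob = poisson_sf(t, 10 * 1.0)
print(t, round(detect_prob, 3))
```

Under this assumed model the depth criterion rises with coverage, which is consistent with the abstract's statement that the criteria are inferred separately for each shotgun coverage.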


Letter

Gene Expression Versus Sequence for Predicting Function: Glia Maturation Factor Gamma Is Not A Glia Maturation Factor

Michael G. Walker

It is standard practice, whenever a researcher finds a new gene, to search databases for genes that have a similar sequence. It is not standard practice to search for genes that have similar expression (co-expression). Failure to perform co-expression searches has led to incorrect conclusions about the likely function of new genes, and to wasted laboratory attempts to confirm incorrectly predicted functions. We present here the example of Glia Maturation Factor gamma (GMF-gamma). Despite its name, it has not been shown to participate in glia maturation. It is a gene of unknown function that is similar in sequence to GMF-beta. The sequence homology and chromosomal location led to an unsuccessful search for GMF-gamma mutations in glioma. We examined GMF-gamma expression in 1432 human cDNA libraries. The highest expression occurs in phagocytic, antigen-presenting and other hematopoietic cells. We found GMF-gamma mRNA in almost every tissue examined, with expression in nervous tissue no higher than in any other tissue. Our evidence indicates that GMF-gamma participates in phagocytosis in antigen-presenting cells. Searches for genes with similar sequences should be supplemented with searches for genes with similar expression to avoid incorrect predictions.

Page 52-57


Letter

Proteomic Comparison of Two-Dimensional Gel Electrophoresis Profiles from Human Lung Squamous Carcinoma and Normal Bronchial Epithelial Tissues

Cui Li, Xianquan Zhan, Maoyu Li, Feng Li, JianLing Li, Zhiqiang Xiao, Zhuchu Chen, Xueping Feng, Ping Chen, Jingyun Xie, Songping Liang

Differential proteome profiles of human lung squamous carcinoma tissue compared to paired tumor-adjacent normal bronchial epithelial tissue were established and analyzed by means of immobilized pH gradient-based two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS). Well-resolved, reproducible 2-DE patterns of human lung squamous carcinoma and adjacent normal bronchial epithelial tissues were obtained with a 0.75-ug protein load. The average deviation of spot position was 0.733±0.101 mm in the IEF direction and 0.925±0.207 mm in the SDS-PAGE direction. For tumor tissue, a total of 1241±88 spots were detected, and 987±65 spots were matched, with an average matching rate of 79.5%. For controls, a total of 1190±72 spots were detected, and 875±48 spots were matched, with an average matching rate of 73.5%. A total of 864±34 spots were matched between tumors and controls. Forty-three differential proteins were characterized: some were related to oncogenes, and others were involved in the regulation of the cell cycle and signal transduction. These results suggest that the differential proteomic approach is valuable for large-scale identification of differentially expressed proteins involved in lung carcinogenesis. These data will be used to establish a human lung cancer proteome database for further study of human lung squamous carcinoma.

Page 58-67
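
The matching rates quoted in this abstract follow from the mean spot counts, as the Python sketch below shows. The function and variable names are illustrative and not from the article, and the calculation assumes that the rate is simply matched spots divided by detected spots, using the mean values.

```python
def matching_rate(matched: int, detected: int) -> float:
    """Percentage of detected spots that were matched across gels."""
    return 100.0 * matched / detected

# Mean spot counts quoted in the abstract (names are illustrative).
tumor_rate = matching_rate(matched=987, detected=1241)
control_rate = matching_rate(matched=875, detected=1190)

print(f"{tumor_rate:.1f}")    # 79.5
print(f"{control_rate:.1f}")  # 73.5
```

Both values reproduce the percentages reported for tumor and control gels.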


Letter

Finding Signals for Plant Promoters

Weimou Zheng

The strongest signal in plant promoters is searched for using a single-motif model with two types. The dominant type turns out to be the TATA-box. The other type may be called the TATA-less signal and may be used in gene finders for promoter recognition. While the TATA signals are very similar in monocots and dicots, their TATA-less signals are significantly different. A general and flexible multi-motif model based on dynamic programming is also proposed for promoter analysis. By extending the Gibbs sampler to dynamic programming and introducing a temperature, an efficient algorithm is developed for searching for signals in plant promoters.

Page 68-73


Method

A Real-Time and Dynamic Biological Information Retrieval and Analysis System (BIRAS)

Qi Zhou, Hong Zhang, Meiying Geng, Chenggang Zhang

The aim of this study is to design an Internet-based biological information retrieval and analysis system (BIRAS). Using a specific network protocol, BIRAS can send information to and receive information from the Entrez search and retrieval system maintained by the National Center for Biotechnology Information (NCBI) in the USA. Literature, nucleotide sequences, protein sequences, and other resources matching a user-defined term can then be retrieved and sent to the user automatically by pop-up message or e-mail. All information retrieval and analysis is done in real time. As a robust system for intelligently and dynamically retrieving and analyzing user-defined information, BIRAS is expected to be used extensively to retrieve specific information from the large number of biological databases available today. The program is available on request from the corresponding author.

Page 74-77


Method

A Novel Approach for Identifying the Heme-Binding Proteins from Mouse Tissues

Xiaolei Li, Xiaoshan Wang, Kang Zhao, Zhengfeng Zhou, Caifeng Zhao, Ren Yan, Liang Lin, Tingting Lei, Jianling Yin, Rong Wang, Zhongsheng Sun, Zuyuan Xu, Jingyue Bao, Xiuqing Zhang, Xiaoli Feng, Siqi Liu

Heme is a key cofactor in aerobic life, in both eukaryotes and prokaryotes. Because of the high reactivity of ferrous protoporphyrin IX, the reactions of heme in cells are often carried out through heme-protein complexes. Traditionally, heme-binding proteins have been studied on a case-by-case basis, so there is only a limited global view of their distribution in different cells or tissues. The procedure described here is aimed at profiling heme-binding proteins in mouse tissues sequentially by 1) purification of heme-binding proteins by heme-agarose, an affinity chromatographic resin; 2) isolation of heme-binding proteins by SDS-PAGE or two-dimensional electrophoresis; 3) identification of heme-binding proteins by mass spectrometry. In five mouse tissues, over 600 protein spots were visualized on 2DE gels stained with Coomassie blue, and 154 proteins were identified by MALDI-TOF, most of which are heme-related. This methodology makes it possible to globally characterize the heme-binding proteins in a biological system.

Page 78-86