We use computational methods to learn about the structure, function, and evolution of genes, proteins, and entire genomes. One goal of our research is to develop a framework for the analysis of genome-wide numeric data. Many properties of genes can now be measured at genome scale, so each gene in a completely sequenced genome can be associated with a series of numbers, for example: the transcript levels of the gene under various conditions or treatments (expression vector); the presence or absence of orthologs of the gene in other complete genomes (phyletic vector); the proteins that interact with the gene product (interaction vector); the occupancy of known cellular compartments and sites by the gene product (localization vector); and others.
We study ways of measuring distances between such vectors and develop procedures for the sensitive detection of similar vectors. With these tools, we can find groups of related vectors and study their higher-order organization, such as hierarchical or network properties. Surprisingly, known biochemical and signaling pathways and multiprotein complexes are almost never recovered as a single homogeneous cluster, whatever the vector space; most of them are split into many clusters in each vector space, which suggests that pathways are modular and that gene losses and displacements have occurred in the evolution of most pathways.
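As a minimal illustration of what measuring distances between gene vectors and searching for similar vectors can look like, the sketch below ranks genes by correlation distance (1 minus the Pearson correlation) between their expression vectors. The choice of distance measure, the function names, and the expression values are assumptions made for this example only; they are not taken from the work described above.

```python
import math


def correlation_distance(x, y):
    """Return 1 - Pearson correlation for two equal-length numeric vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector is treated as uncorrelated with anything.
        return 1.0
    return 1.0 - cov / math.sqrt(var_x * var_y)


def nearest_vectors(query, vectors, k=2):
    """Return the k genes whose vectors are closest to the query gene's vector."""
    ranked = sorted(
        (correlation_distance(vectors[query], vec), name)
        for name, vec in vectors.items()
        if name != query
    )
    return [(name, dist) for dist, name in ranked[:k]]


# Hypothetical expression vectors: made-up values for four conditions per gene.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

for name, dist in nearest_vectors("geneA", expression, k=3):
    print(f"{name}\t{dist:.3f}")
```

Correlation distance ignores the absolute scale of the measurements, which is often convenient for expression data; binary vectors such as phyletic profiles usually call for a different measure.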
Another direction of research is the comparative analysis of completely sequenced genomes and their gene products. Using information on sequence and structure similarities in existing proteins, we are trying to reconstruct the pathways and complexes that may have existed in the ancestors of modern life forms. We have also been involved in several genome annotation projects, such as comparative genomics of DNA-containing bacteriophages, functional annotation of predicted proteins encoded by the genome of the sea urchin Strongylocentrotus purpuratus, and analysis of the repertoire of small RNAs encoded by several eukaryotes.
We are also interested in phylogenetic inference in the presence of massive horizontal gene transfer between genomes. Bacteriophages are a good model for studying this difficult problem because of the fast sequence drift and the lack of omnipresent genes in phage genomes. We compiled profiles of the presence and absence of orthologous genes in completely sequenced phages and used these gene content vectors to infer the evolutionary history of phages with double-stranded DNA genomes. Conflicts between this phylogeny and trees constructed from sequence alignments of phage proteins were exploited to infer specific acts of intergenome gene transfer. Thus, the notoriously reticulate evolutionary history of fast-evolving phages can be reconstructed in considerable detail by quantitative comparative genomics. On a more general note, the high absolute number of horizontal transfer events is in sharp contrast with the relatively low proportion of genes that have been transferred more than once in their lifetime.
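To make the idea of comparing genomes by gene content concrete, the sketch below computes pairwise Jaccard distances between presence/absence profiles. The Jaccard measure, the phage names, and the profiles themselves are assumptions chosen for illustration; the text above does not state which distance or tree-building method was actually used.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |shared genes| / |genes present in either genome|."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of ortholog groups")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty profiles are treated as identical.
    return 0.0 if either == 0 else 1.0 - shared / either


def distance_matrix(profiles):
    """Return {(genome_i, genome_j): distance} for every unordered pair."""
    names = sorted(profiles)
    return {
        (p, q): jaccard_distance(profiles[p], profiles[q])
        for p, q in combinations(names, 2)
    }


# Hypothetical gene content vectors: 1 = ortholog present, 0 = absent.
gene_content = {
    "phage1": [1, 1, 1, 0, 0],
    "phage2": [1, 1, 0, 0, 0],
    "phage3": [0, 0, 1, 1, 1],
}

distances = distance_matrix(gene_content)
for (p, q), d in sorted(distances.items()):
    print(f"{p}\t{q}\t{d:.3f}")
```

A matrix like this could be passed to any distance-based tree-building method. Pairs of genomes whose position in the gene content tree disagrees with their position in trees built from protein alignments would then be candidates for closer inspection.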
Academic Appointment: Professor, Department of Microbiology, Molecular Genetics & Immunology, The University of Kansas School of Medicine
Bioinformatics Center Selected Publications
Book
Mushegian AR. Foundations of Comparative Genomics. Amsterdam; Boston: Academic Press; 2007.
Articles
Baumann D, Cook M, Ma L, Mushegian A, Sanders E, Schwartz J. A family of GFP-like proteins with different spectral properties in lancelet Branchiostoma floridae. Biol Direct. 2008 July 3;3:28.
Li H, Zhu D, Cook M. A statistical framework for consolidating "sibling" probe sets for Affymetrix GeneChip data. BMC Genomics 2008 Apr 24;9:188.
Mushegian AR. Gene content of LUCA, the last universal common ancestor. Frontiers in Bioscience 4657-4666, May 1, 2008.
Minakhin L, Goel M, Berdygulova Z, Ramanculov E, Florens L, Glazko G, Karamychev VN, Slesarev AI, Kozyavkin SA, Khromov I, Ackermann HW, Washburn M, Mushegian A, Severinov K. Genome comparison and proteomic characterization of Thermus thermophilus bacteriophages P23-45 and P74-26: siphoviruses with triplex-forming sequences and the longest known tails. J Mol Biol 2008 Apr 25;378(2):468-80. Epub 2008 Feb 15.
Savalia D, Westblade L, Goel M, Florens L, Kemp P, Akulenko N, Pavlova O, Padovan J, Chait B, Washburn M, Ackermann HW, Mushegian A, Gabisonia T, Molineux I, Severinov K. Genomic and proteomic analysis of phiEco32, a novel Escherichia coli bacteriophage. J Mol Biol 2008 Mar 28;377(3):774-89. Epub 2008 Jan 11.
Jones N, Lynn M, Gaudenz K, Sakai D, Aoto K, Rey J, Glynn E, Ellington L, Du C, Dixon J, Dixon M, Trainor P. Prevention of the neurocristopathy Treacher Collins syndrome through inhibition of p53 function. Nat Med. 2008 Feb;14(2):125-33. Epub 2008 Feb 3.
Roux M, Radeke M, Goel M, Mushegian A, Foltz K. 2DE identification of proteins exhibiting turnover and phosphorylation dynamics during sea urchin egg activation. Dev Biol. 2008 Jan 15;313(2):630-47. Epub 2007 Nov 13.
Glazko G, Makarenkov V, Liu J, Mushegian A. Evolutionary history of bacteriophages with double-stranded DNA genomes. Biol Direct. 2007 Dec 6;2:36.
Li H, Gao G, Li J, Page G, Zhang K. Detecting epistatic interactions contributing to human gene expression using the CEPH family data. BMC Proceedings. 2007, 1(Suppl 1):S67.
Zhu D, Hero AO 3rd. Bayesian hierarchical model for large-scale covariance matrix estimation. J Comput Biol. 2007 Dec;14(10):1311-26.
Pan L, Chen S, Weng C, Call G, Zhu D, Tang H, Zhang N, Xie T. Stem cell aging is controlled both intrinsically and extrinsically in the Drosophila ovary. Cell Stem Cell. 1, 458-469, October 2007.
Emmert-Streib F. The chronic fatigue syndrome: a comparative pathway analysis. J Comput Biol. 2007;14:961-972.
Emmert-Streib F, Mushegian A. A topological algorithm for identification of structural domains of proteins. BMC Bioinformatics. 2007;8:237.
Zhu D, Li Y, Li H. Multivariate correlation estimator for inferring functional relationships from replicated genome-wide data. Bioinformatics. 2007;23:2298-2305.
Sandell LL, Sanderson BW, Moiseyev G, Johnson T, Mushegian A, Young K, Rey JP, Ma JX, Staehling-Hampton K, Trainor PA. RDH10 is essential for synthesis of embryonic retinoic acid and is required for limb, craniofacial, and organ development. Genes Dev. 2007;21:1113-1124.
Banks CA, Kong SE, Spahr H, Florens L, Martin-Brown S, Washburn MP, Conaway JW, Mushegian A, Conaway RC. Identification and Characterization of a Schizosaccharomyces pombe RNA Polymerase II Elongation Factor with Similarity to the Metazoan Transcription Factor ELL. J Biol Chem. 2007;282:5761-5769.
Paoletti AC, Parmely TJ, Tomomori-Sato C, Sato S, Zhu D, Conaway RC, Conaway JW, Florens L, Washburn MP. Quantitative proteomic analysis of distinct mammalian Mediator complexes using normalized spectral abundance factors. Proc Natl Acad Sci U S A. 2006;103:18928-18933.
Dequeant ML, Glynn E, Gaudenz K, Wahl M, Chen J, Mushegian A, Pourquie O. A complex oscillating network of signaling genes underlies the mouse segmentation clock. Science. 2006;314:1595-1598.
Naryshkina T, Liu J, Florens L, Swanson SK, Pavlov AR, Pavlova NV, Inman R, Minakhin L, Kozyavkin SA, Washburn M, Mushegian A, Severinov K. Thermus thermophilus Bacteriophage φYS40 Genome and Proteomic Characterization of Virions. J Mol Biol. 2006;364:667-677.
Bradham CA, Foltz KR, Beane WS, Arnone MI, Rizzo F, Coffman JA, Mushegian A, Goel M, Morales J, Geneviere AM, Lapraz F, Robertson AJ, Kelkar H, Loza-Coll M, Townley IK, Raisch M, Roux MM, Lepage T, Gache C, McClay DR, Manning G. The sea urchin kinome: A first look. Dev Biol. 2006;300:180-193.
Goel M, Mushegian A. Intermediary metabolism in sea urchin: The first inferences from the genome sequence. Dev Biol. 2006;300:282-292.
Florens L, Carozza MJ, Swanson SK, Fournier M, Coleman MK, Workman JL, Washburn MP. Analyzing Chromatin Remodeling Complexes Using Shotgun Proteomics and Normalized Spectral Abundance Factors. Methods. 2006;40:303-311.
The Sea Urchin Genome Consortium. The genome of the sea urchin Strongylocentrotus purpuratus. Science. 2006;314:941-952.
Zybailov B, Mosley AL, Sardiu ME, Coleman MK, Florens L, Washburn MP. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J Proteome Res. 2006;5:2339-2347.
Jaspersen SL, Martin AE, Glazko G, Giddings TH, Jr., Morgan G, Mushegian A, Winey M. The Sad1-UNC-84 homology domain in Mps3 interacts with Mps2 to connect the spindle pole body with the nuclear envelope. J Cell Biol. 2006;174:665-675.
Rai R, Mushegian A, Makarova K, Kashina A. Molecular dissection of arginyltransferases guided by similarity to bacterial peptidoglycan synthases. EMBO Rep. 2006;7:800-805.
Glazko G, Coleman M, Mushegian A. Similarity searches in genome-wide numerical data sets. Biol Direct. 2006;1:13.
Liu J, Glazko G, Mushegian A. Protein repertoire of double-stranded DNA bacteriophages. Virus Res. 2006;117:68-80.
Glynn EF, Chen J, Mushegian AR. Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics. 2006;22:310-316.
Jin J, Cai Y, Yao T, Gottschalk AJ, Florens L, Swanson SK, Gutierrez JL, Coleman MK, Workman JL, Mushegian A, Washburn MP, Conaway RC, Conaway JW. A mammalian chromatin remodeling complex with similarities to the yeast INO80 complex. J Biol Chem. 2005;280:41207-41212.
Minakhin L, Semenova E, Liu J, Vasilov A, Severinova E, Gabisonia T, Inman R, Mushegian A, Severinov K. Genome Sequence and Gene Expression of Bacillus anthracis Bacteriophage Fah. J Mol Biol. 2005;354:1-15.
Kozbial PZ, Mushegian AR. Natural history of S-adenosylmethionine-binding proteins. BMC Struct Biol. 2005;5:19.
Mushegian AR. Protein content of minimal and ancestral ribosome. RNA. 2005;11.