We use computational methods to learn about the structure, function, and evolution of genes, proteins, and entire genomes. One goal of our research is to develop a framework for the analysis of genome-wide numeric data. Many properties can now be measured at genome scale, so each gene in a completely sequenced genome can be associated with a series of numbers, for example: the levels of its transcripts under various conditions or treatments (expression vector); the presence or absence of its orthologs in other complete genomes (phyletic vector); the proteins that interact with its product (interaction vector); the occupancy of known cellular compartments and sites by its product (localization vector); and others.
We study ways of measuring distances between such vectors and develop procedures for sensitive detection of similar vectors. With these tools we can find groups of related vectors and study their higher-order organization, such as hierarchical or network properties. Surprisingly, known biochemical and signaling pathways and multiprotein complexes are almost never recovered as a single homogeneous cluster, whatever the vector space. Most of them are split into many clusters in each vector space, indicating that pathways are modular and that gene losses and gene displacements have occurred in the evolution of most pathways.
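To make the idea of distances between gene vectors concrete, here is a minimal sketch. The two measures shown (Jaccard distance for binary phyletic vectors, one minus Pearson correlation for real-valued expression vectors) are common choices assumed for illustration only; the text does not say which measures the group uses. The gene names and vectors are hypothetical.

```python
from math import sqrt


def jaccard_distance(a, b):
    """Jaccard distance between two binary (presence/absence) vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def pearson_distance(a, b):
    """1 - Pearson correlation, for real-valued vectors such as expression."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation is undefined for a constant vector
    return 1.0 - cov / sqrt(var_a * var_b)


def nearest(query, vectors, distance):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = ((name, distance(query, v)) for name, v in vectors.items())
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical phyletic vectors: 1 = ortholog present in that genome.
phyletic = {
    "geneA": [1, 1, 0, 1, 0, 1],
    "geneB": [1, 1, 0, 1, 0, 0],
    "geneC": [0, 0, 1, 0, 1, 0],
}

hits = nearest(phyletic["geneA"], phyletic, jaccard_distance)
for name, dist in hits:
    print(f"{name}\t{dist:.2f}")
```

In this toy data, geneB shares most of its presence pattern with geneA and ranks just behind the query itself, while geneC has a complementary pattern and ranks last. Passing a different distance function to `nearest` changes the notion of similarity without changing the search.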
Another direction of our research is the comparative analysis of completely sequenced genomes and their gene products. Using information on sequence and structure similarities among existing proteins, we are trying to reconstruct the pathways and complexes that may have existed in the ancestors of modern life forms. We have also been involved in several genome annotation projects, such as comparative genomics of DNA-containing bacteriophages, functional annotation of predicted proteins encoded by the genome of the sea urchin Strongylocentrotus purpuratus, and analysis of the repertoire of small RNAs encoded by several eukaryotes.
We are also interested in phylogenetic inference in the presence of massive horizontal gene transfer between genomes. Bacteriophages are a good model for studying this difficult problem because of their fast sequence drift and the lack of omnipresent genes in phage genomes. We compiled profiles of the presence and absence of orthologous genes in completely sequenced phages and used these gene content vectors to infer the evolutionary history of phages with double-stranded DNA genomes. Conflicts between this phylogeny and trees constructed from sequence alignments of phage proteins were exploited to infer specific acts of intergenome gene transfer. Thus, the notoriously reticulate evolutionary history of fast-evolving phages can be reconstructed in considerable detail by quantitative comparative genomics. On a more general note, the high absolute number of horizontal transfer events contrasts sharply with the relatively low proportion of genes that have been transferred more than once in their lifetime.
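As a rough illustration of building a tree from gene content vectors, the sketch below computes pairwise distances between genomes and clusters them. The text does not say which distance or tree-building method was used; Jaccard distance and UPGMA are assumed here purely because they are simple, and the phage names and profiles are hypothetical.

```python
from itertools import combinations


def gene_content_distance(a, b):
    """Jaccard distance between two presence/absence gene-content vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if union == 0 else 1.0 - shared / union


def upgma(dist, names):
    """Cluster by average linkage (UPGMA); return the topology as a Newick string.

    dist maps frozenset({name1, name2}) to the distance between two genomes.
    Branch lengths are omitted to keep the sketch short.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes)) + ";"


# Hypothetical profiles: one column per orthologous gene family, 1 = present.
profiles = {
    "P1": [1, 1, 1, 0, 0, 0, 0],
    "P2": [1, 1, 1, 1, 0, 0, 0],
    "P3": [0, 0, 0, 1, 1, 1, 1],
    "P4": [0, 0, 0, 0, 1, 1, 0],
}

dist = {
    frozenset((a, b)): gene_content_distance(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}
tree = upgma(dist, list(profiles))
print(tree)
```

A tree of this kind can then be compared with trees built from alignments of individual proteins; a gene whose own tree disagrees with the gene content tree is a candidate for horizontal transfer.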
Academic Appointment: Professor, Department of Microbiology, Molecular Genetics & Immunology, The University of Kansas School of Medicine
Selected publications
Book
Mushegian AR. Foundations of Comparative Genomics. Amsterdam; Boston: Academic Press; 2007.
Articles
Emmert-Streib F. The chronic fatigue syndrome: a comparative pathway analysis. J Comput Biol. 2007;14:961-972.
Emmert-Streib F, Mushegian A. A topological algorithm for identification of structural domains of proteins. BMC Bioinformatics. 2007;8:237.
Zhu D, Li Y, Li H. Multivariate correlation estimator for inferring functional relationships from replicated genome-wide data. Bioinformatics. 2007;23:2298-2305.
Sandell LL, Sanderson BW, Moiseyev G, Johnson T, Mushegian A, Young K, Rey JP, Ma JX, Staehling-Hampton K, Trainor PA. RDH10 is essential for synthesis of embryonic retinoic acid and is required for limb, craniofacial, and organ development. Genes Dev. 2007;21:1113-1124.
Banks CA, Kong SE, Spahr H, Florens L, Martin-Brown S, Washburn MP, Conaway JW, Mushegian A, Conaway RC. Identification and Characterization of a Schizosaccharomyces pombe RNA Polymerase II Elongation Factor with Similarity to the Metazoan Transcription Factor ELL. J Biol Chem. 2007;282:5761-5769.
Paoletti AC, Parmely TJ, Tomomori-Sato C, Sato S, Zhu D, Conaway RC, Conaway JW, Florens L, Washburn MP. Quantitative proteomic analysis of distinct mammalian Mediator complexes using normalized spectral abundance factors. Proc Natl Acad Sci U S A. 2006;103:18928-18933.
Dequeant ML, Glynn E, Gaudenz K, Wahl M, Chen J, Mushegian A, Pourquie O. A complex oscillating network of signaling genes underlies the mouse segmentation clock. Science. 2006;314:1595-1598.
Naryshkina T, Liu J, Florens L, Swanson SK, Pavlov AR, Pavlova NV, Inman R, Minakhin L, Kozyavkin SA, Washburn M, Mushegian A, Severinov K. Thermus thermophilus Bacteriophage φYS40 Genome and Proteomic Characterization of Virions. J Mol Biol. 2006;364:667-677.
Bradham CA, Foltz KR, Beane WS, Arnone MI, Rizzo F, Coffman JA, Mushegian A, Goel M, Morales J, Geneviere AM, Lapraz F, Robertson AJ, Kelkar H, Loza-Coll M, Townley IK, Raisch M, Roux MM, Lepage T, Gache C, McClay DR, Manning G. The sea urchin kinome: A first look. Dev Biol. 2006;300:180-193.
Goel M, Mushegian A. Intermediary metabolism in sea urchin: The first inferences from the genome sequence. Dev Biol. 2006;300:282-292.
Florens L, Carozza MJ, Swanson SK, Fournier M, Coleman MK, Workman JL, Washburn MP. Analyzing Chromatin Remodeling Complexes Using Shotgun Proteomics and Normalized Spectral Abundance Factors. Methods. 2006;40:303-311.
The Sea Urchin Genome Consortium. The genome of the sea urchin Strongylocentrotus purpuratus. Science. 2006;314:941-952.
Zybailov B, Mosley AL, Sardiu ME, Coleman MK, Florens L, Washburn MP. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J Proteome Res. 2006;5:2339-2347.
Jaspersen SL, Martin AE, Glazko G, Giddings TH, Jr., Morgan G, Mushegian A, Winey M. The Sad1-UNC-84 homology domain in Mps3 interacts with Mps2 to connect the spindle pole body with the nuclear envelope. J Cell Biol. 2006;174:665-675.
Rai R, Mushegian A, Makarova K, Kashina A. Molecular dissection of arginyltransferases guided by similarity to bacterial peptidoglycan synthases. EMBO Rep. 2006;7:800-805.
Glazko G, Coleman M, Mushegian A. Similarity searches in genome-wide numerical data sets. Biol Direct. 2006;1:13.
Liu J, Glazko G, Mushegian A. Protein repertoire of double-stranded DNA bacteriophages. Virus Res. 2006;117:68-80.
Glynn EF, Chen J, Mushegian AR. Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics. 2006;22:310-316.
Jin J, Cai Y, Yao T, Gottschalk AJ, Florens L, Swanson SK, Gutierrez JL, Coleman MK, Workman JL, Mushegian A, Washburn MP, Conaway RC, Conaway JW. A mammalian chromatin remodeling complex with similarities to the yeast INO80 complex. J Biol Chem. 2005;280:41207-41212.
Minakhin L, Semenova E, Liu J, Vasilov A, Severinova E, Gabisonia T, Inman R, Mushegian A, Severinov K. Genome Sequence and Gene Expression of Bacillus anthracis Bacteriophage Fah. J Mol Biol. 2005;354:1-15.
Kozbial PZ, Mushegian AR. Natural history of S-adenosylmethionine-binding proteins. BMC Struct Biol. 2005;5:19.
Mushegian AR. Protein content of minimal and ancestral ribosome. RNA. 2005;11.