Current Students and Research

Below are some of the current CBB students and their research.

Pedro Alves - Gerstein lab (entered Fall 2007)

Rotations

Mark Gerstein The study of neurodegenerative diseases
Michael Krauthammer Focused on the use of text mining to find related articles
Michael Snyder Validated the predictor for identifying genes in yeast that were essential to its G0 (quiescent) state
Raymond Auerbach - Gerstein lab/Snyder lab (entered Fall 2007)

Rotations

Mark Gerstein Examined transcription factor binding site patterns in yeast
Michael Snyder Analyzed the data produced by ChIP-Seq experiments
Perry Miller/Kei Cheung Explored possible data representation methods/structures for high-throughput, next-generation sequencing data
David Ballard - Zhao lab (entered Fall 2004)

David is developing statistical and computational methods to better identify gene expression quantitative trait loci (eQTL) underlying complex disease. Once the eQTL are identified, he wants to relate them to clinical traits in order to establish (causative) gene networks. He is also applying biological and pathway information to genome-wide association studies. The objective is to develop new methods that use known biological information to prioritize SNPs for association testing.

Rotations

Mark Gerstein Statistical approaches to filtering noise from signal data in microarray experiments
Kevin White Several studies of Drosophila genomes across eight species
Perry Miller Creation of a web application to do single SNP analysis
Jamie Duke - Kleinstein lab (entered Fall 2006)

Jamie’s research has focused on understanding the targeting mechanisms of activation-induced cytidine deaminase (AID), which is responsible for somatic hypermutation in germinal center B-cells. Her lab has recently been working to identify cis-regulatory modules that are responsible for recruiting AID to the immunoglobulin loci and other recently identified genes. The goal is to identify why some genes are targets of AID and others are not, and why some of the mutated genes are repaired in an error-free manner while others are repaired in an error-prone manner.

Rotations

David Tuck Investigated the differences between breast cancer subtypes using microarray data
Annette Molinaro Began the initial setup of a data adaptive system for analysis of tissue microarrays
Steven Kleinstein Analyzed mutations occurring in non-immunoglobulin genes
Tara Gianoulis - Gerstein lab/Snyder lab (entered Fall 2003)

Tara studies the link between a small molecule's phenotypic effect and its structural characteristics. Specifically, she uses a variety of machine learning techniques to learn more about this relationship.

Rotations

Michael Snyder Investigated the binding partners of the putative S. cerevisiae transcription factor Mga1 using ChIP-chip, an experimental technique that can identify targets of transcription factors
Mark Shlomchik Developed an automated method to construct phylogenetic trees from the sequence data of cells undergoing affinity maturation
Mark Gerstein Measured pathway "disregulation" in microarray data
Lukas Habegger - Gerstein lab/Snyder lab (entered Fall 2007)

Rotations

Michael Snyder Established three single-stranded cDNA libraries from different human cell lines for high-throughput 454 sequencing
Steven Kleinstein Investigated the migration patterns of B-cells that are entering and exiting the germinal center during affinity maturation
Mark Gerstein Mapped the transcriptome of the human genome using high-throughput sequencing
Sujun Hua - White lab (entered Fall 2002)
(completing his work with Kevin White at the University of Chicago)

Sujun researches steroid hormone signaling pathways and breast cancer. Specifically, his research focuses on:

  1. identifying components of the estrogen signaling pathways in breast cancer cells
  2. understanding the genetic and epigenetic aspects of the hormone response network in breast cancer cells
  3. attempting to understand the pathways involved in breast cancer progression using genome-wide RNAi screening
Song Huang - Zhao lab (entered Fall 2004)

Song is studying how outbred stock mice with pedigree information contribute to admixture mapping.

Rotations

Kei-Hoi Cheung Web service technology to interoperate biological databases and analyze gene clustering
Mark Gerstein Statistical methods for preprocessing and scoring tiling microarray data
Hongyu Zhao Statistical issues in mapping quantitative trait loci for gene expression levels
Jia Kang (entered Fall 2006)

Jia’s research has focused on genome-wide association studies. One approach to gaining power to detect statistically significant associations between SNPs (single nucleotide polymorphisms) and disease status is to perform a summary analysis of several combined studies, an approach referred to as meta-analysis. The challenge in meta-analysis, however, is achieving comparability between studies. Jia's current research explores approaches to performing meta-analysis on combined sets of Crohn’s disease case-control studies while incorporating different imputation methods to expand the sample size. As part of this research, he also hopes to find ways to account for population structure when combining datasets.

Rotations

William Jorgensen 3D-docking a ligand library containing 24,000 ligands into the tautomerase site of Macrophage Migration Inhibitory Factor
Hongyu Zhao Analyzed the data and investigated the function of the p38 pathway at the molecular level
Kei Cheung Implemented a web interface that allows users to upload/convert a tab delimited text file
Kevin Keating - Pyle lab (entered Fall 2004)

Kevin is developing a method to accurately determine atomic coordinates for backbone atoms from low-resolution RNA crystal structures. To do this, he is using both a reduced representation of RNA developed by the Pyle lab and an RNA backbone rotamer library developed by the Richardson lab at Duke University. His goal is to get accurate information about the reduced representation from the electron density, and then determine the appropriate rotamer from this reduced representation data.

Rotations

Anna Pyle Comparing and analyzing two methods to examine RNA backbone structure: a reduced representation and a rotamer library
Mark Gerstein Analyzing protein hinges, or large inter-domain motions in proteins
Kevin White Attempting to develop microarray probes that could be used across multiple species of Drosophila
Hugo Lam - Gerstein lab (entered Fall 2005)

In one collaboration, Hugo is examining how genetic variation between different strains of Saccharomyces cerevisiae leads to quantitative differences in the binding of transcription factors. He is also surveying pseudogenes in metagenomics by investigating their distribution by protein family across different geographical locations in prokaryotes. This project seeks to determine how different environments and nutritional factors affect the quantity of pseudogenes, categorized by their parent protein families. In a third project, he is developing a pipeline for analyzing motifs from SH3 domains using comparative genomics, structural, and genomic approaches.

Rotations

Perry Miller Using Web Ontology Language (OWL) to integrate two neuronal databases, CoCoDat and SenseLab
Mark Gerstein Working on microarray data optimization
Michael Snyder Analysis of transcription factors for pseudohyphal growth in different yeast strains
Karen Lostritto - Molinaro lab (entered Fall 2006)

The aim of Karen’s research is to develop methods for discovering patterns in high-dimensional data, specifically in survival data. She is studying non-parametric algorithms for partitioning observations based on their covariate values with the aim of minimizing the residual sum of squares for each partition. She has extended the partDSA (partitioning Deletion Substitution Addition algorithm) to accommodate censored survival data by implementing the Inverse Probability Censoring weighting scheme.

Rotations

Paul Lizardi Studied the basis for hypermethylation and hypomethylation in CpG islands
Steven Kleinstein Modeled mutations of B-cells using a discrete stochastic model so that the number of mutations in each B-cell could be tracked
Annette Molinaro Focused on the challenge of missing data imputation when employing non-parametric search algorithms
ThaiBinh Luong - Krauthammer lab (entered Fall 2005)

ThaiBinh’s work involves mining through biomedical literature in order to map instances of gene strings and diseases to a repository of gene/disease identifiers. The aim of this process is to more easily classify research papers and identify relevant papers for researchers.

Rotations

Hongyu Zhao Establishing a database for a large microarray data set used to test the effects of drugs and toxicants on rat organs
Michael Krauthammer Analyzing several methods of term mapping in order to identify terms found in biological abstracts
David Tuck Classification of transcription factors in PubMed abstracts
Laura Mustavich - Kidd lab/Zhao lab (entered Fall 2005)

Laura is using the complementary approaches of statistical methods and pharmacokinetic modeling to explore possible mechanisms underlying alcohol dependence.

Rotations

Hongyu Zhao Evaluating the performance of HapGraph, a program which determines the dependence among genetic loci, by testing it on SNP data from the International HapMap Project
Joe Chang Analyzing data from the Multiple Crime Study, which reports on an isolated population in Russia where individuals have committed multiple crimes. Laura used IBD analysis to determine whether any of the genetic markers are linked to mental health or behavioral traits.
Kenneth Kidd Evaluation of markers to cluster people into ethnic populations
Sara Nichols - Jorgensen lab (entered Fall 2003)

Sara researches protein structure sampling algorithms for applications such as structure determination and protein-protein or protein-ligand interactions. Her research focuses on methods to explore the energy surface more efficiently than standard Monte Carlo sampling algorithms. She is interested in secondary structure initiation, side chain sampling, and global optimization for protein folding.

Rotations

Mark Gerstein Helping to develop a server to characterize helix-helix interactions in proteins, taking special interest in interactions involving proteins that sit in the lipid bilayer
Andrew Miranker Looking at a small model peptide that aggregates to determine if it aggregates in silico as well as in vitro
Bill Jorgensen Setting up a folding simulation for a small beta hairpin protein using all-atom simulations in implicit water
Rebecca Robilotto - Gerstein lab (entered Fall 2007)

Rotations

Michael Krauthammer Investigated detailed relationships between a gene and disease
Mark Gerstein Examined comparative genomics of functional elements in C. elegans and C. briggsae
Michael Snyder Used the ChIP-seq experimental method in order to identify transcription factor binding sites in C. elegans
Jill Rubinstein - Lizardi lab (entered Fall 2004)
(now in MD-PhD program)

Jill's research focuses on epigenetic markers, such as methylation. In particular, she studies the mapping and analysis of these markers on a genome-wide scale. The main goals of this work are:

  1. to explore the potential of epigenetic data to screen for disease susceptibility and predict drug response
  2. to create a tool that helps to explain the epigenetic mechanisms that regulate chromatin conformation and gene expression. This tool uses zinc finger proteins designed to bind to target sequences only where the DNA is open and accessible.

Rotations

David Tuck Using in silico modeling of tissue micro-heterogeneity to determine whether interaction between diverse clones of cells can lead to tumorigenesis
Paul Lizardi Studying the mechanisms of isothermal whole-genome amplification using in silico modeling
Michael Krauthammer Quantifying the strength of relationships between pathological terms and genes based on statistics of co-occurrence in the literature
Pavithra Shivakumar - Krauthammer lab (entered Fall 2005)

Pavi works with protein-protein interaction networks to find disease-related clusters and genes within them. She uses methods such as dimensionality reduction and diffusion to analyze these networks. She also works with microarray data from melanoma patients to find related genes.

Rotations

Michael Krauthammer Analyzing protein interaction networks using graph theory algorithms to find subnetworks of disease genes
Mark Gerstein Building a web interface for Primer3, a program that designs primers for PCR
Michael Snyder Creating an abstract framework for tagging experimental data
Chong Shou - Gerstein lab/Snyder lab (entered Fall 2005)

Chong currently works on mapping 5' UTR sequences in yeast by processing data from large-scale 5' RACE experiments. The project attempts to find annotation errors in gene translation start codon positions and original sequencing errors. It also works towards the development of a complete map of 5' UTR sequences in all yeast transcripts.

Rotations

Mark Gerstein Examined yeast regulatory networks to find the targets of essential transcription factors
Michael Snyder Performed ChIP-chip experiments on yeast Pol-2 transcription factor to find regulatory binding sites. Also produced 3 biological replicates and submitted data into the UCSC database.
Hongyu Zhao Inferred protein-protein interacting domains using high-throughput data from diverse organisms
Michael Sneddon - Emonet lab (entered Fall 2006)

Michael works on developing new techniques, algorithms, and software to efficiently handle the complexity of modeling large and multiscale biological systems. His particular emphasis is on stochastically simulating biochemical reaction networks that are generally intractable using traditional simulation methods. He is applying his new techniques to model the bacterial chemotaxis system in order to study how single cells and populations of cells process information and communicate as they navigate complex environments.

Rotations

Steven Kleinstein Developed new computational techniques and software to statistically characterize white blood cell trafficking that was imaged in lymph nodes of live mice
Michael Snyder Worked on microarray based experiments to study how differences in transcription factor binding between several strains of yeast affect observed phenotype
Thierry Emonet Created a stochastic model of the bacterial flagellar motor and used it to study how slow fluctuations in the chemotaxis signaling system affect the swimming behavior of single cells
Emmett Sprecher - Tuck lab (entered Fall 2006)

Emmett works on creating system models of breast cancer pathology, with a focus on HER2+ breast cancers. He is currently investigating copy number variations in different patients, as well as HER2+ breast cancer cell lines.

Rotations

Mark Gerstein Investigated the related network features of bottlenecks
David Tuck Developed a software tool to help with network analysis
Steven Kleinstein Investigated the combined effects of IFN-lambda with IFN-alpha or IFN-gamma on IFN-stimulated gene expression and Hepatitis C Virus replication in hepatocytes
Sebastian Szpakowski - Lizardi lab/Krauthammer lab (entered Fall 2005)

The focus of Sebastian’s research is the design of microarray chips that detect patterns of genomic methylation, e.g. in cancer versus normal tissues. He hopes that his analysis of the data derived from experiments using those chips will help to functionally annotate the uncharted genomic regions known as "junk" DNA.

Rotations

David Tuck Creating a simulation of the DNA damage response pathway in yeast using differential equations and agent-based frameworks
Paul Lizardi Developing and testing novel microarray normalization methods, analyzing methylation changes across the human genome, and determining the potential role of repetitive elements in the human genome
Michael Krauthammer Created custom gene ontologies based on preprocessed literature from PubMed
Taiwo Togun - Molinaro lab (entered Fall 2007)

Rotations

Hongyu Zhao Employed a method that uses both gene expression data and pathway information
Michael Snyder Performed the ChIP procedure for the precipitation of binding sites of RNA polymerase II (pol II) and acetylated histone H4 (Ac-H4)
Paul Lizardi Classified each known SVA sequence in the genome into the consensus sub-family to which it is best aligned
Annette Molinaro Analyzed the lung cancer survival dataset to determine predictors (combination of variables) that determine life expectancy of patients
Mohamed Uduman - Kleinstein lab (entered Fall 2006)

Mohamed’s research involves computational analysis of the immune system. Specifically, he is studying Immunoglobulin (Ig) receptor sequences and lineage trees.

Rotations

Kenneth Kidd Created interactive simulations to model several population genetics principles
Steven Kleinstein Investigated lineage trees of the B-cell populations for various selection values
Perry Miller/Hongyu Zhao Identified significant genes associated with Age-related Macular Degeneration using a simple GWAS analysis
Xiaowei Zhu - Snyder lab (entered Fall 2002)

Xiaowei's research focuses on two main areas:

  1. studying transcription factors in yeast, particularly complexes of these factors. He wishes to predict combinatorial modules and further characterize their regulatory role.
  2. data analysis in protein microarrays. He is designing a scaling algorithm for noise subtraction and signal normalization, and also developing data mining techniques for biological networks and disease profiling datasets.

CBB Graduates

Valentin Dinu - PhD, May 2007
Assistant Professor, Biomedical Informatics, Arizona State University

Valentin was the first graduate of the CBB program, receiving his PhD in May 2007. His research focused on informatics issues involved in the analysis of SNP data as it relates to determining the genetic basis of disease. He developed statistical algorithms for association analysis of genomic data, with a focus on pathway-based analysis. Valentin also developed algorithms for pivoting clinical data stored in Entity-Attribute-Value modeled databases and investigated their performance. In addition, he investigated the genomic coverage and copy number polymorphism capabilities of multiple microarray platforms.

Yin Liu - PhD, December 2007
Assistant Professor, Department of Neurobiology and Anatomy, The University of Texas Medical School at Houston
Adjunct Assistant Professor, Department of Biomedical Engineering, The University of Texas at Austin

Yin worked on developing statistical methods to analyze large-scale genomics and proteomics data and applied these methods to study biological problems. In particular, she developed statistical approaches for genome-wide protein interaction prediction and protein complex identification in yeast. She integrated protein interaction data and gene expression data from microarray chips to reconstruct signal transduction pathways. Yin also worked on the identification of allelic association between genetic variations located on different chromosomes using human HapMap project data.

Tom Royce - PhD, December 2007
Bioinformatics Scientist, Illumina, San Diego, California

Tom studied the Affymetrix and NimbleGen microarray technologies. Specifically, he worked to quantify the sources of signal variability within tiling microarray experiments. Through studying tiling microarrays, Tom’s goal was to improve the measurement accuracy of these technologies and broaden the scope of their application.