Below are some of the current CBB students and their research.
Song is studying how outbred stock mice with pedigree information contribute to admixture mapping.
Rotations
| Kei-Hoi Cheung | Web service technology to interoperate biological databases and analyze gene clustering |
| Mark Gerstein | Statistical methods for preprocessing and scoring tiling microarray data |
| Hongyu Zhao | Statistical issues in mapping quantitative trait loci for gene expression levels |
Publications:
Kevin is developing a method to accurately determine atomic coordinates for backbone atoms from low-resolution RNA crystal structures. To do this, he is using both a reduced representation of RNA developed by the Pyle lab and an RNA backbone rotamer library developed by the Richardson lab at Duke University. His goal is to get accurate information about the reduced representation from the electron density, and then determine the appropriate rotamer from this reduced representation data.
Rotations
| Anna Pyle | Comparing and analyzing two methods to examine RNA backbone structure: a reduced representation and a rotamer library |
| Mark Gerstein | Analyzing protein hinges, or large inter-domain motions in proteins |
| Kevin White | Attempting to develop microarray probes that could be used across multiple species of Drosophila |
Publications:

Jill's research focuses on the analysis and integration of multiple, large-scale datasets including epigenomic and genomic data from cancer tissues and cell lines. In particular, she is exploring the interplay between DNA methylation and gene expression, and the effect of demethylating chemotherapeutic agents on this relationship. She is interested in combining high-resolution, genome-wide molecular data with clinical information to ask questions with direct translational relevance.
Rotations
| David Tuck | Using in silico modeling of tissue micro-heterogeneity to determine whether interaction between diverse clones of cells can lead to tumorigenesis |
| Paul Lizardi | Studying the mechanisms of isothermal whole-genome amplification using in silico modeling |
| Michael Krauthammer | Quantifying the strength of relationships between pathological terms and genes based on statistics of co-occurrence in the literature |
Publications:
In one collaboration, Hugo is examining how genetic variation between different strains of Saccharomyces cerevisiae leads to quantitative differences in the binding of transcription factors. He is also surveying pseudogenes in metagenomic data by investigating their distribution across protein families in prokaryotes from different geographical locations. This project seeks to determine how different environments and nutritional factors affect the number of pseudogenes, categorized by their parent protein families. In a third project, he is developing a pipeline for analyzing motifs from SH3 domains using comparative genomics, structural, and genomic approaches.
Rotations
| Perry Miller | Using Web Ontology Language (OWL) to integrate two neuronal databases, CoCoDat and SenseLab |
| Mark Gerstein | Working on microarray data optimization |
| Michael Snyder | Analysis of transcription factors for pseudohyphal growth in different yeast strains |
Publications:
ThaiBinh's work involves mining through biomedical literature in order to map instances of gene strings and diseases to a repository of gene/disease identifiers. The aim of this process is to more easily classify research papers and identify relevant papers for researchers.
Rotations
| Hongyu Zhao | Establishing a database for a large microarray data set used to test the effects of drugs and toxicants on rat organs |
| Michael Krauthammer | Analyzing several methods of term mapping in order to identify terms found in biological abstracts |
| David Tuck | Classification of transcription factors in PubMed abstracts |
Publications:
Laura is using the complementary approaches of statistical methods and pharmacokinetic modeling to explore possible mechanisms underlying alcohol dependence.
Rotations
| Hongyu Zhao | Evaluating the performance of HapGraph, a program that determines the dependence among genetic loci, by testing it on SNP data from the International HapMap Project |
| Joe Chang | Analyzing data from the Multiple Crime Study, which reports on an isolated population in Russia where individuals have committed multiple crimes. Laura used IBD analysis to determine whether any of the genetic markers are linked to mental health or behavioral traits. |
| Kenneth Kidd | Evaluation of markers to cluster people into ethnic populations |
Pavi uses microarray datasets from cancer patients to find patterns of drug sensitivity. She uses melanoma datasets to find genes playing an important role in the disease.
Rotations
| Michael Krauthammer | Analyzing protein interaction networks using graph theory algorithms to find subnetworks of disease genes |
| Mark Gerstein | Building a web interface for Primer3, a program that designs primers for PCR |
| Michael Snyder | Creating an abstract framework for tagging experimental data |
Chong currently works on mapping 5' UTR sequences in yeast by processing data from large-scale 5' RACE experiments. The project attempts to find annotation errors in gene translation start codon positions and original sequencing errors. It also works towards the development of a complete map of 5' UTR sequences in all yeast transcripts.
Rotations
| Mark Gerstein | Examined yeast regulatory networks to find the targets of essential transcription factors |
| Michael Snyder | Performed ChIP-chip experiments on yeast Pol-2 transcription factor to find regulatory binding sites. Also produced 3 biological replicates and submitted data into the UCSC database. |
| Hongyu Zhao | Inferred protein-protein interacting domains using high-throughput data from diverse organisms |
Publications:
The focus of Sebastian's research is the design of microarray chips that detect patterns of genomic methylation, e.g. in cancer versus normal tissues. He hopes that his analysis of the data derived from experiments using those chips will help to functionally annotate the uncharted genomic regions, known as the "junk" DNA.
Rotations
| David Tuck | Creating a simulation of the DNA damage response pathway in yeast using differential equations and agent-based frameworks |
| Paul Lizardi | Developing and testing novel microarray normalization methods, analyzing methylation changes across the human genome, and determining the potential role of repetitive elements in the human genome |
| Michael Krauthammer | Created custom gene ontologies based on preprocessed literature from PubMed |
Publications:
Jamie's research has been focused on understanding the targeting mechanisms of activation-induced cytidine deaminase (AID), which is responsible for somatic hypermutation in germinal center B-cells. Her lab has recently been working to identify cis-regulatory modules that are responsible for recruiting AID to the immunoglobulin loci and other recently identified genes. The goal is to identify why some genes are targets of AID and others are not, and additionally why some of the mutated genes are repaired in an error-free manner while others are repaired in an error-prone manner.
Rotations
| David Tuck | Investigated the differences between breast cancer subtypes using microarray data |
| Annette Molinaro | Began the initial setup of a data adaptive system for analysis of tissue microarrays |
| Steven Kleinstein | Analyzed mutations occurring in non-immunoglobulin genes |
Publications:
Jia's research has been focused on genome-wide association studies. One way to obtain higher power for detecting statistically significant associations between SNPs (single nucleotide polymorphisms) and disease status is to perform a summary analysis on several combined studies. This approach is referred to as meta-analysis. However, the challenge in meta-analysis is to achieve comparability between studies. Jia's current research involves exploring various possible approaches to performing meta-analysis on combined sets of Crohn's disease case-control studies while incorporating different imputation methods to expand the sample size. In addition, as part of his research, he is also hoping to find ways to account for population structure when combining datasets.
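As a minimal illustration of the summary-analysis idea described above, the following Python sketch pools per-study effect estimates for a single SNP using a fixed-effect, inverse-variance-weighted meta-analysis. This is a generic textbook approach and is not necessarily the method used in Jia's research; the function name and the input values in the usage note are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates (e.g. log odds ratios) for one SNP.

    betas: per-study effect estimates (hypothetical inputs)
    ses:   per-study standard errors, all strictly positive
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and the same length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

For example, `fixed_effect_meta([0.20, 0.15, 0.30], [0.10, 0.08, 0.15])` combines three hypothetical studies into one pooled estimate. A fixed-effect model assumes all studies estimate the same underlying effect, which is exactly the comparability assumption the paragraph above identifies as the hard part.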
Rotations
| William Jorgensen | 3D-docking a ligand library containing 24,000 ligands into the tautomerase site of Macrophage Migration Inhibitory Factor |
| Hongyu Zhao | Analyzed the data and investigated the function of the p38 pathway at the molecular level |
| Kei Cheung | Implemented a web interface that allows users to upload/convert a tab-delimited text file |
The aim of Karen's research is to develop methods for discovering patterns in high-dimensional data, specifically in survival data. She is studying non-parametric algorithms for partitioning observations based on their covariate values with the aim of minimizing the residual sum of squares for each partition. She has extended partDSA (the partitioning Deletion/Substitution/Addition algorithm) to accommodate censored survival data by implementing the Inverse Probability of Censoring Weighting (IPCW) scheme.
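To make the idea of partitioning observations to minimize the residual sum of squares concrete, here is a minimal Python sketch that searches for the best single split on one covariate. It is not partDSA and it does not handle censoring; it only illustrates the objective described above, and the function names are hypothetical.

```python
def rss(values):
    """Residual sum of squares of values around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold t on covariate x that minimizes
    RSS(y | x <= t) + RSS(y | x > t).

    Returns (threshold, total_rss); threshold is None if no split
    improves on leaving all observations in one partition.
    """
    pairs = sorted(zip(x, y))
    best_threshold, best_total = None, rss(list(y))
    for i in range(1, len(pairs)):
        # Only split between distinct covariate values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yv for _, yv in pairs[:i]]
        right = [yv for _, yv in pairs[i:]]
        total = rss(left) + rss(right)
        if total < best_total:
            best_threshold, best_total = threshold, total
    return best_threshold, best_total
```

Applying such a search recursively, or over several covariates, yields a tree-like partition; algorithms in this family differ mainly in which moves they allow when refining the partition.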
Rotations
| Paul Lizardi | Studied the basis for hypermethylation and hypomethylation in CpG islands |
| Steven Kleinstein | Modeled mutations of B-cells using a discrete stochastic model so that the number of mutations in each B-cell could be tracked |
| Annette Molinaro | Focused on the challenge of missing data imputation when employing non-parametric search algorithms |
Publications:
Michael works on developing new techniques, algorithms, and software to efficiently handle the complexity of modeling large and multiscale biological systems. His particular emphasis is on stochastically simulating biochemical reaction networks that are generally intractable using traditional simulation methods. He is applying his new techniques to model the bacterial chemotaxis system in order to study how single cells and populations of cells process information and communicate as they navigate complex environments.
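For readers unfamiliar with stochastic simulation of reaction networks, the sketch below shows Gillespie's direct method, one of the traditional simulation approaches, applied to a toy birth-death system (constant production, first-order degradation). It is not Michael's technique; the function name, rate constants, and default seed are hypothetical and chosen only for illustration.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n)
    of a single molecular species with Gillespie's direct method.

    Returns (times, counts): the event times and the molecule count
    after each event, starting from (0.0, n0).
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * n      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break              # no reaction can fire any more
        # The waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts
```

Because every reaction event is simulated individually, the cost of this method grows with the number of events, which is one reason large, multiscale networks can become impractical to simulate this way.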
Rotations
| Steven Kleinstein | Developed new computational techniques and software to statistically characterize white blood cell trafficking that was imaged in lymph nodes of live mice |
| Michael Snyder | Worked on microarray-based experiments to study how differences in transcription factor binding between several strains of yeast affect observed phenotype |
| Thierry Emonet | Created a stochastic model of the bacterial flagellar motor and used it to study how slow fluctuations in the chemotaxis signaling system affect the swimming behavior of single cells |
Publications:
Emmett works on creating system models of breast cancer pathology, with a focus on HER2+ breast cancers. He is currently investigating copy number variations in different patients, as well as HER2+ breast cancer cell lines.
Rotations
| Mark Gerstein | Investigated the related network features of bottlenecks |
| David Tuck | Developed a software tool to help with network analysis |
| Steven Kleinstein | Investigated the combined effects of IFN-Lambda with IFN-alpha or IFN-gamma on IFN-stimulated gene expression and Hepatitis C Virus replication in hepatocytes |
Publications:

Mohamed's research involves computational analysis of the immune system. Specifically, he is studying Immunoglobulin (Ig) receptor sequences and lineage trees.
Rotations
| Kenneth Kidd | Created interactive simulations to model several population genetics principles |
| Steven Kleinstein | Investigated lineage trees of the B-cell populations for various selection values |
| Perry Miller/Hongyu Zhao | Identified significant genes associated with Age-related Macular Degeneration using a simple GWAS analysis |
Publications:
Pedro is studying yeast genes that are essential to its quiescent state by integrating various data sources, such as gene expression studies, sub-cellular protein localization, and protein-protein interaction networks. He is also working on demonstrating the important relationships between quiescence and human neurodegenerative diseases.
Rotations
| Mark Gerstein | Studied neurodegenerative diseases |
| Michael Krauthammer | Focused on the use of text mining to find related articles |
| Michael Snyder | Validated the predictor for identifying genes in yeast that were essential to its G0 (quiescent) state |
Publications:
Ray is working on several projects, including 1) analyzing results from ChIP-Seq experiments to locate novel transcription factor binding site patterns in the human and worm genomes, 2) developing new methods to analyze the large datasets inherent in next-generation sequencing and metagenomics experiments, and 3) applying existing biological information/annotations to the results of ChIP-Seq experiments. He has also helped develop a new scoring method for ChIP-Seq data as well as a method to analyze data from barcoded libraries on the Illumina next-generation sequencing platform.
Rotations
| Mark Gerstein | Examined transcription factor binding site patterns in yeast |
| Michael Snyder | Analyzed the data produced by ChIP-Seq experiments |
| Perry Miller/Kei Cheung | Explored possible data representation methods/structures for high-throughput, next-generation sequencing data |
Publications:
Rotations
| Michael Snyder | Established three single-stranded cDNA libraries from different human cell lines for high-throughput 454 sequencing |
| Steven Kleinstein | Investigated the migration patterns of B-cells that are entering and exiting the germinal center during affinity maturation |
| Mark Gerstein | Mapped the transcriptome of the human genome using high-throughput sequencing |
Becky is studying comparative genomics among multiple species, including worm, fly, human and yeast. She is reviewing different ortholog resources that are available to help determine a final gene pair list. She is also looking at available tandem mass spectrometry data to try to determine a new way to confirm possible genome annotations.
Rotations
| Michael Krauthammer | Investigated detailed relationships between a gene and disease |
| Mark Gerstein | Examined comparative genomics of functional elements in C. elegans and C. briggsae |
| Michael Snyder | Used the ChIP-seq experimental method in order to identify transcription factor binding sites in C. elegans |
Publications:
Rotations
| Hongyu Zhao | Employed a method that uses both gene expression data and pathway information |
| Michael Snyder | Performed the ChIP procedure for the precipitation of binding sites of RNA polymerase II (pol II) and acetylated histone H4 (Ac-H4) |
| Paul Lizardi | Classified each known SVA sequence in the genome into the consensus sub-family to which it is best aligned |
| Annette Molinaro | Analyzed the lung cancer survival dataset to determine predictors (combination of variables) that determine life expectancy of patients |
Rotations
| Kenneth Kidd | Expanded upon the work by McQuillan et al. by using their method on the large genotyping dataset |
| Mark Gerstein | Examined the change in the numbers of ABC transporters among bacteria under various environmental conditions |
| Steven Kleinstein | Examined the process of affinity maturation of antibodies |
Xiaowei's research focuses on genome-wide association studies and the analysis of next-generation sequencing data. She is currently working on a cancer resequencing project to identify ovarian-cancer-related variations in 3' UTRs and microRNAs. She is also working with CLIP-seq data to find the RNAs that interact with the protein Lin-28.
Rotations
| Frank Slack | Identification of novel microRNAs and their targets in life span regulation |
| Tae Hoon Kim | Cross species analysis of CTCF-binding sites and CTCF purification |
| Hongyu Zhao | Crohn's disease genome-wide association study based on single SNP and haplotype analysis |
| Mark Gerstein | Metagenomic bacteria identification by characteristic 16S rRNA oligonucleotides |
Rotations
| James Noonan | Investigated the regulators to identify enhancers regulating microRNAs |
| David Tuck | Identified RIDGEs in breast cancer using microarray expression data |
| Mark Gerstein | Identified translocations in the human genome with paired-end (PE) sequencing data |
Lucas' current research involves investigating the saturation of human transcription factor binding sites (TFBSes) with chromatin immunoprecipitation sequencing (ChIP-seq) of human cell lines used in the ENCODE project. He has also worked on using expression levels of interacting gene networks to predict prostate cancer phenotypes.
Rotations
| Kei Cheung | Investigated the benefits and challenges involved in producing and maintaining a wiki for high-throughput sequencing (HTS) data |
| Michael Snyder | Performed RNA-Seq on two species of yeast: Saccharomyces bayanus and Saccharomyces mikatae |
| Mark Gerstein | Investigated gene networks that allow discernment between prostate cancer patients |
Haisu is interested in the analysis of human gene coexpression networks using microarray data. She is also involved in the DREAM4 project to reconstruct gene regulatory networks from simulated steady-state and time-series data.
Rotations
| Steven Kleinstein | Worked on a differential equation model to identify top regulators of IRF2 using continuous-time expression data |
| Mark Gerstein | Analyzed pseudogene evolution using a maximum likelihood method |
| Hongyu Zhao | Worked on the construction of gene coexpression networks using GENEVAR data of HapMap samples |
Kelly is looking for microRNA targets in invasive micropapillary breast cancer. He is developing a new method for gene regulatory network reconstruction to find novel interactions in primitive erythropoiesis.
Rotations
| Mark Gerstein | Examined an Alzheimer gene expression set |
| David Tuck | Studied the dynamic Bayesian analysis of primitive erythrocyte gene expression |
| Thierry Emonet | Worked to add functionality for modeling systems with discrete physical compartments |
Rotations
| Michael Snyder | Worked on RNA sequencing technology, applying it to study yeast mating mechanisms and to compare transcriptomes among different yeast species |
| James Noonan | Identified conserved elements with primate specific substitutions |
| Mark Gerstein | Studied allele-specific expression by RNA-Seq |
Dissertation: "Informatics Approaches to Translational Research: Management and Analysis of Clinical and High Density Genomic Data"
Dissertation: "Statistical Modeling of Biological Interactions in Eukaryotes using Genomics and Proteomics Data"
Dissertation: "Tiling Microarray Informatics"
Dissertation: "Mapping Regulatory Networks Using Genomic and Proteomic Approaches"
Dissertation: "Genomic Studies on Nuclear Receptor-Mediated Transcriptional Networks in Breast Cancer Cells"
Dissertation: "Mining Biological Complexity: Cross Integration of Large-Scale Metagenomics, Environmental, and Chemical Datasets"
Dissertation: "Integration of Genomic Data to Identify Genes and Pathways Associated with Disease"
Dissertation: "High-throughput Methods in Computer-aided Drug Design Pertaining to Flexibility, Selectivity and Lipophilicity"