The recent completion of the human genome project underlines the need for new computational and theoretical tools in modern biology. The tools are essential for analyzing, understanding and manipulating the detailed information on life we now have at our disposal.

Problems in computational molecular biology vary from understanding sequence data to the analysis of protein shapes, prediction of biological function, study of gene networks, and cell-wide computations.

Cornell has a university-wide plan in the science of genomics; the Department of Computer Science is playing a critical role in this initiative. Researchers in the computer science department are engaged in a wide range of computational biology projects, from genetic mapping to advanced sequence analysis, fold prediction, structure comparison algorithms, protein classification, comparative genomics, and long-time simulation of protein molecules.

Faculty and Researchers

Carla Gomes works on solutions to hard combinatorial problems, with an emphasis on planning and scheduling problems, combining techniques from Computer Science (CS), Artificial Intelligence (AI), and Operations Research (OR). Her research is leading to the creation of the new field of computational sustainability, which develops and applies computational methods to enable a sustainable environment, economy and society.

Alon Keinan studies how human genetic variation has arisen from evolutionary history, develops theoretical tools, and applies them to genomic data sets, bridging theoretical population genetics and empirical studies.

David Shmoys is studying approximate algorithms for genetic linkage mapping (identifying the locations of markers on the genome) to reduce the cost of wet lab experiments and improve the accuracy of the resulting maps.

Adam Siepel has worked on various problems in computational biology, including the detection of recombinant viruses, the reconstruction of evolutionary histories based on genome rearrangements, and the integration of heterogeneous bioinformatics software tools. His most recent work has been in comparative genomics, particularly of mammals, and has included a mixture of statistical modeling, algorithm development, software implementation, and scientific discovery. Adam likes to tackle problems of practical importance in genomics, such as gene finding and conserved element identification, using methods from machine learning and statistics. He is an active participant in several large-scale comparative genomics projects, including the Mammalian Gene Collection project, the ENCODE project, and the Rhesus Macaque Sequencing and Analysis Consortium.

Amy Williams' research interests include algorithms for large-scale genetic data, inference of demographic history, population and individual relationships, and studies of recombination and mutation.

Haiyuan Yu performs research in the broad area of Biomedical Systems Biology with both high-throughput experimental (see Yu et al., Science 2008) and integrative computational (see Wang et al., Nature Biotechnology 2012) methodologies, aiming to understand gene functions and their relationships within complex molecular networks and how perturbations to such systems may lead to various human diseases. The complexity of biological systems calls for building experimentally verified computational models based on high-quality large-scale datasets, which is truly the future of biomedical research and the main theme of the lab.


A new cross-college graduate program in Computational Molecular Biology was initiated with the participation of the computer science field.