Liliana Florea
Assistant Professor
McKusick-Nathans Institute of Genetic Medicine
733 N Broadway, BRB 449
Johns Hopkins University School of Medicine
Baltimore, MD 21205
Ph: (443) 287-5624
E-mail: florea [at] jhu.edu
Accelerated sequencing of human and other genomes is creating vast amounts of genomic sequence data that must be interpreted to determine biologically meaningful features, often by leveraging information from a related genome. We are developing algorithms and tools for aligning and annotating genomic sequences. These tools make it possible to identify and characterize commonalities and differences at either local or genome-wide scale, to predict genes and other functional elements, and to infer evolutionary relationships between species by exploring patterns of sequence rearrangement and variation.
Relevant software: sim4, ESTmapper, ATAC, sim4cc, sim4db, gencomp, Enterix
Alternative splicing, by which genes create multiple mRNA and protein isoforms by selecting different combinations of exons in a cell-specific fashion, can potentially explain how a limited set of genes can produce a large repertoire of proteins, ultimately contributing to the evolution of species. In human and other species, aberrant splicing has been associated with a number of diseases, including cancers. We are developing methods for cataloguing alternative splicing variation in the genes of a species or an individual by combining heterogeneous sequence data produced with various sequencing technologies (conventional Sanger ESTs, as well as 454 and RNA-seq reads). In parallel, we use our methods to explore the regulatory mechanisms that control splicing and to identify disease markers.
Relevant software: AIR, sim4, sim4db, ESTmapper, ASprofile, CLASS
RNA editing produces post-transcriptional changes in the sequence of an RNA molecule, which can lead to proteins not encoded in the genome sequence or alter regulatory and structural motifs. Only two types of changes were known until recently: A-to-I, mediated by the ADAR proteins, and C-to-T, the signature of the APOBEC proteins. Next-generation sequencing promises to reveal much more variation, but it also makes it significantly harder to distinguish true variants from errors. We develop methods to identify and characterize such RNA-DNA differences from RNA-seq data, and to draw a comprehensive picture of the RNA editome.
Relevant software: rddChecker