The development of several high-throughput Omics fields (genomics, transcriptomics, proteomics, metabolomics) over the past few years has already resulted in a wealth of data on the various stages in the flow of biological information (from genes to proteins, and on to their metabolic functions). Yet this large amount of information is also challenging: making sense of such volumes of data is no longer straightforward. Initial data processing to obtain results, along with the required quality control of these results, has to be automated. Additionally, storage and retrieval of multi-experiment data require a specific informatics infrastructure. On a more global level, dissemination of the (published) data to the scientific community requires the construction of publicly accessible, domain-specific repositories. Finally, integration of the various results obtained across the different domains remains very much an ongoing research effort in the life sciences.
Throughout the technological maturation process of the Omics fields, one of the key roles of bioinformatics has been to analyze the information after it was obtained from an experiment: the so-called data-driven approach. As the various fields have matured, however, targeted methodologies that start with more focused questions are again gaining prominence. Accordingly, bioinformatics analyses have to morph into computational planning approaches, where the brunt of the informatics effort is expended prior to running the experiment.
This exciting transition is in turn a prerequisite to collecting sufficiently comprehensive and reliable data to allow the fine-tuning of systems biology models of reactions and pathways. Indeed, modeling efforts today are in part restricted by the limited amount of available data, resulting in poor coverage of the genes, proteins, or metabolites involved. Other aspects of the models, such as catalogs of protein-protein interactions, often have to contend with the converse problem: data that are available in quantity but are sometimes noisy.
We therefore focus on three key points, aimed at enabling systems biology modeling:
- Data collection and integration across the various Omics fields
- (Semi-)automatic quality control of the obtained data using configurable expert systems
- Development of computational Omics to help set up and guide experiments based on a user-supplied list of target entities
ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Vizcaino J, Deutsch E, Wang R, Csordas A, Reisinger F, Rios D, Dianes J, Sun Z, Farrah T, Bandeira N, Binz P, Xenarios I, Eisenacher M, Mayer G, Gatto L, Campos A, Chalkley R, Kraus H, Albar J, Martinez-Bartolomé S, Apweiler R, Omenn G, Martens L, Jones A, Hermjakob H. NATURE BIOTECHNOLOGY, 32, 223-6, 2014

The first comprehensive and quantitative analysis of human platelet protein composition allows the comparative analysis of structural and functional pathways. Burkhart J, Vaudel M, Gambaryan S, Radau S, Walter U, Martens L, Geiger J, Sickmann A, Zahedi R. BLOOD, 120, e73-82, 2012

mzML--a Community Standard for Mass Spectrometry Data. Martens L, Chambers M, Sturm M, Kessner D, Levander F, Shofstahl J, Tang W, Rompp A, Neumann S, Pizarro A, Montecchi-Palazzi L, Tasman N, Coleman M, Reisinger F, Souda P, Hermjakob H, Binz P, Deutsch E. MOLECULAR & CELLULAR PROTEOMICS, 10, R110 000133, 2011

PhD: Ghent University, Ghent, Belgium, 2006
Postdoc: EMBL-EBI, Cambridge, UK, 2006-09
VIB Group leader since October 2009