The development of several high-throughput Omics fields (genomics, transcriptomics, proteomics, metabolomics) over the past few years has already yielded a wealth of data on the successive stages in the flow of biological information (from genes to proteins, and on to their metabolic functions). Yet the sheer volume of this information is itself a challenge: making sense of it is no longer straightforward. Initial data processing, along with the quality control of the results it produces, has to be automated. Storage and retrieval of multi-experiment data require a dedicated informatics infrastructure. On a more global level, dissemination of the (published) data to the scientific community requires publicly accessible, domain-specific repositories. Finally, integration of the results obtained across the different domains remains very much an ongoing research effort in the life sciences.
Throughout the technological maturation of the Omics fields, one of the key roles of bioinformatics has been to analyze the information after it has been obtained from an experiment: the so-called data-driven approach. As the various fields have matured, however, targeted methodologies that start from more focused questions are again gaining prominence. Accordingly, bioinformatics analyses have to evolve into computational planning approaches, in which the bulk of the informatics effort is expended before the experiment is run.
This exciting transition is in turn a prerequisite for collecting sufficiently comprehensive and reliable data to allow the fine-tuning of systems biology models of reactions and pathways. Indeed, modeling efforts today are restricted in part by the limited amount of available data, which results in poor coverage of the genes, proteins, or metabolites involved. Other components of the models, such as catalogs of protein-protein interactions, often face the converse problem of plentiful but sometimes noisy data.
We therefore focus on three key points, aimed at enabling systems biology modeling:
- Data collection and integration across the various Omics fields
- (Semi-)automatic quality control of the obtained data using configurable expert systems
- Development of computational Omics to help set up and guide experiments based on a user-supplied list of target entities
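
To make the second point more concrete, the following is a minimal sketch of what a configurable, rule-based quality control step might look like. It is an illustration under stated assumptions, not a description of any existing system: the `Rule` class, the `classify` function, and the metric names `identification_score` and `mass_error_ppm` are all hypothetical, as are the threshold values. The "review" outcome stands in for the semi-automatic case, where a result is handed to a human expert rather than decided automatically.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    """One configurable check on a single numeric metric (hypothetical schema)."""
    metric: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violated_by(self, record: Dict[str, float]) -> bool:
        value = record.get(self.metric)
        if value is None:
            return True  # a missing metric is treated as a failed check
        if self.minimum is not None and value < self.minimum:
            return True
        if self.maximum is not None and value > self.maximum:
            return True
        return False


def failed_metrics(record: Dict[str, float], rules: List[Rule]) -> List[str]:
    """Return the names of all metrics whose rule the record violates."""
    return [rule.metric for rule in rules if rule.violated_by(record)]


def classify(record: Dict[str, float], rules: List[Rule]) -> str:
    """Accept if no rule fails, reject if every rule fails, else flag for review."""
    failures = failed_metrics(record, rules)
    if not failures:
        return "accept"
    if len(failures) == len(rules):
        return "reject"
    return "review"  # partial failure: defer to a human expert


# Illustrative configuration; metric names and thresholds are assumptions.
RULES = [
    Rule("identification_score", minimum=20.0),
    Rule("mass_error_ppm", minimum=-10.0, maximum=10.0),
]

if __name__ == "__main__":
    good = {"identification_score": 35.0, "mass_error_ppm": 2.5}
    borderline = {"identification_score": 12.0, "mass_error_ppm": 2.5}
    for record in (good, borderline):
        print(classify(record, RULES), failed_metrics(record, RULES))
```

Keeping the rules as plain data, separate from the code that evaluates them, is one way to make such a system configurable: thresholds can be adjusted per experiment or per instrument without touching the evaluation logic.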