I recently got involved in a project where the team I'm working with is analyzing molecular, physiological, and behavioral changes in different Alzheimer's disease (AD) mouse models. As a result, I've been spending a meaningful amount of time mining datasets that vary widely in scope and focus. Some studies examine only transcriptomic data (typically bulk RNA-seq from specific brain regions), while others focus on proteomics, epigenomics, or a combination of omics types, often alongside cognitive tests or other assessments of disease progression.
The data I've worked with are diverse, but consistent patterns crop up often enough to leave me wondering whether we're really extracting all the insight we can from individual experiments. Working on this project has also helped me better appreciate the strengths and limitations of single-omics studies, which will be the focal point of this article, as well as the potential for integrating multiple omics data types in a single experiment to gain deeper insights.
Strengths and Limitations of Single-Omics Studies
In many cases, a single type of omics data is sufficient to address a specific experimental question. For instance, if the primary aim of a study is to investigate gene expression, identify novel transcripts, or analyze alternative splicing, transcriptomic data alone can provide the necessary insights. However, I've observed a tendency in some studies to overinterpret transcriptomic findings. Often, researchers conduct differential expression analyses between a control and experimental group, perform functional enrichment to infer biological processes that are over-represented in one group versus the other, and consider their work complete. This approach, while informative, offers an incomplete picture of the underlying biology and can lead to incorrect assumptions about what's really going on.
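To make the pattern concrete, here's a minimal sketch of that standard workflow in Python, run on made-up expression data rather than a real dataset. A production analysis would use a dedicated tool like DESeq2 or edgeR, but the shape is the same: test each gene, correct for multiple comparisons, rank by significance.

```python
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.multitest import multipletests

# Simulated log2 expression matrix: 1,000 genes x 6 samples per group.
rng = np.random.default_rng(42)
genes = [f"gene_{i}" for i in range(1000)]
control = pd.DataFrame(rng.normal(5, 1, (1000, 6)), index=genes)
treated = pd.DataFrame(rng.normal(5, 1, (1000, 6)), index=genes)

# Differential expression: Welch's t-test per gene, control vs. treated.
t_stats, p_vals = stats.ttest_ind(control, treated, axis=1, equal_var=False)

# Correct for multiple testing with Benjamini-Hochberg FDR.
reject, q_vals, _, _ = multipletests(p_vals, alpha=0.05, method="fdr_bh")

# Rank genes by adjusted significance; functional enrichment would
# typically be run downstream on the significant set.
results = pd.DataFrame({
    "log2_fc": treated.mean(axis=1) - control.mean(axis=1),
    "p_value": p_vals,
    "q_value": q_vals,
    "significant": reject,
}, index=genes)
print(results.sort_values("q_value").head())
```

A table like `results` is where many transcriptomics studies stop, and, as argued above, where the deeper questions actually begin.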
Take, for instance, my previous project, Investigating Transcriptional Changes in Alzheimer's Disease Using a 3xTg-AD Mouse Model. In this project, I analyzed bulk RNA-sequencing data from the insular cortex of the 3xTg-AD mouse model at five discrete time points. While the analysis uncovered meaningful transcriptional changes and recapitulated cognitive findings reported in the literature, it also raised new questions that could not be answered with transcriptomic data alone. For example, the lack of complementary omics data, such as proteomics, left gaps in understanding how these transcriptional changes translated to functional protein production or cellular behavior, let alone cognitive decline.
This underscores a broader issue: while identifying overexpressed genes or pathways offers valuable clues, such findings are speculative without evidence of corresponding epigenomic, protein, or metabolite-level changes or connections to cognitive deficits and pathophysiological outcomes. To fully understand disease mechanisms, an integrated, multi-omics approach is often essential.
Why Can't We Infer Effects From Causes?
At first glance, transcriptomic data should correlate with proteomic outcomes. After all, mRNA serves as a blueprint for proteins, reflecting the cell's preparatory state for protein synthesis. However, this relationship is far from straightforward. First off, mRNA molecules vary in stability, and their degradation rates can affect protein production. Translation efficiency also differs between mRNA species, meaning that mRNA abundance doesn't translate directly into protein abundance.
Additionally, proteins often undergo post-translational modifications (e.g., phosphorylation, glycosylation) that alter their function, stability, and cellular localization, none of which transcriptomics can capture. Finally, transcriptomics captures all RNA types, including non-coding RNAs that regulate gene expression rather than encode proteins, which further complicates predictions about proteomic outcomes. Given these complexities, while transcriptomics can offer predictive insights, it cannot replace proteomics or other omics data types. Instead, these data types must complement each other to form a coherent picture.
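When matched measurements do exist, the mRNA-protein gap is easy to quantify. Below is a small sketch on simulated paired data (the gene count, noise levels, and lognormal assumptions are all mine, purely for illustration) showing how a rank correlation can be used to check concordance between the two layers.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_genes = 500

# Simulated matched measurements for the same 500 genes.
mrna = rng.lognormal(mean=2.0, sigma=1.0, size=n_genes)

# Protein levels track mRNA only loosely: gene-specific translation
# efficiency and turnover add multiplicative noise.
efficiency = rng.lognormal(mean=0.0, sigma=0.8, size=n_genes)
protein = mrna * efficiency

rho, p = spearmanr(mrna, protein)
print(f"mRNA-protein Spearman rho = {rho:.2f} (p = {p:.1e})")
```

Even this toy model yields a correlation well below 1, and published comparisons of matched transcriptomes and proteomes generally report moderate correlations for exactly the reasons described above.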
The Case for Multi-Omics Integration
In Methodologies of Multi-Omics Data Integration and Data Mining, Ning et al. emphasize that relying on a single type of data analysis is inherently limited because it often captures only reactive processes—what is happening in the system—without providing insights into the underlying causes of those processes. By integrating multiple omics layers, researchers can bridge these gaps, enabling them to move beyond surface-level observations to uncover the causal mechanisms driving biological changes. This approach facilitates a deeper understanding of how molecular events interact across different scales, from gene expression to protein function and beyond, ultimately revealing pathways and targets that could be leveraged for therapeutic interventions.
However, merely collecting all omics data at a single time point does not solve the problem. Different omics layers operate on distinct time scales, as discussed in a previous article, Connecting the Dots: A Guide to Multi-Omics Data. For example:
Genomics: Reflects what a cell can do, based on its DNA blueprint.
Epigenomics and Transcriptomics: Capture what the cell is preparing to do by regulating and initiating gene expression.
Proteomics: Shows what the cell has been doing, revealing proteins produced in response to recent needs.
Metabolomics: Provides a snapshot of what the cell is doing right now, reflecting its metabolic state.
Phenomics: Integrates these molecular layers into observable traits, revealing the biological outcomes of underlying processes.
This concept can be summarized with the following figure:
To untangle causality, it’s essential to collect omics data at multiple time points—what we call temporal multi-omics. This approach enables us to link past transcriptomic and epigenomic states to future proteomic changes, connect past transcriptomic and proteomic profiles to current metabolic states, correlate these molecular timelines with phenotypic changes to understand disease progression, and more.
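As a toy illustration of what "linking past transcriptomic states to future proteomic changes" can look like computationally, the sketch below compares a same-time correlation against a lagged one on simulated trajectories. The one-step lag and noise model are assumptions chosen for demonstration, not derived from any real dataset.

```python
import numpy as np
from scipy.stats import pearsonr

# Simulated abundance trajectories for one gene, measured at evenly
# spaced time points in a temporal multi-omics design.
rng = np.random.default_rng(1)
timepoints = 12
mrna = np.cumsum(rng.normal(0, 1, timepoints))                # transcript levels
protein = np.roll(mrna, 1) + rng.normal(0, 0.3, timepoints)   # lags mRNA by one step
protein[0] = protein[1]  # patch the wrap-around from np.roll

# Compare same-time correlation against a one-step lagged correlation.
same_time, _ = pearsonr(mrna, protein)
lagged, _ = pearsonr(mrna[:-1], protein[1:])  # mRNA at t vs. protein at t+1
print(f"same-time r = {same_time:.2f}, lagged r = {lagged:.2f}")
```

A stronger lagged correlation is consistent with transcription preceding translation, which is exactly the kind of signal a single-time-point design can never expose.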
Temporal multi-omics holds immense promise, but it's not without challenges. Collecting and integrating data across multiple time points is resource-intensive, both in terms of cost and computational complexity. Additionally, the data's sheer volume requires advanced tools for integration and analysis. Machine learning and systems biology approaches can help address these challenges, enabling researchers to model the dynamic relationships between omics layers and predict disease trajectories.
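On the tooling side, one common entry point for integrating two omics layers is a latent-variable method such as canonical correlation analysis. The sketch below uses scikit-learn's CCA on simulated, matched transcriptomic and proteomic matrices (the sample counts, feature counts, and shared latent signal are all invented); dedicated multi-omics frameworks build on this same basic move of projecting multiple layers into a shared space.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(2)
n_samples = 40

# Simulated matched samples measured on two omics layers:
# 200 transcripts and 100 proteins per sample, driven in part
# by a shared 3-dimensional latent biological signal.
shared = rng.normal(size=(n_samples, 3))
X_rna = shared @ rng.normal(size=(3, 200)) + rng.normal(0, 0.5, (n_samples, 200))
X_prot = shared @ rng.normal(size=(3, 100)) + rng.normal(0, 0.5, (n_samples, 100))

# Project both layers into a shared low-dimensional space.
cca = CCA(n_components=3)
rna_scores, prot_scores = cca.fit_transform(X_rna, X_prot)

# The correlation of the first canonical pair indicates how much
# signal the two layers share.
r = np.corrcoef(rna_scores[:, 0], prot_scores[:, 0])[0, 1]
print(f"first canonical correlation = {r:.2f}")
```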
Despite these hurdles, temporal multi-omics is already showing its value. For instance, studies in cancer research have used multi-omics time-course data to identify biomarkers of tumor progression, revealing how early transcriptional changes influence later metabolic and phenotypic outcomes. Similarly, in neurodegenerative diseases like AD, temporal multi-omics could illuminate how early epigenomic and transcriptomic disruptions cascade into protein aggregation, metabolic dysfunction, and cognitive decline.
To fully harness the potential of temporal multi-omics, we need to optimize data collection by prioritizing key time points to balance depth and feasibility, invest in computational methods for data integration and causal inference, and link findings to clinical outcomes, such as identifying drug targets or biomarkers for early diagnosis. None of these challenges is trivial, but by adopting these strategies, temporal multi-omics can provide a clearer, more integrated understanding of disease mechanisms, driving progress in research and treatment.
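As one concrete example of the causal-inference piece, a simple first pass on time-course data is a Granger-style test of whether one layer's past values improve prediction of another's present. The sketch below applies statsmodels' grangercausalitytests to simulated trajectories; note that the 100-point time course is far denser than any realistic animal study, which is precisely why prioritizing time points matters.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(3)
t = 100  # simulated dense time course; real designs have far fewer points

# Transcript levels as a stationary random series; protein levels
# driven by mRNA two steps earlier, plus measurement noise.
mrna = rng.normal(size=t)
protein = 0.8 * np.roll(mrna, 2) + rng.normal(0, 0.5, size=t)
protein[:2] = protein[2]  # patch the wrap-around from np.roll

# Test whether past mRNA helps predict current protein.
# Column order matters: the test asks if column 2 Granger-causes column 1.
data = np.column_stack([protein, mrna])
results = grangercausalitytests(data, maxlag=3)

# Extract the F-test at the true lag of 2.
f_stat, p_value, _, _ = results[2][0]["ssr_ftest"]
print(f"lag-2 Granger F = {f_stat:.1f}, p = {p_value:.1e}")
```

Granger-style tests establish predictive precedence, not mechanism, but on temporal multi-omics data they are a useful screen for which layer-to-layer links deserve experimental follow-up.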