Discovering Biomarkers — The Imprints of Health, Disease, and Performance
Predicting Disease Diagnosis, Speed Of Aging, And Human Performance
🧬 Discovering Biomarkers — The Imprints of Health, Disease, and Performance
At the time of writing this, in early 2024, the quest for new biomarkers and biomarker measurement techniques is just getting started. Between rapid improvements in hardware, a greater consumer demand for health monitoring, and widespread adoption of machine learning-based approaches to biomarker discovery, the landscape of biomarker usage is shifting rapidly under our feet.
As a co-founder at NNOXX, I have a unique perspective. My team and I spent most of 2021-2022 developing the first wearable device to measure muscle oxygenation and nitric oxide levels non-invasively and in real time, and throughout 2023, we saw this technology get adopted by NHL, NBA, FIFA, MLB, and Olympic teams, multiple university and private research labs, CrossFit Games athletes, and UFC fighters.
In my team's case at NNOXX, we knew what biomarkers we wanted to measure. Our challenge was figuring out how to do it, which required the coordinated efforts of optical engineers, physician-scientists, exercise physiologists, industrial designers, and software developers.
Now, imagine a different scenario where you start with an outcome goal, like predicting how patients will respond to a specific cancer treatment, but the ideal biomarker remains elusive. This is the challenge faced by cancer researchers, toxicologists, and bioinformaticians. Unlike in the aforementioned engineering-oriented endeavors, these scientists find themselves in a position where new biomarkers must be created or discovered to unravel the mysteries of disease and treatment response.
In this article, we delve into the diverse world of biomarkers, exploring their significance, discovery process, and the techniques employed in their identification. Let's dive in!
🧬 What Are Biomarkers? And What Forms Can They Take?
Biomarkers are well-defined in fields like physiology and exercise science. We have blood lactate, VO2, heart rate, muscle oxygenation, and more. However, the term biomarker takes on a more ambiguous connotation in fields like bioinformatics, where they are used to diagnose diseases and predict treatment efficacy.
In bioinformatics, a biomarker is defined as an indicator of a biological state, often in response to an intervention or disease state. These biomarkers may include phenotypes, physiological states, certain gene variants, and mRNA transcript levels. For example, hemoglobin A1C (HbA1C) has traditionally been used as a biomarker for diabetes because it reflects long-term blood glucose levels. When glucose circulates in the blood, a portion of it irreversibly binds to hemoglobin, and the quantity of this glycated hemoglobin is proportional to the average blood glucose concentration over the lifespan of red blood cells. As a result, HbA1C strongly predicts whether an individual is diabetic or pre-diabetic, as well as their five-year risk for developing type-2 diabetes, as demonstrated in the image below. However, diseases like cancer are much more complex, and as a result, it is not yet possible to predict someone’s risk or response to treatment with high fidelity based on a single biomarker.
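Because single-analyte biomarkers like HbA1C map onto diagnosis through simple, well-established cutoffs, their use can be captured in a few lines of code. Here's a toy Python sketch using the commonly cited 5.7% and 6.5% thresholds; it's an illustration of the idea, not diagnostic software.

```python
def classify_hba1c(hba1c_percent: float) -> str:
    """Map an HbA1C reading (%) onto the commonly cited diagnostic categories."""
    if hba1c_percent < 5.7:
        return "normal"
    elif hba1c_percent < 6.5:
        return "pre-diabetic"
    return "diabetic"

# Example readings (hypothetical)
for reading in [5.2, 6.0, 7.1]:
    print(f"HbA1C {reading}% -> {classify_hba1c(reading)}")
```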

Gene Expression Signatures As Biomarkers
An emerging discipline in bioinformatics uses next-generation sequencing technologies to generate massive gene expression data sets, enabling data-driven biomarker discovery. In a paper titled Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response, the investigators report that gene expression signatures, like the geometric average of RNA expression levels of the genes GZMA and PRF1, serve as biomarkers for predicting outcomes in patients undergoing immune checkpoint blockade (ICB) therapy [1]. Specifically, this metric quantifies cytolytic (i.e., cell-killing) activity, which often correlates with increased expression of antigen-presenting genes. Because tumors with higher antigen presentation are more likely to express immune checkpoint molecules, such as PD-1 or CTLA-4, and are therefore more susceptible to checkpoint inhibitors, signatures like this one work well for predicting ICB therapy success, provided certain conditions are met.
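To make the idea concrete, here is a minimal sketch of how a cytolytic-activity score like the one described above could be computed from an expression matrix. The sample values are hypothetical, and the small pseudocount is a common convenience to guard against zeros; only the definition (geometric mean of GZMA and PRF1 expression) comes from the paper.

```python
import numpy as np
import pandas as pd

# Hypothetical expression values (e.g., TPM) for a few tumor samples
expr = pd.DataFrame(
    {"GZMA": [12.0, 3.5, 28.1], "PRF1": [9.4, 2.1, 31.6]},
    index=["sample_1", "sample_2", "sample_3"],
)

# Cytolytic activity score: geometric mean of GZMA and PRF1 expression
cyt_score = np.exp(np.log(expr[["GZMA", "PRF1"]] + 0.01).mean(axis=1))
print(cyt_score)
```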
In addition to gene expression levels, metrics like tumor mutation burden, expressed as the number of mutations per megabase of DNA, can also be used as biomarkers to stratify patients' responses (with varying success) to treatments such as anti-PD-1 therapy, as demonstrated in the image below:
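Tumor mutation burden itself is just a ratio, so a back-of-the-envelope calculation looks like the sketch below; the mutation count and covered territory are made-up numbers for illustration.

```python
def tumor_mutation_burden(nonsynonymous_mutations: int, covered_bases: float) -> float:
    """Return mutations per megabase of sequenced DNA."""
    return nonsynonymous_mutations / (covered_bases / 1e6)

# Hypothetical panel: 350 mutations called over 30 Mb of covered exome
tmb = tumor_mutation_burden(350, 30e6)
print(f"TMB = {tmb:.1f} mutations/Mb")  # ~11.7 mutations/Mb
```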

Microbiota Signatures, Liquid Biopsy, and Imaging As Biomarkers
Other promising sources of data propelling cutting-edge biomarker discovery include microbiota signatures, liquid biopsy, and imaging modalities such as fMRI. For example, it's now possible to assess the microbiome composition of a stool sample and use the presence of certain bacterial strains as biomarkers of disease treatment efficacy, as has been done with Bacteroides fragilis and Bacteroides thetaiotaomicron, both of which predict patients' response to anticancer immunotherapy drugs blocking CTLA-4 [2].
Additionally, data from liquid biopsies, which can assay measures of circulating tumor DNA (ctDNA), have been used as biomarkers for tumor burden, as demonstrated in the image below. Notably, these liquid-biopsy-based biomarkers provide more sensitive methods of detecting certain types of malignancies than conventional approaches, making them of great interest.

It’s even possible to use fMRI-based imaging data to predict things like clinical dementia ratings, as demonstrated in a previous Decoding Biology article titled, Predicting Clinical Dementia Ratings From fMRI Data.
Biological Networks As Biomarkers
At this point, I’ve discussed multiple methods for measuring biomarkers, each with its benefits and tradeoffs. Now, what if we could use multiple biomarkers and biomarker measurement modalities in a single model for diagnosing disease or predicting treatment outcomes? In recent years, machine learning has been gaining rapid adoption in biomarker discovery using multi-modal data-based approaches. This has not only improved our ability to detect diseases in early stages, diagnose patients, and improve treatment outcomes — it’s also opened the door to using biological networks as biomarkers [3]. In essence, network analysis examines relationships between various disease features, where nodes represent disease-related phenotypes, gene expression levels, SNPs, blood-based metabolites, imaging features, and more. Describing the interrelationships between these different biomarkers helps uncover the intricate mechanisms of complex diseases by capturing how different biomarkers may influence one another causally. While the idea of networks as biomarkers has yet to be widely adopted, it holds immense potential and is likely to become increasingly sophisticated in the coming years.
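As a toy sketch of the idea (not the method used in the cited paper), one could build a correlation network over multi-modal features and summarize its structure with a library like networkx. Every feature name, the simulated data, and the correlation threshold below are hypothetical and chosen purely for illustration.

```python
import networkx as nx
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical multi-modal measurements on 100 patients; two features are deliberately related
gene_a = rng.normal(size=100)
data = pd.DataFrame({
    "gene_A_expr": gene_a,
    "metabolite_X": 0.7 * gene_a + rng.normal(scale=0.5, size=100),
    "snp_burden": rng.normal(size=100),
    "imaging_feature": rng.normal(size=100),
})

# Build a network: nodes are biomarkers, edges connect strongly correlated pairs
corr = data.corr().abs()
G = nx.Graph()
G.add_nodes_from(data.columns)
for i, a in enumerate(data.columns):
    for b in data.columns[i + 1:]:
        if corr.loc[a, b] > 0.5:  # arbitrary threshold for illustration
            G.add_edge(a, b, weight=round(corr.loc[a, b], 2))

# Network-level summaries (edges, degree) can themselves be treated as features
print("edges:", list(G.edges(data=True)))
print("degree per biomarker:", dict(G.degree()))
```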
🧬 Real-World Applications Of Biomarkers
Biomarker discovery stands at the forefront of transformative breakthroughs in research and biotechnology, offering valuable insights across various domains. From diagnosing diseases to optimizing athletic performance, tailoring drug responses, and unraveling the mysteries of aging, biomarkers serve as beacons guiding researchers and practitioners toward more precise, effective, and personalized approaches to healthcare and well-being. Let's explore how biomarkers are transforming our approach to these critical areas.
Disease Diagnosis and Precision Medicine
In the realm of disease diagnosis, biomarkers play a pivotal role in identifying and characterizing conditions such as cancer. For example, it's now possible to identify specific genetic mutations or protein markers in blood samples, which can aid in early cancer detection, enabling timely intervention and improved treatment outcomes. At the time of my writing this, one interesting company working in this space is Grail, which is developing an early cancer screening test using liquid biopsy for people without symptoms.
In addition to helping diagnose diseases, biomarkers empower precision medicine by predicting how patients will respond to specific treatments. Genetic markers, protein expressions, or metabolic signatures can indicate the likelihood of a positive response to a particular drug. This personalized approach minimizes trial and error, allowing physicians to tailor treatments based on a patient's unique biological profile. Additionally, biomarker-driven strategies enhance treatment efficacy, reduce side effects, and pave the way for more targeted and efficient therapeutic interventions.
Anti-Aging and Gerontology
Understanding the speed of aging is a complex but critical aspect of gerontology and anti-aging research. Biomarkers, ranging from telomere length to DNA methylation patterns, provide valuable insights into the aging process at a molecular level. Researchers leverage these biomarkers to assess the impact of different treatments or lifestyle interventions on aging. By monitoring changes in these molecular signatures over time, scientists gain a deeper understanding of how various factors influence the aging trajectory, paving the way for interventions that promote healthier aging and longevity. While I’m not aware of companies commercializing this technology at the moment, there’s been a lot of talk lately about DunedinPACE, a DNA methylation biomarker for the speed of aging, which, to the best of my knowledge, was popularized by public figures such as Bryan Johnson, who use it to gauge how different dietary interventions, supplements, and exercise routines impact their speed of aging.
Injury Prevention and Sports Performance
Biomarkers take center stage in human performance, where metrics like VO2, lactate levels, and muscle oxygenation offer a comprehensive view of an athlete's physiology and aid in personalizing training programs and predicting performance outcomes. This is an area near and dear to me as a co-founder at NNOXX, where my team and I are currently collaborating with labs and organizations using our technology for wide-ranging applications such as assessing how different garments impact energy expenditure, developing individualized protocols to improve heat tolerance, and predicting time to exhaustion in real time.
In addition to personalizing training programs, quantifying fitness levels, and predicting performance, biomarker measurements such as muscle oxygenation are commonly used to reduce injury risk in athletes. By assessing the balance between oxygen delivery and consumption in muscles and asymmetries between contralateral muscles, practitioners can identify potential issues and tailor training regimens to mitigate injury risks.
🧬 A Framework For Biomarker Discovery And Predictive Modeling
So far, we've explored what biomarkers are and the different forms they can take, the mechanics of biomarker discovery, and their diverse applications across research and biotech. Now, let's dive into a practical DIY guide for uncovering new biomarkers from multi-modal data.
Before discovering new biomarkers or creating predictive models using biomarker data, you must first establish a clear outcome goal. Whether your goal is to diagnose disease, predict treatment outcomes, or enhance athletic performance, you need to ensure your data is aligned with your objective. The quality and relevance of your data can make or break your biomarker discovery endeavor. As the saying goes, garbage in, garbage out. Opting for high-quality, relevant, and ecologically valid data will make all your efforts much more likely to succeed.
In this guide, we unveil a framework for biomarker discovery and predictive modeling, illuminating each step from data loading to making predictions.
Step 1: Loading Data
The journey to biomarker discovery begins with data – the raw material from which insights are forged. Loading data into our analysis environment marks the inception of our quest for biomarkers. The data sources for biomarker development are as varied as the outcomes they aim to predict, ranging from raw sequencing data to wearable device metrics, blood-based biomarkers, and gene expression patterns.
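In practice, this step often amounts to a few lines of pandas. The file name and the quick checks below are placeholders for whatever your actual data source happens to be.

```python
import pandas as pd

# Hypothetical CSV export of biomarker measurements, one row per subject
df = pd.read_csv("biomarker_measurements.csv")

print(df.shape)   # rows x columns
print(df.head())  # quick sanity check of the first few records
print(df.dtypes)  # confirm numeric columns were parsed as numbers
```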
Step 2: Data Cleaning and Feature Engineering
After loading your data, the next steps are data cleaning and feature engineering. Think of data cleaning and feature engineering as subtraction and addition. Data cleaning is the process of identifying and correcting errors in your data. It’s akin to polishing a gemstone, removing imperfections to reveal its brilliance. Through meticulous scrutiny, errors are identified, outliers are flagged, and anomalies are rectified.
Feature Engineering is about using domain-specific knowledge to ‘engineer’ input data (i.e., features) to enhance a model’s performance. This could mean creating a composite feature that uniquely combines two existing features or using two current features to calculate a new third feature. In both cases, the goal is to add layers of complexity, crafting new dimensions from the raw data to empower our models with richer insights.
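Here is a minimal pandas sketch of both ideas, assuming a small hypothetical dataset with a couple of raw columns; the column names, thresholds, and derived features are illustrative rather than prescriptive.

```python
import pandas as pd

df = pd.DataFrame({
    "weight_lb": [165, 190, None, 150, 5000],   # a missing value and an obvious entry error
    "peak_power_w": [850, 1020, 760, 690, 910],
    "total_work_kj": [28.0, 35.5, 26.1, 22.4, 30.2],
})

# Data cleaning (subtraction): drop impossible values and fill small gaps
df = df[df["weight_lb"].isna() | (df["weight_lb"] < 500)].copy()   # remove the outlier row
df["weight_lb"] = df["weight_lb"].fillna(df["weight_lb"].median()) # impute the missing weight

# Feature engineering (addition): derive new features from existing ones
df["power_to_weight"] = df["peak_power_w"] / df["weight_lb"]
df["work_per_kg"] = df["total_work_kj"] / (df["weight_lb"] * 0.4536)

print(df)
```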
Step 3: Transforming Data
Many machine learning algorithms perform better when numerical input variables in your dataset are scaled to a standard range. For example, algorithms that use a weighted sum of input variables, such as linear regression, logistic regression, and deep learning, are impacted by the scale of input data. Additionally, algorithms that rely on distance measures between samples, such as k-nearest neighbors and support vector machines, are also impacted.
Data transformation is a process used to change a given dataset's scale or distribution of features. The two primary techniques used to change the scale of data are standardization transforms and normalization transforms, which I've discussed in a previous article titled, An In-Depth Look At Data Preparation For Machine Learning. Changing the scale or distribution of numerical variables within a dataset is akin to calibrating the instruments of a finely tuned machine. While many algorithms thrive on standardized input variables, the impact of scaling is not universal. Understanding when to apply scaling – and when to refrain – is paramount, as missteps can lead to unforeseen consequences, such as impaired model performance.
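Below is a small sketch using scikit-learn's two most common scalers; whether either one actually helps depends on the algorithm you feed the data into, as discussed above.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two numeric input variables on very different scales (hypothetical values)
X = np.array([[50.0, 700.0], [62.0, 950.0], [45.0, 610.0], [70.0, 1100.0]])

# Standardization: rescale each column to zero mean and unit variance
X_standardized = StandardScaler().fit_transform(X)

# Normalization: rescale each column to the [0, 1] range
X_normalized = MinMaxScaler().fit_transform(X)

print(X_standardized.round(2))
print(X_normalized.round(2))
```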
Step 4: Feature Selection and Dimensionality Reduction
In the realm of biomarker discovery, less is often more. Feature selection sifts through the noise, retaining only the most salient variables in a dataset. Meanwhile, dimensionality reduction compresses the data landscape, distilling complexity into a manageable form. Together, these techniques prune the data forest, revealing the signal amidst the noise.
Certain algorithms perform poorly when input variables are irrelevant to the output (i.e., target) variable. In these cases, feature selection identifies input variables (i.e., features) that are most relevant to a given problem and removes input variables that lack relevance. Dimensionality Reduction, on the other hand, is a data preparation technique that compresses high-dimensional data into a lower-dimensional space while preserving the integrity of the data, reducing the number of input variables in the process. You can learn how to choose and program varying feature selection and dimensionality reduction techniques in a previous article, An In-Depth Look At Data Preparation For Machine Learning.
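As a minimal sketch of both techniques, the example below uses scikit-learn's recursive feature elimination (RFE) for feature selection and PCA for dimensionality reduction; the dataset is randomly generated purely for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic regression problem: 20 features, only 5 of which are informative
X, y = make_regression(n_samples=200, n_features=20, n_informative=5, random_state=0)

# Feature selection: keep the 5 features most relevant to the target
selector = RFE(estimator=LinearRegression(), n_features_to_select=5).fit(X, y)
print("selected feature indices:", [i for i, kept in enumerate(selector.support_) if kept])

# Dimensionality reduction: compress the 20 features into 5 principal components
X_reduced = PCA(n_components=5).fit_transform(X)
print("reduced shape:", X_reduced.shape)
```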
Step 5: Algorithm Spot-Checking
There are many choices when selecting the ideal machine learning algorithm for a given problem. Unfortunately, we can't always know which algorithm will work best on a given dataset beforehand. As a result, we have to try several algorithms and then focus our attention on those that seem most promising. Thus, it's important to have quick and easy ways to assess and compare different algorithms' performance before we select one to tune and optimize - this is where spot-checking comes in. Spot-checking is a way to quickly discover which algorithms perform well on your machine-learning problem before selecting one to commit to. You can learn more about algorithm spot-checking in a previous article titled Comparing Machine Learning Models.
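A minimal spot-checking sketch looks like the following: evaluate several candidate models on the same synthetic dataset with identical cross-validation settings and compare their scores. The models and metric here are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Same cross-validation splits and metric for every candidate, so scores are comparable
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```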
Step 6: Hyperparameter Optimization
You can think of machine learning algorithms as systems with various knobs and dials, which you can adjust in any number of ways to change how output data (i.e., predictions) are generated from input data. The knobs and dials in these systems can be subdivided into parameters and hyperparameters. Parameters are model settings that are learned, adjusted, and optimized automatically. Conversely, hyperparameters must be manually set. Tuning hyperparameters has known effects on machine learning algorithms. However, it's not always clear how to best set a hyperparameter to optimize model performance for a specific dataset. As a result, search strategies are often used to find optimal hyperparameter configurations, which you can learn more about in a previous article titled, An Introduction To Hyperparameter Optimization.
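A grid search is the simplest of these search strategies. The sketch below tunes two hyperparameters of a random forest on synthetic data; the grid values are arbitrary and would be chosen to suit your own problem.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

param_grid = {
    "n_estimators": [100, 300],   # hyperparameter: number of trees
    "max_depth": [None, 5, 10],   # hyperparameter: tree depth limit
}

# Exhaustively evaluate every combination with 5-fold cross-validation
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("best cross-validated score:", round(search.best_score_, 3))
```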
Step 7: Workflow Automation and Validation
After optimizing your model's hyperparameters, the next step is to create a pipeline and deploy your model. Pipelines codify experimentally derived best practices and automate the workflow to produce your machine-learning model. Machine learning pipelines consist of multiple sequential steps that do everything from data extraction and preprocessing to model training and deployment.
The biggest benefit of creating a pipeline is that deploying your model in the wild is much easier. In practice, this could mean using your model in a small pilot study to test its efficacy, testing it on previously unseen data to validate its predictions, or open-sourcing the model so new users can pressure test it in unexpected ways.
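Here is a minimal scikit-learn pipeline that chains scaling, feature selection, and a model into one object that can be cross-validated, fit, and saved as a unit; the specific steps and file name are illustrative, not a prescription.

```python
import joblib
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=15, n_informative=5, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(estimator=LinearRegression(), n_features_to_select=5)),
    ("model", ElasticNet()),
])

# The whole workflow is cross-validated together, avoiding data leakage between steps
scores = cross_val_score(pipeline, X, y, cv=5, scoring="neg_mean_squared_error")
print("mean MSE:", round(-scores.mean(), 2))

# Fit on all available data and persist the entire pipeline for deployment
pipeline.fit(X, y)
joblib.dump(pipeline, "biomarker_model.joblib")
```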
Your path after model deployment will largely vary based on your goals. For example, are you creating a new biomarker to predict how patients respond to cancer treatment, or are you trying to quantify endurance performance — the table stakes can be worlds apart, resulting in different approaches. However, in both cases, you'll want to validate your model with real-world data, understand its limitations, and derive a path for step-wise improvements over time.
🧬 Case Study: Predicting Critical Power
Critical power is a concept from exercise physiology and sports science that represents the highest power output an individual can sustain for an extended period without fatigue setting in. It marks the boundary between power outputs that can be sustained indefinitely and those that can only be maintained for a limited time before fatigue occurs, and it is often used in endurance sports, such as cycling, running, rowing, and swimming, as a reference point for pacing strategies and training prescription. Athletes can use their critical power to optimize training intensity and duration, ensuring that they train hard enough to improve their performance while avoiding overexertion and fatigue.
However, critical power is notoriously challenging to measure accurately, often requiring multiple grueling lab tests, including maximal effort exercise bouts lasting two to thirty minutes. During these tests, an athlete's power output and the duration of the effort are recorded. By plotting the power-duration relationship on a graph, with power on the y-axis and time on the x-axis, critical power can be identified as the asymptote of the curve representing the highest sustainable power output.
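For context, one common way to estimate critical power from such tests is to fit the two-parameter hyperbolic power-duration model, P(t) = CP + W'/t, where CP is the asymptote and W' is a finite work capacity above CP. The sketch below fits that model with scipy on made-up bout data; the durations and powers are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_duration(t, cp, w_prime):
    """Hyperbolic power-duration model: sustainable power plus a finite work capacity W'."""
    return cp + w_prime / t

# Hypothetical test bouts: duration (s) and average power (W) for each maximal effort
durations = np.array([120, 300, 600, 1200, 1800], dtype=float)
powers = np.array([420, 350, 320, 305, 300], dtype=float)

(cp, w_prime), _ = curve_fit(power_duration, durations, powers, p0=[250, 20000])
print(f"Critical power ~ {cp:.0f} W, W' ~ {w_prime / 1000:.1f} kJ")
```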
My goal is to see if we can use a single, time-efficient test to accurately predict an athlete's critical power, offering invaluable insights for athletes, coaches, and physiologists alike. But here's the kicker: the techniques we'll explore aren't limited to the realm of human performance. The same principles and methods explored below can be applied to gene expression and blood biomarker data, empowering researchers to predict disease risk or treatment response with unprecedented precision.
For my pilot protocol, I had 25 athletes of mixed sex and varying fitness levels, from recreational exercisers to elite athletes, perform a 60-second maximal effort Echo bike sprint. Subjects were instructed to ramp the bike to the highest wattage possible within the first 10 seconds of the sprint, then maintain as high an exertion level as possible for the remainder of the workout. During the aforementioned protocol, subjects' muscle oxygenation (SmO2), nitric oxide (NO), acceleration, and skin temperature were recorded with an NNOXX wearable device. Additionally, the subjects' peak power output was captured from the Echo Bike's internal computer, and their total work in kilojoules was calculated following the workout. Below, you'll find summary statistics for all of the data collected, followed by a breakdown of what each data point means:
Weight: The subject’s body weight in pounds.
Max SmO2: The subject’s baseline muscle oxygen level prior to exercise.
Min SmO2: The subject’s minimum muscle oxygen level during the maximal effort sprint.
SmO2 Differential (Diff): The difference between the subject’s maximum and minimum SmO2 levels.
Max NO: The subject’s highest recorded nitric oxide level during the maximal effort sprint.
Min NO: The subject’s baseline nitric oxide level prior to exercise.
NO Differential (Diff): The difference between the subject’s maximum and minimum NO levels.
Peak Acceleration: The highest acceleration recorded with the NNOXX wearable during the subject’s maximal effort sprint.
Peak Power: The highest power output recorded with the Echo Bike’s internal computer during the subject’s maximal effort sprint.
Total Work: The amount of work, in kilojoules (kJ), performed by the subject during their maximal effort sprint.
Peak Skin Temperature: The highest skin temperature reading, in degrees Celsius, recorded with the NNOXX wearable during the subject’s maximal effort sprint.
Critical Power: The subject’s critical power, calculated through a series of five maximal effort time-to-fatigue trials. This is the output we are trying to predict with our model.
We can visualize the correlations between the varying data points with the heat map below:
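For readers who want to reproduce this kind of plot on their own data, a minimal seaborn sketch looks like the following; the file name is a placeholder for wherever the measurements live.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical export of the pilot data described above
df = pd.read_csv("critical_power_pilot.csv")

# Pearson correlations between all numeric measurements
corr = df.select_dtypes("number").corr()

plt.figure(figsize=(8, 6))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation between measured variables")
plt.tight_layout()
plt.show()
```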
After normalizing the data above, I performed recursive feature elimination, a feature selection method that eliminates the least valuable features in a dataset one by one until a specified number of strong features remain. Recursive feature elimination is a popular feature selection technique because it’s easy to configure and understand the results, and it’s effective for selecting input features that are most relevant for predicting the target variable. Using recursive feature elimination, I determined that SmO2_Diff, Max_NO, NO_Diff, Peak_Power, and Total_Work were the five input variables that provided the highest predictive value for critical power.
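A sketch of that selection step is shown below, assuming the pilot data lives in a DataFrame with columns named after the variables listed above; the file path and exact column names are placeholders rather than the exact script I used.

```python
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("critical_power_pilot.csv")  # hypothetical file holding the pilot data

feature_cols = [
    "Weight", "Max_SmO2", "Min_SmO2", "SmO2_Diff",
    "Max_NO", "Min_NO", "NO_Diff",
    "Peak_Acceleration", "Peak_Power", "Total_Work", "Peak_Skin_Temp",
]

X = MinMaxScaler().fit_transform(df[feature_cols])  # normalize inputs to [0, 1]
y = df["Critical_Power"]

# Recursively drop the weakest feature until the five strongest remain
rfe = RFE(estimator=LinearRegression(), n_features_to_select=5).fit(X, y)
selected = [col for col, keep in zip(feature_cols, rfe.support_) if keep]
print("selected features:", selected)
```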
I then spot-checked a handful of regression algorithms to discover which best predicted subjects' critical power, using k-fold cross-validation as my evaluation method and mean squared error as my evaluation metric, and ended up with the following results:
Based on these results, Elastic Net Regression performed the best, with a mean squared error of 320. In practice, the average difference between a subject's predicted critical power and their actual critical power is roughly 17 watts, which is quite an impressive performance. Now, it's possible to eke out a bit more performance by optimizing the model's hyperparameters. However, given the small dataset (n=25), deploying this model in the real world with a more diverse set of subjects, exercise modalities, and environmental conditions would be more beneficial. Then, after collecting more data, I could repeat the steps above, seeing which features best predict performance (or whether we can engineer new features), spot-checking different algorithms, and tuning the hyperparameters accordingly.
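For completeness, here is a sketch of how that regression spot-check could be set up with scikit-learn. The file path is a placeholder, the candidate models are an illustrative selection, and the exact scores will of course depend on the real data, which isn't included here.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("critical_power_pilot.csv")  # hypothetical path to the pilot data
X = df[["SmO2_Diff", "Max_NO", "NO_Diff", "Peak_Power", "Total_Work"]]
y = df["Critical_Power"]

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(),
    "lasso": Lasso(),
    "elastic_net": ElasticNet(),
    "knn": KNeighborsRegressor(),
    "random_forest": RandomForestRegressor(random_state=0),
}

# Same folds and metric for every candidate, so the MSE values are directly comparable
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    pipe = Pipeline([("scale", MinMaxScaler()), ("model", model)])
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="neg_mean_squared_error")
    print(f"{name}: MSE = {-scores.mean():.1f}")
```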
If you liked this piece, I’d be grateful if you’d consider tapping the “heart” ❤️ in the header above. It helps me understand which pieces you like most and supports this newsletter’s growth. Thank you!
1. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med. 2018. PMID: 30127393
2. Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science. 2015. PMID: 26541610
3. Networks as Biomarkers: Uses and Purposes. Genes. 2023. PMID: 36833356